Data Models

HpcConfig

| Property | Type | Description | Required | Default |
|---|---|---|---|---|
| hpc_type | Any | Type of HPC queueing system (such as ‘slurm’) | True | |
| job_prefix | string | Prefix added to each HPC job name | False | job |
| hpc | Union[SlurmConfig, LocalHpcConfig, FakeHpcConfig] | Interface-specific config options | True | |
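
As a rough illustration of how the three HpcConfig fields fit together, the sketch below builds an HpcConfig-shaped dictionary and checks the fields marked as required. The `validate_hpc_config` helper and the account name are hypothetical; this is not JADE's own validation code.

```python
import json

# Required fields and defaults, copied from the HpcConfig properties above.
REQUIRED_FIELDS = ("hpc_type", "hpc")
DEFAULTS = {"job_prefix": "job"}


def validate_hpc_config(config: dict) -> dict:
    """Return a copy of config with defaults applied.

    Raises ValueError if a required field is missing. Hypothetical helper,
    not part of JADE.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in config]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return {**DEFAULTS, **config}


hpc_config = validate_hpc_config(
    {
        "hpc_type": "slurm",
        # Interface-specific options; "my_account" is a placeholder.
        "hpc": {"account": "my_account", "walltime": "4:00:00"},
    }
)
print(json.dumps(hpc_config, indent=2))
```

Because `job_prefix` is optional, the result picks up the default value `job` when the caller omits it.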

SlurmConfig

| Property | Type | Description | Required | Default |
|---|---|---|---|---|
| account | string | Project account to use | True | |
| partition | string | HPC partition on which to submit | False | |
| reservation | string | HPC reservation on which to submit | False | |
| qos | string | Set to high to get faster node allocations at twice the cost | False | |
| walltime | string | Maximum time allocated to each node | False | 4:00:00 |
| gres | string | Request nodes that have at least this number of GPUs. Ex: ‘gpu:2’ | False | |
| mem | string | Request nodes that have at least this amount of memory | False | |
| tmp | string | Request nodes that have at least this amount of scratch storage space | False | |
| nodes | integer | Number of nodes to use for each job | False | |
| ntasks | integer | Number of tasks per job (nodes is not required if this is provided) | False | |
| ntasks_per_node | integer | Number of tasks per node (the maximum is the number of CPUs) | False | |
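
Most SlurmConfig properties share their names with standard `sbatch` options. The sketch below shows one way such a config could be rendered as `#SBATCH` directives. The mapping (including `walltime` to `--time` and the GPU request to `--gres`) and the `to_sbatch_directives` helper are assumptions made for illustration, not a description of how JADE generates its submission scripts.

```python
# Assumed mapping from SlurmConfig property names to sbatch long options.
# walltime is assumed to correspond to --time; "gres" is the GPU request.
SBATCH_OPTIONS = {
    "account": "--account",
    "partition": "--partition",
    "reservation": "--reservation",
    "qos": "--qos",
    "walltime": "--time",
    "gres": "--gres",
    "mem": "--mem",
    "tmp": "--tmp",
    "nodes": "--nodes",
    "ntasks": "--ntasks",
    "ntasks_per_node": "--ntasks-per-node",
}


def to_sbatch_directives(slurm_config: dict) -> list[str]:
    """Render every property that is set as an #SBATCH line, in mapping order."""
    lines = []
    for name, option in SBATCH_OPTIONS.items():
        value = slurm_config.get(name)
        if value is not None:  # skip optional properties that were not provided
            lines.append(f"#SBATCH {option}={value}")
    return lines


directives = to_sbatch_directives(
    {"account": "my_account", "walltime": "4:00:00", "gres": "gpu:2", "nodes": 1}
)
print("\n".join(directives))
```

Only `account` is required; every other property is left out of the output unless it is set.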

SubmitterParams

This object can be automatically generated with this command:

$ jade config submitter-params

| Property | Type | Description | Required | Default |
|---|---|---|---|---|
| generate_reports | boolean | Controls whether to generate reports after completion | False | True |
| hpc_config | HpcConfig | HPC config options | True | |
| max_nodes | integer | Maximum number of compute nodes to use simultaneously; unbounded by default | False | |
| num_processes | integer | Number of processes to run in parallel on each node | False | |
| per_node_batch_size | integer | How many jobs to assign to each node | False | 500 |
| node_setup_script | string | Script to run on each node before starting jobs | False | |
| node_shutdown_script | string | Script to run on each node after completing jobs | False | |
| poll_interval | integer | Interval in seconds at which to poll jobs for status | False | 10 |
| resource_monitor_interval | integer | Interval in seconds at which to collect resource stats. Disable monitoring by setting this to None/null. | False | 10 |
| resource_monitor_type | Any | Type of resource monitoring to perform. Options: [‘aggregation’, ‘periodic’, ‘none’] | False | aggregation |
| resource_monitor_stats | Any | Resource utilization stats to monitor | False | {'cpu': True, 'disk': False, 'memory': True, 'network': False, 'process': False, 'include_child_processes': True, 'recurse_child_processes': False} |
| try_add_blocked_jobs | boolean | Add blocked jobs to a batch if all blocking jobs are in the batch. Be aware of time constraints. | False | True |
| time_based_batching | boolean | Use time-based batching instead of job-count-based batching | False | False |
| dry_run | boolean | Dry run mode; don’t start any jobs | False | False |
| verbose | boolean | Enable debug logging | False | False |
| singularity_params | Any | Singularity container parameters | False | |
| distributed_submitter | boolean | Submit new jobs and update status on compute nodes. | False | True |
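
To make the interaction between `per_node_batch_size` and `max_nodes` concrete, here is a simplified sketch of job-count-based batching. It assumes jobs are split into consecutive batches and ignores blocked jobs and time-based batching; `split_into_batches` is a hypothetical helper, not JADE's scheduler.

```python
def split_into_batches(job_names: list[str], per_node_batch_size: int = 500) -> list[list[str]]:
    """Split jobs into consecutive batches of at most per_node_batch_size jobs."""
    return [
        job_names[i : i + per_node_batch_size]
        for i in range(0, len(job_names), per_node_batch_size)
    ]


jobs = [f"job_{i}" for i in range(1200)]
batches = split_into_batches(jobs)
print([len(batch) for batch in batches])  # [500, 500, 200]

# Assumption: with max_nodes set, at most that many batches occupy nodes at
# once and the remaining batches wait for a node to become free.
max_nodes = 2
active, waiting = batches[:max_nodes], batches[max_nodes:]
print(len(active), len(waiting))  # 2 1
```

With the default batch size of 500, 1200 jobs need three nodes' worth of batches, so capping `max_nodes` at 2 leaves one batch waiting.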

SubmissionGroup

| Property | Type | Description | Required | Default |
|---|---|---|---|---|
| name | string | User-defined name of the group | True | |
| submitter_params | SubmitterParams | Submission parameters for the group | True | |
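
The models nest: a SubmissionGroup holds SubmitterParams, which holds an HpcConfig, which in turn holds the interface-specific config. The sketch below writes that nesting out as a plain dictionary and serializes it to JSON. The group name, account, and the choice of JSON are assumptions made for illustration; the field names and default values come from the properties listed above.

```python
import json

submission_group = {
    "name": "gpu_jobs",  # hypothetical user-defined group name
    "submitter_params": {
        "hpc_config": {
            "hpc_type": "slurm",
            "job_prefix": "job",
            # Slurm-specific options; "my_account" is a placeholder.
            "hpc": {"account": "my_account", "walltime": "4:00:00"},
        },
        "per_node_batch_size": 500,
        "poll_interval": 10,
        "resource_monitor_type": "aggregation",
    },
}

text = json.dumps(submission_group, indent=2)
print(text)

# Round-trip to confirm the structure survives serialization unchanged.
restored = json.loads(text)
```

Only `name` and `submitter_params` are required at the group level; inside `submitter_params`, only `hpc_config` is required.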