jade.jobs.job_configuration.JobConfiguration

class jade.jobs.job_configuration.JobConfiguration(container=None, job_global_config=None, job_post_process_config=None, user_data=None, submission_groups=None, setup_command=None, teardown_command=None, node_setup_command=None, node_teardown_command=None, **kwargs)[source]

Bases: ABC

Base class for any simulation configuration.

Constructs JobConfiguration.

Parameters:

Methods

add_job(job)

Add a job to the configuration.

add_user_data(key, data)

Add user data referenced by a key.

append_submission_group(submission_group)

Append a submission group.

assign_default_submission_group(submitter_params)

check_job_dependencies()

Check for impossible conditions with job dependencies.

check_job_estimated_run_minutes(group_name)

Check that estimated_run_minutes is set for all jobs in a group.

check_job_runtimes()

Check for any job with a longer estimated runtime than the walltime.

check_spark_config()

If Spark jobs are present in the config, configure the params to run one job at a time.

check_submission_groups()

Check for invalid job submission group assignments.

clear()

Clear all configured jobs.

create_from_result(job, output_dir)

Create an instance from a result file.

deserialize(filename_or_data[, ...])

Create a class instance from a saved configuration file.

dump([filename, stream, indent])

Convert the configuration to structured text format.

dumps([fmt_module])

Dump the configuration to a formatted string.

get_default_submission_group()

Return the default submission group.

get_job(name)

Return the job matching name.

get_num_jobs()

Return the number of jobs in the configuration.

get_submission_group(name)

Return the submission group matching name.

get_user_data(key)

Get the user data associated with key.

iter_jobs()

Return a generator over all jobs.

job_execution_class(extension_name)

Return the class used for job execution.

job_parameters_class(extension_name)

Return the class used for job parameters.

list_jobs()

Return a list of all jobs.

list_user_data_keys()

List the stored user data keys.

reconfigure_jobs(jobs)

Reconfigure with a list of jobs.

remove_job(job)

Remove a job from the configuration.

remove_user_data(key)

Remove the key from the user data config.

serialize([include])

Create data for serialization.

serialize_for_execution(scratch_dir[, ...])

Serialize config data for efficient execution.

serialize_jobs(directory)

Serialize main job data to job-specific files.

show_jobs()

Show the configured jobs.

shuffle_jobs()

Shuffle the job order.

Attributes

FILENAME_DELIMITER

FORMAT_VERSION

job_global_config

Return the global configs applied to all jobs.

node_setup_command

Command run on each node before starting jobs.

node_teardown_command

Command run on each node after completing jobs.

setup_command

Command run by the submitter before submitting jobs.

submission_groups

Return the submission groups.

teardown_command

Command run by the last node before completing jobs.

add_user_data(key, data)[source]

Add user data referenced by a key. The data must be JSON-serializable.

Parameters:
  • key (str)

  • data (any)

Raises:

InvalidParameter – Raised if the key is already stored.

get_user_data(key)[source]

Get the user data associated with key.

Parameters:

key (str)

Return type:

any

remove_user_data(key)[source]

Remove the key from the user data config.

Parameters:

key (str)

list_user_data_keys()[source]

List the stored user data keys.

Returns:

list of str

Return type:

list
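A minimal usage sketch of the user-data helpers above, assuming config is an instance of a concrete JobConfiguration subclass (obtained however your JADE extension builds one); the key and data values are purely illustrative:

    # Sketch only: `config` is assumed to be a concrete JobConfiguration instance.
    config.add_user_data("scenario", {"year": 2030, "region": "west"})

    print(config.list_user_data_keys())       # ['scenario']
    print(config.get_user_data("scenario"))   # {'year': 2030, 'region': 'west'}

    config.remove_user_data("scenario")       # the key is no longer stored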

check_job_dependencies()[source]

Check for impossible conditions with job dependencies.

Raises:

InvalidConfiguration – Raised if job dependencies have an impossible condition.

check_job_estimated_run_minutes(group_name)[source]

Check that estimated_run_minutes is set for all jobs in a group.

check_job_runtimes()[source]

Check for any job with a longer estimated runtime than the walltime.

Raises:

InvalidConfiguration – Raised if any job is too long.

check_spark_config()[source]

If Spark jobs are present in the config, configure the params to run one job at a time.

check_submission_groups()[source]

Check for invalid job submission group assignments. Make a default group if none are defined and assign it to each job.

Raises:

InvalidConfiguration – Raised if submission group assignments are invalid.
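As a sketch, these validation helpers are typically called before submission and report problems by raising exceptions. The import path for InvalidConfiguration below is an assumption, and config again stands for a concrete JobConfiguration instance:

    from jade.exceptions import InvalidConfiguration  # import path is an assumption

    try:
        config.check_job_dependencies()
        config.check_job_runtimes()
        config.check_submission_groups()   # also creates a default group if none exist
    except InvalidConfiguration as exc:
        print(f"configuration is invalid: {exc}")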

abstract create_from_result(job, output_dir)[source]

Create an instance from a result file.

Parameters:
  • job

  • output_dir

Return type:

class

add_job(job)[source]

Add a job to the configuration.

Parameters:

job (JobParametersInterface)

clear()[source]

Clear all configured jobs.
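A sketch of building up a configuration, where make_job() is a hypothetical stand-in for however your extension constructs objects that implement JobParametersInterface:

    # `config`: concrete JobConfiguration; `make_job`: hypothetical job factory.
    job = make_job("job1")
    config.add_job(job)
    print(config.get_num_jobs())   # 1

    config.remove_job(job)         # drop a single job
    config.clear()                 # drop every configured job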

dump(filename=None, stream=sys.stdout, indent=2)[source]

Convert the configuration to structured text format.

Parameters:
  • filename (str | None) – Write the configuration to this file (must be .json or .toml). If None, write the text to stream. Using .json is recommended for large files; .toml is much slower.

  • stream (file) – File-like interface that supports write().

  • indent (int) – If JSON, use this indentation.

Raises:

InvalidParameter – Raised if filename does not have a supported extension.

dumps(fmt_module=toml, **kwargs)[source]

Dump the configuration to a formatted string.

classmethod deserialize(filename_or_data, do_not_deserialize_jobs=False)[source]

Create a class instance from a saved configuration file.

Parameters:
  • filename_or_data (str | dict) – Path to a configuration file, or that file already loaded as a dict.

  • do_not_deserialize_jobs (bool) – Set to True to avoid the overhead of loading all jobs from disk. Job names will be stored instead of job objects.

Return type:

class

Raises:

InvalidParameter – Raised if the config file has invalid parameters.
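A round-trip sketch with dump() and deserialize(). MyJobConfiguration is a placeholder for a concrete JobConfiguration subclass, and the file name is illustrative:

    config.dump(filename="config.json", indent=2)      # .json or .toml

    # Later, possibly on another node, rebuild the configuration from the file.
    config2 = MyJobConfiguration.deserialize("config.json")

    # Skip loading every job from disk when only the metadata is needed.
    metadata_only = MyJobConfiguration.deserialize(
        "config.json", do_not_deserialize_jobs=True
    )

    print(config.dumps())   # formatted string; TOML by default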

get_job(name)[source]

Return the job matching name.

Return type:

namedtuple

get_num_jobs()[source]

Return the number of jobs in the configuration.

Return type:

int

property job_global_config

Return the global configs applied to all jobs.

iter_jobs()[source]

Return a generator over all jobs.

Yields:

iterator over JobParametersInterface

list_jobs()[source]

Return a list of all jobs.

Returns:

list of JobParametersInterface

Return type:

list
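A sketch of the job-inspection methods above; the name attribute on a job is an assumption about JobParametersInterface:

    print(config.get_num_jobs())

    for job in config.iter_jobs():          # generator over all jobs
        print(job.name)                     # assumes jobs expose a name attribute

    jobs = config.list_jobs()               # the same jobs as a list
    first = config.get_job(jobs[0].name)    # look a job up by name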

append_submission_group(submission_group)[source]

Append a submission group.

Parameters:

submission_group (SubmissionGroup)

get_default_submission_group()[source]

Return the default submission group.

Return type:

SubmissionGroup

get_submission_group(name)[source]

Return the submission group matching name.

Parameters:

name (str)

Return type:

SubmissionGroup

property submission_groups

Return the submission groups.

Return type:

list
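A sketch of inspecting submission groups; the name attribute on SubmissionGroup is an assumption:

    for group in config.submission_groups:          # list of SubmissionGroup
        print(group.name)                           # assumed attribute

    default = config.get_default_submission_group()
    same = config.get_submission_group(default.name)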

reconfigure_jobs(jobs)[source]

Reconfigure with a list of jobs.

Parameters:

jobs (list of DistributionConfiguration.parameter_type)

remove_job(job)[source]

Remove a job from the configuration.

Parameters:

job (JobParametersInterface)

serialize(include=ConfigSerializeOptions.JOBS)[source]

Create data for serialization.

serialize_jobs(directory)[source]

Serialize main job data to job-specific files.

Parameters:

directory (str)

serialize_for_execution(scratch_dir, are_inputs_local=True)[source]

Serialize config data for efficient execution.

Parameters:
  • scratch_dir (str) – Temporary storage space on the local system.

  • are_inputs_local (bool) – Whether the existing input data is local to this system. For many configurations, having many concurrent workers access the input data across the network can cause a bottleneck, so implementations may wish to copy the data locally before execution starts. If storage access is very fast, this setting is irrelevant.

Returns:

Name of serialized config file in scratch directory.

Return type:

str
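A sketch of preparing a configuration for execution; the scratch directory path is illustrative:

    config_file = config.serialize_for_execution(
        scratch_dir="/tmp/jade-scratch",
        are_inputs_local=False,   # inputs live on shared storage; an implementation
                                  # may copy them locally before execution starts
    )
    print(config_file)            # name of the serialized config in scratch_dir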

property setup_command

Command run by the submitter before submitting jobs.

property teardown_command

Command run by the last node before completing jobs.

property node_setup_command

Command run on each node before starting jobs.

property node_teardown_command

Command run on each node after completing jobs.

shuffle_jobs()[source]

Shuffle the job order.

show_jobs()[source]

Show the configured jobs.

job_execution_class(extension_name)[source]

Return the class used for job execution.

Parameters:

extension_name (str)

Return type:

class

job_parameters_class(extension_name)[source]

Return the class used for job parameters.

Parameters:

extension_name (str)

Return type:

class
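A sketch of looking up the classes registered for an extension; "generic_command" is used here only as an illustrative extension name:

    exec_cls = config.job_execution_class("generic_command")
    params_cls = config.job_parameters_class("generic_command")
    print(exec_cls.__name__, params_cls.__name__)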