jade.jobs.job_configuration.JobConfiguration

class jade.jobs.job_configuration.JobConfiguration(container=None, job_global_config=None, job_post_process_config=None, user_data=None, submission_groups=None, setup_command=None, teardown_command=None, node_setup_command=None, node_teardown_command=None, **kwargs)

Bases: ABC

Base class for any simulation configuration.

Constructs JobConfiguration.

Parameters:

inputs (JobInputsInterface)
container (JobContainerInterface)

Methods

`add_job`(job)	Add a job to the configuration.
`add_user_data`(key, data)	Add user data referenced by a key.
`append_submission_group`(submission_group)	Append a submission group.
`assign_default_submission_group`(submitter_params)
`check_job_dependencies`()	Check for impossible conditions with job dependencies.
`check_job_estimated_run_minutes`(group_name)	Check that estimated_run_minutes is set for all jobs in a group.
`check_job_runtimes`()	Check for any job with a longer estimated runtime than the walltime.
`check_spark_config`()	If Spark jobs are present in the config, configure the params to run one job at a time.
`check_submission_groups`()	Check for invalid job submission group assignments.
`clear`()	Clear all configured jobs.
`create_from_result`(job, output_dir)	Create an instance from a result file.
`deserialize`(filename_or_data[, ...])	Create a class instance from a saved configuration file.
`dump`([filename, stream, indent])	Convert the configuration to structured text format.
`dumps`([fmt_module])	Dump the configuration to a formatted string.
`get_default_submission_group`()	Return the default submission group.
`get_job`(name)	Return the job matching name.
`get_num_jobs`()	Return the number of jobs in the configuration.
`get_submission_group`(name)	Return the submission group matching name.
`get_user_data`(key)	Get the user data associated with key.
`iter_jobs`()	Yield each job in the configuration.
`job_execution_class`(extension_name)	Return the class used for job execution.
`job_parameters_class`(extension_name)	Return the class used for job parameters.
`list_jobs`()	Return a list of all jobs.
`list_user_data_keys`()	List the stored user data keys.
`reconfigure_jobs`(jobs)	Reconfigure with a list of jobs.
`remove_job`(job)	Remove a job from the configuration.
`remove_user_data`(key)	Remove the key from the user data config.
`serialize`([include])	Create data for serialization.
`serialize_for_execution`(scratch_dir[, ...])	Serialize config data for efficient execution.
`serialize_jobs`(directory)	Serializes main job data to job-specific files.
`show_jobs`()	Show the configured jobs.
`shuffle_jobs`()	Shuffle the job order.

Attributes

`FILENAME_DELIMITER`
`FORMAT_VERSION`
`job_global_config`	Return the global configs applied to all jobs.
`node_setup_command`	Command to run on each node before starting jobs.
`node_teardown_command`	Command to run on each node after completing jobs.
`setup_command`	Command run by the submitter before submitting jobs.
`submission_groups`	Return the submission groups.
`teardown_command`	Command run by the last node before completing jobs.

add_user_data(key, data)

Add user data referenced by a key. The data must be JSON-serializable.

Parameters:

key (str)
data (any)

Raises:

InvalidParameter – Raised if the key is already stored.

get_user_data(key)

Get the user data associated with key.

Parameters: key (str)
Return type: any

remove_user_data(key)

Remove the key from the user data config.

Parameters: key (str)

list_user_data_keys()

List the stored user data keys.

Returns: list of str
Return type: list
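
The four user-data methods above behave like a small keyed store. The following is a minimal sketch of that behavior, not JADE's implementation: `UserDataStore` is a hypothetical stand-in class, `InvalidParameter` is redefined locally so the example runs without JADE installed, and the handling of missing keys and the sorted key order are assumptions.

```python
import json


class InvalidParameter(Exception):
    """Local stand-in for the exception named in the documentation above."""


class UserDataStore:
    """Hypothetical stand-in for the user-data portion of a configuration."""

    def __init__(self):
        self._user_data = {}

    def add_user_data(self, key, data):
        # A key may only be stored once.
        if key in self._user_data:
            raise InvalidParameter(f"key={key} is already stored")
        # json.dumps raises TypeError if data is not JSON-serializable.
        json.dumps(data)
        self._user_data[key] = data

    def get_user_data(self, key):
        return self._user_data[key]

    def remove_user_data(self, key):
        # Assumption: removing a missing key is silently ignored.
        self._user_data.pop(key, None)

    def list_user_data_keys(self):
        # Assumption: keys are returned in sorted order.
        return sorted(self._user_data)


store = UserDataStore()
store.add_user_data("region", {"name": "west", "feeders": [1, 2, 3]})
print(store.list_user_data_keys())

try:
    store.add_user_data("region", {})
except InvalidParameter as exc:
    print(f"rejected: {exc}")
```

Checking serializability at insertion time surfaces a bad value where it is added rather than later, when the whole configuration is written to disk.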

check_job_dependencies()

Check for impossible conditions with job dependencies.

Raises: InvalidConfiguration – Raised if job dependencies have an impossible condition.
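
The documentation does not list which conditions count as impossible. As an illustration only, the sketch below assumes two: a job that depends on a job that does not exist, and a circular chain of dependencies. The function `check_dependencies` and its `blocked_by` mapping are hypothetical names, and `InvalidConfiguration` is redefined locally so the example is self-contained.

```python
class InvalidConfiguration(Exception):
    """Local stand-in for the exception named in the documentation above."""


def check_dependencies(blocked_by):
    """Validate a mapping of job name -> set of job names it waits on."""
    known = set(blocked_by)

    # Assumed condition 1: every dependency must name an existing job.
    for name, deps in blocked_by.items():
        missing = set(deps) - known
        if missing:
            raise InvalidConfiguration(
                f"{name} depends on unknown jobs: {sorted(missing)}"
            )

    # Assumed condition 2: no cycles. Depth-first search with three states.
    unvisited, in_progress, done = 0, 1, 2
    state = {name: unvisited for name in blocked_by}

    def visit(name):
        if state[name] == done:
            return
        if state[name] == in_progress:
            # Reaching a job that is still being explored means a cycle.
            raise InvalidConfiguration(f"circular dependency involving {name}")
        state[name] = in_progress
        for dep in blocked_by[name]:
            visit(dep)
        state[name] = done

    for name in blocked_by:
        visit(name)


# A valid chain: post waits on sim, sim waits on prep.
check_dependencies({"prep": set(), "sim": {"prep"}, "post": {"sim"}})
```

A job that lists itself as a dependency is reported by the same cycle check.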

check_job_estimated_run_minutes(group_name)

Check that estimated_run_minutes is set for all jobs in a group.

check_job_runtimes()

Check for any job with a longer estimated runtime than the walltime.

Raises: InvalidConfiguration – Raised if any job is too long.
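
The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not JADE's code: the walltime is assumed to be an `HH:MM:SS` string, jobs without an estimate are skipped, and `walltime_to_minutes` and `check_runtimes` are hypothetical helper names.

```python
class InvalidConfiguration(Exception):
    """Local stand-in for the exception named in the documentation above."""


def walltime_to_minutes(walltime):
    """Convert an 'HH:MM:SS' string to minutes (format is an assumption)."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 60 + minutes + seconds / 60


def check_runtimes(estimated_run_minutes, walltime):
    """Raise if any job's estimate exceeds the walltime.

    estimated_run_minutes maps job name -> minutes, or None if not set.
    """
    limit = walltime_to_minutes(walltime)
    too_long = sorted(
        name
        for name, estimate in estimated_run_minutes.items()
        if estimate is not None and estimate > limit
    )
    if too_long:
        raise InvalidConfiguration(
            f"jobs exceed the walltime of {walltime}: {too_long}"
        )


# Both jobs fit within a four-hour walltime, so no exception is raised.
check_runtimes({"job1": 30, "job2": 200}, "04:00:00")
```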

check_spark_config()

If Spark jobs are present in the config, configure the params to run one job at a time.

check_submission_groups()

Check for invalid job submission group assignments. If no groups are defined, make a default group and assign it to each job.

Raises: InvalidConfiguration – Raised if submission group assignments are invalid.
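
The default-group rule above can be illustrated with plain dictionaries. In this sketch, `assign_submission_groups`, the group name `"default"`, and the rule that every job must reference a defined group are assumptions made for the example; the real method works on JADE's own job and SubmissionGroup objects.

```python
class InvalidConfiguration(Exception):
    """Local stand-in for the exception named in the documentation above."""


def assign_submission_groups(job_groups, defined_groups, default_name="default"):
    """Return (group names, job -> group) after applying the default rule.

    job_groups maps job name -> group name, or None when unassigned.
    defined_groups is a list of group names.
    """
    if not defined_groups:
        # No groups defined: make a default group and assign it to each job.
        return [default_name], {job: default_name for job in job_groups}

    # Assumption: once groups exist, each job must reference one of them.
    invalid = sorted(
        job for job, group in job_groups.items() if group not in defined_groups
    )
    if invalid:
        raise InvalidConfiguration(
            f"jobs with invalid submission groups: {invalid}"
        )
    return list(defined_groups), dict(job_groups)


groups, assignments = assign_submission_groups({"job1": None, "job2": None}, [])
print(groups, assignments)
```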

abstract create_from_result(job, output_dir)

Create an instance from a result file.

Parameters:

job (JobParametersInterface)
output_dir (str)

Return type:

class

add_job(job)

Add a job to the configuration.

Parameters: job (JobParametersInterface)

clear()

Clear all configured jobs.

dump(filename=None, stream=sys.stdout, indent=2)

Convert the configuration to structured text format.

Parameters:

filename (str | None) – Write configuration to this file (must be .json or .toml). If None, write the text to stream. Prefer .json for large files; .toml is much slower.
stream (file) – File-like interface that supports write().
indent (int) – If JSON, use this indentation.

Raises:

InvalidParameter – Raised if filename does not have a supported extension.
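
The filename-or-stream dispatch described above might look like the sketch below. It is not JADE's implementation: `dump_config` is a hypothetical free function, `InvalidParameter` is redefined locally, and the stream branch writes JSON only to stay within the standard library (the real method may write a different format). The `.toml` branch relies on the third-party `toml` package and is imported lazily so the rest of the example runs without it.

```python
import json
import os
import sys


class InvalidParameter(Exception):
    """Local stand-in for the exception named in the documentation above."""


def dump_config(data, filename=None, stream=sys.stdout, indent=2):
    """Write data to filename if given, otherwise to stream."""
    if filename is None:
        # Assumption: JSON is used for the stream in this sketch.
        stream.write(json.dumps(data, indent=indent))
        stream.write("\n")
        return

    # Validate the extension before creating any file.
    ext = os.path.splitext(filename)[1].lower()
    if ext not in (".json", ".toml"):
        raise InvalidParameter(f"unsupported extension: {filename}")

    with open(filename, "w", encoding="utf-8") as f_out:
        if ext == ".json":
            json.dump(data, f_out, indent=indent)
        else:
            import toml  # third-party package, needed only for .toml output

            toml.dump(data, f_out)


# With no filename, the text goes to the stream (stdout by default).
dump_config({"num_jobs": 3})
```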

dumps(fmt_module=toml, **kwargs)

Dump the configuration to a formatted string.

classmethod deserialize(filename_or_data, do_not_deserialize_jobs=False)

Create a class instance from a saved configuration file.

Parameters:

filename_or_data (str | dict) – Path to the configuration file, or that file already loaded as a dict.
do_not_deserialize_jobs (bool) – Set to True to avoid the overhead of loading all jobs from disk. Job names will be stored instead of jobs.

Return type:

class

Raises:

InvalidParameter – Raised if the config file has invalid parameters.
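
To illustrate the two accepted input forms and the effect of do_not_deserialize_jobs, here is a minimal sketch using plain dictionaries. The function `load_config` and the keys `"jobs"`, `"name"`, and `"job_names"` are hypothetical choices for this example and do not describe JADE's file format; the sketch also assumes a JSON file.

```python
import json


def load_config(filename_or_data, do_not_deserialize_jobs=False):
    """Accept either a path to a JSON file or the already-loaded dict."""
    if isinstance(filename_or_data, dict):
        data = filename_or_data
    else:
        with open(filename_or_data, encoding="utf-8") as f_in:
            data = json.load(f_in)

    # "jobs" and "name" are hypothetical keys chosen for this sketch.
    jobs = data.get("jobs", [])
    if do_not_deserialize_jobs:
        # Keep only the names to skip the cost of building full job objects.
        return {"job_names": [job["name"] for job in jobs]}
    return {"jobs": jobs}


raw = {"jobs": [{"name": "job1", "arg": 1}, {"name": "job2", "arg": 2}]}
print(load_config(raw, do_not_deserialize_jobs=True))
```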

get_job(name)

Return the job matching name.

Return type: namedtuple

get_num_jobs()

Return the number of jobs in the configuration.

Return type: int

property job_global_config

Return the global configs applied to all jobs.

iter_jobs()

Yield each job in the configuration.

Yields: JobParametersInterface

list_jobs()

Return a list of all jobs.

Returns: list of JobParametersInterface
Return type: list

append_submission_group(submission_group)

Append a submission group.

Parameters: submission_group (SubmissionGroup)

get_default_submission_group()

Return the default submission group.

Return type: SubmissionGroup

get_submission_group(name)

Return the submission group matching name.

Parameters: name (str)
Return type: SubmissionGroup

property submission_groups

Return the submission groups.

Return type: list

reconfigure_jobs(jobs)

Reconfigure with a list of jobs.

Parameters: jobs (list of DistributionConfiguration.parameter_type)

remove_job(job)

Remove a job from the configuration.

Parameters: job (JobParametersInterface)

serialize(include=ConfigSerializeOptions.JOBS)

Create data for serialization.

serialize_jobs(directory)

Serialize main job data to job-specific files.

Parameters: directory (str)

serialize_for_execution(scratch_dir, are_inputs_local=True)

Serialize config data for efficient execution.

Parameters:

scratch_dir (str) – Temporary storage space on the local system.
are_inputs_local (bool) – Whether the existing input data is local to this system. For many configurations, access to the input data across the network by many concurrent workers can cause a bottleneck, so implementations may wish to copy the data locally before execution starts. If storage access is very fast, this setting is irrelevant.

Returns:

Name of serialized config file in scratch directory.

Return type:

str

property setup_command

Command run by the submitter before submitting jobs.

property teardown_command

Command run by the last node before completing jobs.

property node_setup_command

Command to run on each node before starting jobs.

property node_teardown_command

Command to run on each node after completing jobs.

shuffle_jobs()

Shuffle the job order.

show_jobs()

Show the configured jobs.

job_execution_class(extension_name)

Return the class used for job execution.

Parameters: extension_name (str)
Return type: class

job_parameters_class(extension_name)

Return the class used for job parameters.

Parameters: extension_name (str)
Return type: class