jade.jobs.cluster.Cluster

class jade.jobs.cluster.Cluster(config, job_status=None, lock_timeout=300)[source]

Bases: object

Represents the state of the nodes running jobs.

Internal constructor. Use create() or deserialize().

Methods

all_jobs_submitted()

Return true if all jobs have been submitted.

am_i_submitter()

Return True if the current system is the submitter.

are_all_jobs_complete()

Return True if all jobs are complete.

complete_hpc_job_id(job_id)

Complete an HPC job.

create(path, jade_config[, pipeline_stage_num])

Create a new instance of a Cluster.

delete_files_internal()

demote_from_submitter([serialize])

Clear the submitter, which must be the current system.

deserialize(path[, ...])

Deserialize an existing Cluster from a file.

deserialize_jobs()

Deserialize the current job status.

deserialize_submission_groups(directory)

Return the submission groups being used by the cluster.

do_action_under_lock(path, func, *args, **kwargs)

Run a function while holding the lock.

get_config_file(path)

Return the path to the cluster config file.

get_job_status_file(path)

Return the path to the job status file.

get_lock_file(path)

Return the path to the lock file for the cluster config.

get_status_summary([include_jobs])

Return a dict that summarizes current status.

has_submitter()

Return True if the config has a submitter assigned.

is_canceled()

Return True if the submission is canceled.

is_complete()

Return True if the submission is complete.

iter_hpc_job_ids()

Yields each Job.

iter_jobs([state])

Yields each Job.

mark_canceled()

Mark the submission as being canceled.

mark_complete()

Mark the submission as being complete.

prepare_for_resubmission(jobs_to_resubmit, ...)

Reset the state of the cluster for resubmission of jobs.

promote_to_submitter([serialize])

Promote the current system to submitter.

serialize(reason)

Serialize the config to a file.

serialize_jobs(reason)

Serialize the job status to a file.

serialize_submission_groups(directory)

Serialize the submission groups so that they can be read without acquiring a lock.

update_job_status(submitted_jobs, ...)

Update the job status in the config file.

Attributes

CLUSTER_CONFIG_FILE

CONFIG_VERSION_FILE

JOB_STATUS_FILE

JOB_STATUS_VERSION_FILE

LOCK_FILE

SUBMITTER_GROUP_FILE

config

Return the ClusterConfig

config_file

Return the path to the cluster config file.

job_status

Return the JobStatus

classmethod create(path, jade_config: JobConfiguration, pipeline_stage_num=None)[source]

Create a new instance of a Cluster. Promotes itself to submitter.

Parameters:
  • path (str) – Base directory for a JADE submission

  • jade_config (JobConfiguration)

  • pipeline_stage_num (None | int) – True if the config is one stage of a pipeline

classmethod deserialize(path, try_promote_to_submitter=False, deserialize_jobs=False)[source]

Deserialize an existing Cluster from a file.

Parameters:
  • path (str) – Base directory for a JADE submission

  • try_promote_to_submitter (bool) – Attempt to promote to submitter

  • deserialize_jobs (bool) – Deserialize current job status

Returns:

cluster and a bool indicating whether promotion occurred

Return type:

tuple

are_all_jobs_complete()[source]

Return True if all jobs are complete.

Return type:

bool

all_jobs_submitted()[source]

Return true if all jobs have been submitted.

am_i_submitter()[source]

Return True if the current system is the submitter.

complete_hpc_job_id(job_id)[source]

Complete an HPC job.

Parameters:

job_id (str)

property config

Return the ClusterConfig

property config_file

Return the path to the cluster config file.

Return type:

str

demote_from_submitter(serialize=True)[source]

Clear the submitter, which must be the current system.

deserialize_jobs()[source]

Deserialize the current job status.

static do_action_under_lock(path, func, *args, **kwargs)[source]

Run a function while holding the lock.

static get_config_file(path)[source]

Return the path to the cluster config file.

Parameters:

path (str) – Base directory for a JADE submission

static get_job_status_file(path)[source]

Return the path to the job status file.

Parameters:

path (str) – Base directory for a JADE submission

static get_lock_file(path)[source]

Return the path to the lock file for the cluster config.

Parameters:

path (str) – Base directory for a JADE submission

get_status_summary(include_jobs=False)[source]

Return a dict that summarizes current status.

Parameters:

include_jobs (bool) – Whether to include individual job status

Return type:

dict

has_submitter()[source]

Return True if the config has a submitter assigned.

iter_jobs(state=None)[source]

Yields each Job.

Parameters:

state (JobState) – If not None, only return jobs that match this state.

Yields:

Job

iter_hpc_job_ids()[source]

Yields each Job.

Yields:

str – HPC job ID

property job_status

Return the JobStatus

is_canceled()[source]

Return True if the submission is canceled.

is_complete()[source]

Return True if the submission is complete.

mark_canceled()[source]

Mark the submission as being canceled.

mark_complete()[source]

Mark the submission as being complete.

prepare_for_resubmission(jobs_to_resubmit, updated_blocking_jobs_by_name)[source]

Reset the state of the cluster for resubmission of jobs.

Parameters:
  • jobs_to_resubmit (set) – job names that will be resubmitted

  • updated_blocking_jobs_by_name (dict) – contains the blocking jobs for each job to be resubmitted

promote_to_submitter(serialize=True)[source]

Promote the current system to submitter.

Returns:

Returns True if promotion was successful

Return type:

bool

serialize(reason)[source]

Serialize the config to a file.

serialize_jobs(reason)[source]

Serialize the job status to a file.

serialize_submission_groups(directory)[source]

Serialize the submission groups so that they can be read without acquiring a lock.

Parameters:

directory (Path)

static deserialize_submission_groups(directory)[source]

Return the submission groups being used by the cluster.

Parameters:

directory (Path)

Returns:

list of SubmissionGroup

Return type:

list

update_job_status(submitted_jobs, blocked_jobs, canceled_jobs, completed_job_names, hpc_job_ids, batch_index)[source]

Update the job status in the config file.

Parameters:
  • submitted_jobs (list) – list of Job

  • blocked_jobs (list) – list of Job

  • canceled_jobs (list) – list of Job

  • completed_job_names (set) – set of str

  • hpc_job_ids (list) – list of str of newly submitted job IDs

  • batch_index (int) – next batch index