jade.jobs.cluster.Cluster
- class jade.jobs.cluster.Cluster(config, job_status=None, lock_timeout=300)
Bases: object
Represents the state of the nodes running jobs.
Internal constructor. Use create() or deserialize().
Methods
Return True if all jobs have been submitted.
Return True if the current system is the submitter.
Return True if all jobs are complete.
complete_hpc_job_id
(job_id)Complete an HPC job.
create
(path, jade_config[, pipeline_stage_num])Create a new instance of a Cluster.
delete_files_internal
()
demote_from_submitter
([serialize])Clear the submitter, which must be the current system.
deserialize
(path[, ...])Deserialize an existing Cluster from a file.
Deserialize the current job status.
deserialize_submission_groups
(directory)Return the submission groups being used by the cluster.
do_action_under_lock
(path, func, *args, **kwargs)Run a function while holding the lock.
get_config_file
(path)Return the path to the cluster config file.
get_job_status_file
(path)Return the path to the job status file.
get_lock_file
(path)Return the path to the lock file for the cluster config.
get_status_summary
([include_jobs])Return a dict that summarizes current status.
Return True if the config has a submitter assigned.
Return True if the submission is canceled.
Return True if the submission is complete.
Yields each Job.
iter_jobs
([state])Yields each Job.
Mark the submission as being canceled.
Mark the submission as being complete.
prepare_for_resubmission
(jobs_to_resubmit, ...)Reset the state of the cluster for resubmission of jobs.
promote_to_submitter
([serialize])Promote the current system to submitter.
serialize
(reason)Serialize the config to a file.
serialize_jobs
(reason)Serialize the job status to a file.
serialize_submission_groups
(directory)Serialize the submission groups so that they can be read without acquiring a lock.
update_job_status
(submitted_jobs, ...)Update the job status in the config file.
Attributes
CLUSTER_CONFIG_FILE
CONFIG_VERSION_FILE
JOB_STATUS_FILE
JOB_STATUS_VERSION_FILE
LOCK_FILE
SUBMITTER_GROUP_FILE
config
Return the ClusterConfig
config_file
Return the path to the cluster config file.
job_status
Return the JobStatus
- classmethod create(path, jade_config: JobConfiguration, pipeline_stage_num=None)
Create a new instance of a Cluster. Promotes itself to submitter.
- Parameters:
path (str) – Base directory for a JADE submission
jade_config (JobConfiguration)
pipeline_stage_num (None | int) – Stage number if the config is one stage of a pipeline; otherwise None
- classmethod deserialize(path, try_promote_to_submitter=False, deserialize_jobs=False)
Deserialize an existing Cluster from a file.
- Parameters:
path (str) – Base directory for a JADE submission
try_promote_to_submitter (bool) – Attempt to promote to submitter
deserialize_jobs (bool) – Deserialize current job status
- Returns:
cluster and a bool indicating whether promotion occurred
- Return type:
tuple
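The following is a minimal sketch of unpacking the documented return tuple. The helper name `load_cluster` and its `want_submitter` parameter are hypothetical; `cluster_cls` is assumed to be `jade.jobs.cluster.Cluster` and is passed in as an argument so the helper can be exercised without JADE installed.

```python
def load_cluster(cluster_cls, output_dir, want_submitter=False):
    """Deserialize a cluster and report whether promotion occurred.

    Hypothetical helper: cluster_cls is assumed to be
    jade.jobs.cluster.Cluster, and output_dir the base directory of a
    JADE submission.
    """
    # deserialize() is documented to return a (cluster, bool) tuple.
    cluster, promoted = cluster_cls.deserialize(
        output_dir,
        try_promote_to_submitter=want_submitter,
        deserialize_jobs=True,
    )
    return cluster, promoted
```

A caller that requested promotion should check `promoted` before acting as the submitter, since the request is only an attempt.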
- property config
Return the ClusterConfig
- property config_file
Return the path to the cluster config file.
- Return type:
str
- demote_from_submitter(serialize=True)
Clear the submitter, which must be the current system.
- static do_action_under_lock(path, func, *args, **kwargs)
Run a function while holding the lock.
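To illustrate the general pattern of running a function while holding a lock, here is a standard-library sketch. It is not JADE's implementation: `run_under_lock`, `poll_interval`, and the lock-file mechanism are assumptions made for illustration, and the 300-second default only mirrors the `lock_timeout` default in the constructor signature.

```python
import os
import time
from pathlib import Path


def run_under_lock(lock_file, func, *args, timeout=300, poll_interval=0.1, **kwargs):
    """Run func(*args, **kwargs) while holding an exclusive lock file.

    Illustrative stand-in only; not how JADE acquires its lock.
    """
    lock_file = Path(lock_file)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one caller at a time can create the lock file.
            fd = os.open(lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_file} within {timeout} s")
            time.sleep(poll_interval)
    try:
        return func(*args, **kwargs)
    finally:
        # Release the lock even if func raises.
        os.close(fd)
        lock_file.unlink()
```

The `try`/`finally` is the important part of the design: the lock is released whether the function returns or raises, so a failing action does not block other systems until the timeout.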
- static get_config_file(path)
Return the path to the cluster config file.
- Parameters:
path (str) – Base directory for a JADE submission
- static get_job_status_file(path)
Return the path to the job status file.
- Parameters:
path (str) – Base directory for a JADE submission
- static get_lock_file(path)
Return the path to the lock file for the cluster config.
- Parameters:
path (str) – Base directory for a JADE submission
- get_status_summary(include_jobs=False)
Return a dict that summarizes current status.
- Parameters:
include_jobs (bool) – Whether to include individual job status
- Return type:
dict
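As a small sketch of consuming the summary, the hypothetical helper below renders it as JSON. The keys of the returned dict are not documented here, so the code makes no assumption about them; `cluster` is assumed to be an already-loaded `Cluster` instance.

```python
import json


def format_status(cluster, include_jobs=False):
    """Return the cluster's status summary as pretty-printed JSON.

    Hypothetical helper; cluster is assumed to be a Cluster instance.
    """
    summary = cluster.get_status_summary(include_jobs=include_jobs)
    # default=str is a fallback for any value json cannot encode natively.
    return json.dumps(summary, indent=2, default=str)
```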
- iter_jobs(state=None)
Yields each Job.
- Parameters:
state (JobState) – If not None, only yield jobs that match this state.
- Yields:
Job
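Because `iter_jobs` is a generator, it can be consumed lazily. The sketch below counts jobs without building a list; `count_jobs` is a hypothetical helper, `cluster` is assumed to be a `Cluster` instance, and `state` would be a `JobState` member (its members are not listed on this page).

```python
def count_jobs(cluster, state=None):
    """Count the jobs in a cluster, optionally restricted to one state.

    Hypothetical helper; state is forwarded unchanged to iter_jobs().
    """
    # Consume the generator one job at a time instead of materializing it.
    return sum(1 for _ in cluster.iter_jobs(state=state))
```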
- property job_status
Return the JobStatus
- prepare_for_resubmission(jobs_to_resubmit, updated_blocking_jobs_by_name)
Reset the state of the cluster for resubmission of jobs.
- Parameters:
jobs_to_resubmit (set) – job names that will be resubmitted
updated_blocking_jobs_by_name (dict) – contains the blocking jobs for each job to be resubmitted
- promote_to_submitter(serialize=True)
Promote the current system to submitter.
- Returns:
True if promotion was successful
- Return type:
bool
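One plausible way to pair promotion with demotion is sketched below. This pairing is an assumed usage pattern, not something the documentation prescribes; `run_as_submitter` and `work` are hypothetical names, and `cluster` is assumed to be a `Cluster` instance.

```python
def run_as_submitter(cluster, work):
    """Run work(cluster) only if this system becomes the submitter.

    Hypothetical helper. Returns True if the work ran, False if
    promotion failed.
    """
    if not cluster.promote_to_submitter():
        return False
    try:
        work(cluster)
    finally:
        # Clear the submitter even if work() raises, so the role is not
        # left assigned to this system.
        cluster.demote_from_submitter()
    return True
```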
- serialize_submission_groups(directory)
Serialize the submission groups so that they can be read without acquiring a lock.
- Parameters:
directory (Path)
- static deserialize_submission_groups(directory)
Return the submission groups being used by the cluster.
- Parameters:
directory (Path)
- Returns:
list of SubmissionGroup
- Return type:
list
- update_job_status(submitted_jobs, blocked_jobs, canceled_jobs, completed_job_names, hpc_job_ids, batch_index)
Update the job status in the config file.
- Parameters:
submitted_jobs (list) – list of Job
blocked_jobs (list) – list of Job
canceled_jobs (list) – list of Job
completed_job_names (set) – set of str
hpc_job_ids (list) – list of str; IDs of newly submitted HPC jobs
batch_index (int) – next batch index