jade.hpc.hpc_manager.HpcManager

class jade.hpc.hpc_manager.HpcManager(submission_groups, output)

Bases: object

Manages HPC job submission and monitoring.

Methods

am_i_manager()

Return True if the current node is the manager node.

cancel_job(job_id)

Cancel the job with the given ID.

check_status([name, job_id])

Return the status of a job by name or ID.

check_statuses()

Check the statuses of all user jobs.

create_hpc_interface(config)

Returns an HPC implementation instance appropriate for the current environment.

get_hpc_config(submission_group_name)

Returns the HPC interface instance.

get_job_stats(job_id)

get_local_scratch()

Get path to local storage space.

get_manager_node(job_id)

Return the first node in the job.

list_active_nodes(job_id)

Return the nodes currently participating in the job.

submit(directory, name, script, ...[, wait, ...])

Submits scripts to the queue for execution.

Attributes

hpc_type

Return the type of HPC management system.

am_i_manager()

Return True if the current node is the manager node.

Return type:

bool

cancel_job(job_id)

Cancel the job with the given ID.

Parameters:

job_id (str)

Returns:

return code

Return type:

int

check_status(name=None, job_id=None)

Return the status of a job by name or ID.

Parameters:
  • name (str) – job name

  • job_id (str) – job ID

Return type:

HpcJobStatus
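
A rough sketch of polling check_status until a job reaches a state the caller treats as finished. The helper wait_until, the FakeManager stand-in, and the status strings are all hypothetical; a real HpcManager returns HpcJobStatus values, whose members are not listed on this page, so the completion test is passed in as a predicate rather than hard-coded.

```python
import time


def wait_until(manager, job_id, is_done, poll_interval=0.0, max_polls=10):
    """Poll check_status until is_done(status) is true or max_polls is reached."""
    status = None
    for _ in range(max_polls):
        status = manager.check_status(job_id=job_id)
        if is_done(status):
            break
        time.sleep(poll_interval)
    return status


class FakeManager:
    """Hypothetical stand-in that replays a fixed sequence of statuses."""

    def __init__(self, sequence):
        self._sequence = iter(sequence)

    def check_status(self, name=None, job_id=None):
        # Mirrors the documented signature; ignores its arguments.
        return next(self._sequence)


# Plain strings stand in for HpcJobStatus values (an assumption).
fake = FakeManager(["queued", "running", "complete"])
final = wait_until(fake, "101", is_done=lambda s: s == "complete")
```

On a real cluster a poll interval of several seconds or more is usually kinder to the scheduler than the zero used here for demonstration.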

check_statuses()

Check the statuses of all user jobs.

Returns:

Mapping of job_id to HpcJobStatus.

Return type:

dict
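
Since the result maps each job ID to a status, it can be inverted to see which jobs share a state. The sketch below assumes only the documented dict shape; group_by_status is a hypothetical helper, and plain strings stand in for HpcJobStatus values.

```python
from collections import defaultdict


def group_by_status(statuses):
    """Invert {job_id: status} into {status: [job_id, ...]} with sorted IDs."""
    grouped = defaultdict(list)
    for job_id, status in statuses.items():
        grouped[status].append(job_id)
    return {status: sorted(ids) for status, ids in grouped.items()}


# Hypothetical data in the shape check_statuses() is documented to return.
example = {"101": "running", "102": "queued", "103": "running"}
summary = group_by_status(example)
```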

get_hpc_config(submission_group_name)

Returns the HPC interface instance.

Parameters:

submission_group_name (str)

Return type:

HpcManagerInterface

get_local_scratch()

Get path to local storage space.

Return type:

str

get_manager_node(job_id)

Return the first node in the job.

Parameters:

job_id (str)

Returns:

list of node hostnames

Return type:

list

property hpc_type

Return the type of HPC management system.

Return type:

HpcType

list_active_nodes(job_id)

Return the nodes currently participating in the job. Order should be deterministic.

Parameters:

job_id (str)

Returns:

list of node hostnames

Return type:

list
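
If the order is deterministic, every node that lists the nodes of the same job sees the same sequence, so a node's index in that list can serve as its rank. The sketch below illustrates that idea with round-robin work assignment; assign_work, the hostnames, and the work items are hypothetical and not part of JADE.

```python
def assign_work(nodes, hostname, items):
    """Return the items a node should handle, using round-robin on node order."""
    rank = nodes.index(hostname)  # raises ValueError if hostname is not in the job
    return items[rank::len(nodes)]


# Hypothetical hostnames, in the shape list_active_nodes() is documented to return.
nodes = ["node1", "node2", "node3"]
items = ["a", "b", "c", "d", "e", "f", "g"]
shares = {n: assign_work(nodes, n, items) for n in nodes}
```

Each item lands on exactly one node without any communication between nodes, which only works if all of them agree on the list order.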

submit(directory, name, script, submission_group_name, wait=False, keep_submission_script=True, dry_run=False)

Submits scripts to the queue for execution.

Parameters:
  • directory (str) – directory to contain the submission script

  • name (str) – job name

  • script (str) – Script to execute.

  • submission_group_name (str)

  • wait (bool) – Wait for execution to complete.

  • keep_submission_script (bool) – Do not delete the submission script.

  • dry_run (bool) – Do not actually submit jobs. Just create the files.

Returns:

(job_id, submission status)

Return type:

tuple
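
A minimal sketch of a dry-run submission that unpacks the documented (job_id, submission status) tuple. FakeManager is a hypothetical stand-in that copies the documented signature so the snippet runs without a cluster; the job name, script name, group name "default", and the values it returns are invented for illustration, since this page does not say what a real dry run returns.

```python
import os
import tempfile


class FakeManager:
    """Hypothetical stand-in that mimics the documented submit() signature."""

    def submit(self, directory, name, script, submission_group_name,
               wait=False, keep_submission_script=True, dry_run=False):
        # Write a placeholder submission script into the target directory.
        path = os.path.join(directory, name + ".sh")
        with open(path, "w") as f:
            f.write("#!/bin/bash\n" + script + "\n")
        if not keep_submission_script:
            os.remove(path)
        # Return values are assumptions, not JADE's real behavior.
        job_id = None if dry_run else "12345"
        status = "dry-run" if dry_run else "submitted"
        return job_id, status


def submit_dry_run(manager, directory, name, script, group):
    """Create the submission files without queueing the job."""
    job_id, status = manager.submit(directory, name, script, group, dry_run=True)
    return job_id, status


with tempfile.TemporaryDirectory() as tmp:
    job_id, status = submit_dry_run(
        FakeManager(), tmp, "test_job", "run_jobs.sh", "default"
    )
    script_exists = os.path.exists(os.path.join(tmp, "test_job.sh"))
```

With JADE installed, the same helper could be handed a real HpcManager instance in place of FakeManager.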

static create_hpc_interface(config)

Returns an HPC implementation instance appropriate for the current environment.