rex.utilities.hpc.HpcJobManager

class HpcJobManager(user=None, queue_dict=None)[source]

Bases: rex.utilities.execution.SubprocessManager, abc.ABC

Abstract HPC job manager framework

Parameters
  • user (str | None) – HPC username. If None, the username is obtained from getpass.getuser().

  • queue_dict (dict | None) – Parsed HPC queue dictionary (qstat for PBS or squeue for SLURM) from parse_queue_str(). None will get the queue from PBS or SLURM.

Methods

check_status([job_id, job_name])

Check the status of an HPC job using the HPC queue.

format_walltime(hours)

Get the SLURM walltime string in format "HH:MM:SS"

make_path(d)

Make a directory tree if it doesn't exist.

make_sh(fname, script)

Make a shell script (.sh file) to execute a subprocess.

parse_queue_str(queue_str[, keys])

Parse the qstat or squeue output string into a dict format keyed by integer job id with nested dictionary of job properties (queue printout columns).

query_queue([job_name, user, qformat, skip_rows])

Run the HPC queue command and return the raw stdout string.

rm(fname)

Remove a file.

s(s)

Format input as a str with appropriate quote types for Python CLI entry.

submit(cmd[, background, background_stdout])

Open a subprocess and submit a command.

Attributes

MAX_NAME_LEN

QCOL_ID

QCOL_NAME

QCOL_STATUS

QSKIP

USER

queue

Get the HPC queue parsed into dict format keyed by integer job id

queue_job_ids

Get a list of the job integer ids in the queue

queue_job_names

Get a list of the job names in the queue

classmethod parse_queue_str(queue_str, keys=0)[source]

Parse the qstat or squeue output string into a dict format keyed by integer job id with nested dictionary of job properties (queue printout columns).

Parameters
  • queue_str (str) – HPC queue output string (qstat for PBS or squeue for SLURM). Typically a space-delimited string with line breaks.

  • keys (list | int) – Argument to set the queue job attributes (column headers). This defaults to an integer which says which row index contains the space-delimited column headers. Can also be a list to explicitly set the column headers.

Returns

queue_dict (dict) – HPC queue parsed into dictionary format keyed by integer job id with nested dictionary of job properties (queue printout columns).
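The parsing described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name parse_queue_sketch, the sample column headers, and the assumption that the job id sits in the first column are all hypothetical.

```python
def parse_queue_sketch(queue_str, keys=0):
    """Parse space-delimited queue text into {job_id: {column: value}}."""
    lines = queue_str.strip().splitlines()
    if isinstance(keys, int):
        # keys is the index of the row holding the column headers
        header = lines[keys].split()
        body = lines[keys + 1:]
    else:
        # keys is an explicit list of column headers; every row is data
        header = list(keys)
        body = lines

    queue = {}
    for line in body:
        row = line.split()
        if not row:
            continue  # skip blank lines
        # Assumption: the first column holds the integer job id
        queue[int(row[0])] = dict(zip(header, row))
    return queue


# Hypothetical queue printout with a header row at index 0
sample = """JOBID NAME ST
101 run_a R
102 run_b PD"""
queue = parse_queue_sketch(sample)
```

Passing a list for keys, for example parse_queue_sketch("7 x R", keys=["JOBID", "NAME", "ST"]), treats every row as a job row.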

abstract query_queue(job_name=None, user=None, qformat=None, skip_rows=None)[source]

Run the HPC queue command and return the raw stdout string.

Parameters
  • job_name (str | None) – Optional job name to look up in the queue (not limited to the 8 characters shown in the printout), or None to show the user’s whole queue.

  • user (str | None) – HPC username. If None, the username is obtained from getpass.getuser().

  • qformat (str | None) – Queue format string specification. Changing this from the default (None) could have adverse effects!

  • skip_rows (int | list | None) – Optional row index values to skip.

Returns

stdout (str) – HPC queue output string. Can be split on line breaks to get a list of rows.
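Because query_queue is abstract, each scheduler-specific subclass has to supply its own queue command. As a rough sketch of what a SLURM variant might assemble, the hypothetical helper below only builds the squeue command string. It uses the standard squeue options -u, --name and --format, and does not reflect the library's actual defaults or its handling of skip_rows.

```python
import getpass
import shlex


def build_squeue_cmd(job_name=None, user=None, qformat=None):
    """Build a squeue command string (hypothetical helper, not part of rex)."""
    if user is None:
        # Same fallback the parameter description mentions
        user = getpass.getuser()

    cmd = ["squeue", "-u", user]
    if job_name is not None:
        cmd.append("--name={}".format(job_name))
    if qformat is not None:
        cmd.append("--format={}".format(qformat))

    # Quote each part so the string is safe to hand to a shell
    return " ".join(shlex.quote(part) for part in cmd)


cmd = build_squeue_cmd(job_name="run_a", user="alice")
```

The resulting string could then be run through a subprocess call, with the raw stdout returned for parsing.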

property queue

Get the HPC queue parsed into dict format keyed by integer job id

Returns

queue (dict) – HPC queue parsed into dictionary format keyed by integer job id with nested dictionary of job properties (queue printout columns).

property queue_job_names

Get a list of the job names in the queue

property queue_job_ids

Get a list of the job integer ids in the queue

check_status(job_id=None, job_name=None)[source]

Check the status of an HPC job using the HPC queue.

Parameters
  • job_id (int | None) – Job integer ID number (preferred input)

  • job_name (str | None) – Job name string.

Returns

status (str | NoneType) – Queue job status str or None if not found. SLURM status strings: PD, R, CG (pending, running, completing). PBS status strings: Q, R, C (queued, running, complete).
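The lookup can be pictured as a search over the parsed queue dictionary. The sketch below is an assumption-laden illustration: check_status_sketch is a hypothetical function, and the column names "NAME" and "ST" stand in for whatever QCOL_NAME and QCOL_STATUS are set to in a concrete subclass.

```python
def check_status_sketch(queue, job_id=None, job_name=None,
                        col_name="NAME", col_status="ST"):
    """Return the status string for a job, or None if it is not queued."""
    if job_id is not None:
        # Preferred path: direct lookup by integer job id
        job = queue.get(int(job_id))
        return None if job is None else job.get(col_status)

    if job_name is not None:
        # Fallback: scan the queue for the first matching job name
        for job in queue.values():
            if job.get(col_name) == job_name:
                return job.get(col_status)

    return None


# Hypothetical parsed queue keyed by integer job id
queue = {
    101: {"NAME": "run_a", "ST": "R"},
    102: {"NAME": "run_b", "ST": "PD"},
}
status_by_id = check_status_sketch(queue, job_id=101)
status_by_name = check_status_sketch(queue, job_name="run_b")
missing = check_status_sketch(queue, job_id=999)
```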

static format_walltime(hours)

Get the SLURM walltime string in format “HH:MM:SS”

Parameters

hours (float | int) – Requested number of job hours.

Returns

walltime (str) – SLURM walltime request in format “HH:MM:SS”
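One way to produce such a string is shown below. This is a sketch only: format_walltime_sketch is a hypothetical name, and the library may round fractional hours differently (for example, to whole minutes).

```python
def format_walltime_sketch(hours):
    """Convert a number of hours into an "HH:MM:SS" string."""
    # Work in whole seconds to avoid floating point drift
    total_seconds = int(round(hours * 3600))
    h, remainder = divmod(total_seconds, 3600)
    m, s = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(h, m, s)


walltime = format_walltime_sketch(1.5)
```

Here 1.5 hours becomes "01:30:00", and a whole-day request of 24 hours becomes "24:00:00".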

static make_path(d)

Make a directory tree if it doesn’t exist.

Parameters

d (str) – Directory tree to check and potentially create.

static make_sh(fname, script)

Make a shell script (.sh file) to execute a subprocess.

Parameters
  • fname (str) – Name of the .sh file to create.

  • script (str) – Contents to be written into the .sh file.

static rm(fname)

Remove a file.

Parameters

fname (str) – Filename (with path) to remove.
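The three file helpers above (make_path, make_sh, rm) are thin wrappers around ordinary filesystem operations. The standard-library sketch below shows roughly equivalent behavior. The function names with the _sketch suffix, the directory name, and the script contents are all hypothetical, and the library may write additional content such as a shebang line.

```python
import os
import tempfile


def make_path_sketch(d):
    """Create a directory tree if it does not exist yet."""
    os.makedirs(d, exist_ok=True)


def make_sh_sketch(fname, script):
    """Write the script contents into a .sh file."""
    with open(fname, "w") as f:
        f.write(script)


def rm_sketch(fname):
    """Remove a file if it exists."""
    if os.path.exists(fname):
        os.remove(fname)


# Work inside a temporary directory so nothing else is touched
job_dir = os.path.join(tempfile.mkdtemp(), "jobs", "run_a")
make_path_sketch(job_dir)

sh_path = os.path.join(job_dir, "run_a.sh")
make_sh_sketch(sh_path, "echo hello\n")

with open(sh_path) as f:
    content = f.read()

rm_sketch(sh_path)
```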

static s(s)

Format input as a str with appropriate quote types for Python CLI entry.

Examples

  • list, tuple -> "['one', 'two']"

  • dict -> "{'key': 'val'}"

  • int, float, None -> '0'

  • str, other -> 'string'

static submit(cmd, background=False, background_stdout=False)

Open a subprocess and submit a command.

Parameters
  • cmd (str) – Command to be submitted using python subprocess.

  • background (bool) – Flag to submit the subprocess in the background. stdout and stderr will be empty strings if this is True.

  • background_stdout (bool) – Flag to capture the stdout/stderr from the background process in a nohup.out file.

Returns

  • stdout (str) – Subprocess standard output. This is decoded from the subprocess stdout with rstrip.

  • stderr (str) – Subprocess standard error. This is decoded from the subprocess stderr with rstrip. After decoding/rstrip, this will be empty if the subprocess doesn’t return an error.
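The foreground case described above (run a command, then decode and rstrip both streams) can be sketched with the standard subprocess module. This is an illustration under assumptions, not the library's code: submit_sketch is a hypothetical name, the background options are left out, and POSIX-style command splitting is assumed.

```python
import shlex
import subprocess
import sys


def submit_sketch(cmd):
    """Run a command and return its decoded, right-stripped stdout and stderr."""
    proc = subprocess.Popen(
        shlex.split(cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    stdout, stderr = proc.communicate()
    return stdout.decode("utf-8").rstrip(), stderr.decode("utf-8").rstrip()


# Use the current Python interpreter as a portable test command
cmd = "{} -c {}".format(shlex.quote(sys.executable),
                        shlex.quote("print('hello')"))
stdout, stderr = submit_sketch(cmd)
```

A caller would typically check whether stderr is non-empty before trusting stdout.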