gaps.hpc.SLURM

class SLURM(user=None, queue_dict=None)

Bases: HpcJobManager

Subclass for SLURM subprocess jobs.

Parameters:
  • user (str | None, optional) – HPC username. If None, the username is retrieved with getpass.getuser(). By default, None.

  • queue_dict (dict | None, optional) – Parsed HPC queue dictionary, as returned by parse_queue_str(). If None, the queue info is retrieved from the hardware. By default, None.

Methods

cancel(arg)

Cancel a job.

check_status_using_job_id(job_id)

Check the status of a job using the HPC queue and job ID.

check_status_using_job_name(job_name)

Check the status of a job using the HPC queue and job name.

make_script_str(name, cmd, allocation, walltime)

Generate the SLURM submission script.

parse_queue_str(queue_str)

Parse the hardware queue string into a nested dictionary.

query_queue()

Run the HPC queue command and return the raw stdout string.

reset_query_cache()

Reset the query dict cache so that hardware is queried again.

submit(name[, keep_sh])

Submit a job on the HPC.

Attributes

COLUMN_HEADERS

COMMANDS

MAX_NAME_LEN

Q_SUBMITTED_STATUS

SHELL_FILENAME_FMT

USER

queue

HPC queue keyed by job ids with values as job properties.

query_queue()

Run the HPC queue command and return the raw stdout string.

Returns:

stdout (str) – HPC queue output string. Can be split on line breaks to get a list.

make_script_str(name, cmd, allocation, walltime, qos='normal', memory=None, feature=None, stdout_path='./stdout', conda_env=None, sh_script=None)

Generate the SLURM submission script.

Parameters:
  • name (str) – SLURM job name.

  • cmd (str) – Command to be submitted in the SLURM shell script. Example: ‘python -m reV.generation.cli_gen’

  • allocation (str) – HPC allocation account. Example: ‘rev’.

  • walltime (int | float) – Node walltime request in hours. Example: 4.

  • qos ({“normal”, “high”}, optional) – Quality of service specification for the job. Jobs with “high” priority are charged at 2x the rate. By default, “normal”.

  • memory (int, optional) – Node memory request in GB. By default, None.

  • feature (str, optional) – Additional flags for the SLURM job. Format is “--partition=debug” or “--depend=[state:job_id]”. Do not use this input to specify QOS; use the qos input instead. By default, None.

  • stdout_path (str, optional) – Path where the .stdout and .stderr files are written. By default, DEFAULT_STDOUT_PATH ('./stdout').

  • conda_env (str, optional) – Conda environment to activate. By default, None.

  • sh_script (str, optional) – Script to run before executing command. By default, None.

Returns:

str – SLURM script to submit.
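The parameters above map naturally onto `#SBATCH` directives. The block below is a minimal sketch of that mapping, not the gaps implementation: `sketch_script` and `hours_to_slurm_time` are hypothetical names, the directive layout and log file names are assumptions, and the script gaps actually generates may differ. The conda_env and sh_script options are left out for brevity.

```python
def hours_to_slurm_time(hours):
    """Convert a walltime in hours (int or float) to an HH:MM:SS string."""
    total_minutes = int(round(hours * 60))
    return "{:02d}:{:02d}:00".format(total_minutes // 60, total_minutes % 60)


def sketch_script(name, cmd, allocation, walltime, qos="normal",
                  memory=None, feature=None, stdout_path="./stdout"):
    """Build an illustrative SLURM submission script (hypothetical layout)."""
    lines = [
        "#!/bin/bash",
        "#SBATCH --account={}".format(allocation),
        "#SBATCH --time={}".format(hours_to_slurm_time(walltime)),
        "#SBATCH --job-name={}".format(name),
        "#SBATCH --nodes=1",
        # %j is the sbatch filename pattern for the job ID
        "#SBATCH --output={}/{}_%j.o".format(stdout_path, name),
        "#SBATCH --error={}/{}_%j.e".format(stdout_path, name),
        "#SBATCH --qos={}".format(qos),
    ]
    if memory is not None:
        # memory is given in GB, so append the G unit suffix
        lines.append("#SBATCH --mem={}G".format(memory))
    if feature is not None:
        # feature is passed through verbatim, e.g. "--partition=debug"
        lines.append("#SBATCH {}".format(feature))
    lines.append(cmd)
    return "\n".join(lines) + "\n"


script = sketch_script("demo", "python -m reV.generation.cli_gen", "rev", 4,
                       memory=90, feature="--partition=debug")
print(script)
```

Keeping qos as its own argument, rather than folding it into feature, mirrors the note above that QOS should not be specified through feature.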

cancel(arg)

Cancel a job.

Parameters:

arg (int | list | str) – Job id(s) to cancel. Can be a single integer job id, a list of integer job ids, ‘all’ to cancel all jobs, or a feature string (e.g. ‘-p short’) to cancel all jobs with that feature.
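The accepted forms of arg can be pictured as a small dispatch. The sketch below only builds `scancel` command strings and never runs them; `cancel_commands` is a hypothetical helper, and the way gaps handles each case internally (in particular ‘all’) is an assumption here, not something this page confirms.

```python
def cancel_commands(arg, user):
    """Return a list of scancel command strings for the given arg (sketch)."""
    if isinstance(arg, (list, tuple)):
        # A list of job ids: handle each entry on its own
        cmds = []
        for item in arg:
            cmds.extend(cancel_commands(item, user))
        return cmds
    if str(arg).lower() == "all":
        # Assumption: "all" means every job owned by this user
        return ["scancel -u {}".format(user)]
    if isinstance(arg, str) and arg.strip().startswith("-"):
        # A feature flag such as "-p short", limited to this user's jobs
        return ["scancel -u {} {}".format(user, arg.strip())]
    # Otherwise treat the value as a single integer job id
    return ["scancel {}".format(int(arg))]


print(cancel_commands([101, 102], "alice"))
print(cancel_commands("-p short", "alice"))
```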

check_status_using_job_id(job_id)

Check the status of a job using the HPC queue and job ID.

Parameters:

job_id (int) – Job integer ID number.

Returns:

status (str | None) – Queue job status string or None if not found.

check_status_using_job_name(job_name)

Check the status of a job using the HPC queue and job name.

Parameters:

job_name (str) – Job name string.

Returns:

status (str | None) – Queue job status string or None if not found.
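Both status checks amount to a lookup in the parsed queue dictionary that returns None when nothing matches. A minimal sketch of that idea follows; the helper names and the "NAME" and "ST" column keys are assumptions made for illustration and are not taken from gaps.

```python
def status_by_id(queue, job_id):
    """Look up a job's status by integer id; None if the id is not queued."""
    job = queue.get(int(job_id))
    return None if job is None else job.get("ST")


def status_by_name(queue, job_name):
    """Look up a job's status by name; None if no job has that name."""
    for props in queue.values():
        if props.get("NAME") == job_name:
            return props.get("ST")
    return None


# Hypothetical parsed queue: integer job ids mapped to column dictionaries
sample_queue = {
    101: {"JOBID": "101", "NAME": "gen_run", "ST": "R"},
    102: {"JOBID": "102", "NAME": "agg_run", "ST": "PD"},
}

print(status_by_id(sample_queue, 101))
print(status_by_name(sample_queue, "missing_job"))
```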

classmethod parse_queue_str(queue_str)

Parse the hardware queue string into a nested dictionary.

This function parses the queue output string into a dictionary keyed by integer job ids with values as dictionaries of job properties (queue printout columns).

Parameters:

queue_str (str) – HPC queue output string. Typically a space-delimited string with line breaks.

Returns:

queue_dict (dict) – HPC queue parsed into dictionary format keyed by integer job ids with values as dictionaries of job properties (queue printout columns).
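To make the nested structure concrete, here is a small stand-alone sketch that parses a space-delimited printout into a dictionary keyed by integer job id. It is not the gaps parser: `parse_queue_sketch`, the sample text, and the "JOBID" header name are assumptions, and the sketch only handles values that contain no spaces.

```python
def parse_queue_sketch(queue_str):
    """Parse a space-delimited queue printout into {job_id: {column: value}}."""
    lines = [line for line in queue_str.strip().split("\n") if line.strip()]
    if not lines:
        return {}
    headers = lines[0].split()
    queue = {}
    for row in lines[1:]:
        # Pair each column header with the value in the same position
        props = dict(zip(headers, row.split()))
        queue[int(props["JOBID"])] = props
    return queue


# Hypothetical queue output with a header row and two jobs
sample = (
    "JOBID NAME ST\n"
    "101 gen_run R\n"
    "102 agg_run PD\n"
)
parsed = parse_queue_sketch(sample)
print(parsed[101])
```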

property queue

HPC queue keyed by job ids with values as job properties.

Type:

dict

reset_query_cache()

Reset the query dict cache so that hardware is queried again.
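The relationship between the queue property and reset_query_cache() can be pictured as a lazily populated cache. The class below is a hypothetical stand-in written for illustration; `CachedQueue` and `query_func` are invented names, and the real class may store its cache differently.

```python
class CachedQueue:
    """Illustrative lazy cache: query once, reuse until reset."""

    def __init__(self, query_func, queue_dict=None):
        self._query_func = query_func
        self._queue = queue_dict  # a pre-parsed queue skips the first query

    @property
    def queue(self):
        if self._queue is None:
            # Cache is empty, so ask the (stand-in) hardware query
            self._queue = self._query_func()
        return self._queue

    def reset_query_cache(self):
        # Dropping the cache forces a fresh query on the next access
        self._queue = None


calls = []


def fake_query():
    """Stand-in for a hardware query; records how often it is called."""
    calls.append(1)
    return {101: {"ST": "R"}}


manager = CachedQueue(fake_query)
manager.queue
manager.queue  # served from the cache, no second query
manager.reset_query_cache()
manager.queue  # queried again after the reset
print(len(calls))
```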

submit(name, keep_sh=False, **kwargs)

Submit a job on the HPC.

Parameters:
  • name (str) – HPC job name.

  • keep_sh (bool, optional) – Option to keep the submission script on disk. By default, False.

  • **kwargs – Extra keyword-argument pairs to be passed to make_script_str().

Returns:

  • out (str) – Standard output from submission. If submitted successfully, this is the Job ID.

  • err (str) – Standard error. This is an empty string if the job was submitted successfully.
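A caller would typically check err before trusting out. The helper below is a minimal sketch of that check, assuming the job ID comes back as a string of digits; `job_id_from_submission` is a hypothetical name and not part of gaps.

```python
def job_id_from_submission(out, err):
    """Return the integer job ID, or raise if the submission reported an error."""
    if err.strip():
        raise RuntimeError("Submission failed: {}".format(err.strip()))
    return int(out.strip())


# Example values shaped like the documented (out, err) return pair
job_id = job_id_from_submission("12345\n", "")
print(job_id)
```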