HPC Profiles Reference

Complete reference for the HPC profile system and its CLI commands.

Overview

HPC profiles contain pre-configured knowledge about High-Performance Computing systems, enabling automatic Slurm scheduler generation based on job resource requirements.

CLI Commands

torc hpc list

List all available HPC profiles.

torc hpc list [OPTIONS]

Options:

| Option | Description |
|--------|-------------|
| -f, --format <FORMAT> | Output format: table or json |

Output columns:

  • Name: Profile identifier used in commands
  • Display Name: Human-readable name
  • Partitions: Number of configured partitions
  • Detected: Whether current system matches this profile
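
For example, to print the profile list as JSON for use in scripts:

torc hpc list --format json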

torc hpc detect

Detect the current HPC system.

torc hpc detect [OPTIONS]

Options:

| Option | Description |
|--------|-------------|
| -f, --format <FORMAT> | Output format: table or json |

Returns the detected profile name, or indicates no match.


torc hpc show

Display detailed information about an HPC profile.

torc hpc show <PROFILE> [OPTIONS]

Arguments:

| Argument | Description |
|----------|-------------|
| <PROFILE> | Profile name (e.g., kestrel) |

Options:

| Option | Description |
|--------|-------------|
| -f, --format <FORMAT> | Output format: table or json |

torc hpc partitions

List partitions for an HPC profile.

torc hpc partitions <PROFILE> [OPTIONS]

Arguments:

| Argument | Description |
|----------|-------------|
| <PROFILE> | Profile name (e.g., kestrel) |

Options:

| Option | Description |
|--------|-------------|
| -f, --format <FORMAT> | Output format: table or json |

Output columns:

  • Name: Partition name
  • CPUs/Node: CPU cores per node
  • Mem/Node: Memory per node
  • Max Walltime: Maximum job duration
  • GPUs: GPU count and type (if applicable)
  • Shared: Whether partition supports shared jobs
  • Notes: Special requirements or features

torc hpc match

Find partitions matching resource requirements.

torc hpc match <PROFILE> [OPTIONS]

Arguments:

| Argument | Description |
|----------|-------------|
| <PROFILE> | Profile name (e.g., kestrel) |

Options:

| Option | Description |
|--------|-------------|
| --cpus <N> | Required CPU cores |
| --memory <SIZE> | Required memory (e.g., 64g, 512m) |
| --walltime <DURATION> | Required walltime (e.g., 2h, 4:00:00) |
| --gpus <N> | Required GPUs |
| -f, --format <FORMAT> | Output format: table or json |

Memory format: <number><unit> where unit is k, m, g, or t (case-insensitive).

Walltime formats:

  • HH:MM:SS (e.g., 04:00:00)
  • <N>h (e.g., 4h)
  • <N>m (e.g., 30m)
  • <N>s (e.g., 3600s)
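
For example, to find Kestrel partitions that can accommodate a job needing 64 CPUs, 128 GB of memory, 4 GPUs, and 4 hours of walltime:

torc hpc match kestrel --cpus 64 --memory 128g --walltime 4h --gpus 4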

torc slurm generate

Generate Slurm schedulers for a workflow based on job resource requirements.

torc slurm generate [OPTIONS] --account <ACCOUNT> <WORKFLOW_FILE>

Arguments:

| Argument | Description |
|----------|-------------|
| <WORKFLOW_FILE> | Path to workflow specification file (YAML, JSON, or JSON5) |

Options:

| Option | Description |
|--------|-------------|
| --account <ACCOUNT> | Slurm account to use (required) |
| --profile <PROFILE> | HPC profile to use (auto-detected if not specified) |
| -o, --output <FILE> | Output file path (prints to stdout if not specified) |
| --no-actions | Don't add workflow actions for scheduling nodes |
| --force | Overwrite existing schedulers in the workflow |

Generated artifacts:

  1. Slurm schedulers: One for each unique resource requirement
  2. Job scheduler assignments: Each job linked to appropriate scheduler
  3. Workflow actions: on_workflow_start/schedule_nodes actions (unless --no-actions)

Scheduler naming: <resource_requirement_name>_scheduler
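
For example (with placeholder file names), to generate schedulers for a workflow using the auto-detected profile and write the result to a new file:

torc slurm generate --account myproject -o workflow_with_schedulers.yaml workflow.yaml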


Built-in Profiles

NREL Kestrel

Profile name: kestrel

Detection: Environment variable NREL_CLUSTER=kestrel

Partitions:

| Partition | CPUs | Memory | Max Walltime | GPUs | Notes |
|-----------|------|--------|--------------|------|-------|
| debug | 104 | 240 GB | 1h | - | Quick testing |
| short | 104 | 240 GB | 4h | - | Short jobs |
| standard | 104 | 240 GB | 48h | - | General workloads |
| long | 104 | 240 GB | 240h | - | Extended jobs |
| medmem | 104 | 480 GB | 48h | - | Medium memory |
| bigmem | 104 | 2048 GB | 48h | - | High memory |
| shared | 104 | 240 GB | 48h | - | Shared node access |
| hbw | 104 | 240 GB | 48h | - | High-bandwidth memory, min 10 nodes |
| nvme | 104 | 240 GB | 48h | - | NVMe local storage |
| gpu-h100 | 128 | 2 TB | 48h | 4x H100 | GPU compute |

Node specifications:

  • Standard nodes: 104 cores (2x Intel Xeon Sapphire Rapids), 240 GB RAM
  • GPU nodes: 4x NVIDIA H100 80GB HBM3, 128 cores, 2 TB RAM

Configuration

Custom Profiles

Don’t see your HPC system here? Please request built-in support so everyone benefits. In the meantime, see the Custom HPC Profile Tutorial to create your own profile.

Define custom profiles in your Torc configuration file:

# ~/.config/torc/config.toml

[client.hpc.custom_profiles.mycluster]
display_name = "My Cluster"
description = "Description of the cluster"
detect_env_var = "CLUSTER_NAME=mycluster"
detect_hostname = ".*\\.mycluster\\.org"
default_account = "myproject"

[[client.hpc.custom_profiles.mycluster.partitions]]
name = "compute"
cpus_per_node = 64
memory_mb = 256000
max_walltime_secs = 172800
shared = false

[[client.hpc.custom_profiles.mycluster.partitions]]
name = "gpu"
cpus_per_node = 32
memory_mb = 128000
max_walltime_secs = 86400
gpus_per_node = 4
gpu_type = "A100"
shared = false
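
The custom profile should then be usable by name with the same CLI commands as the built-in profiles, for example:

torc hpc partitions mycluster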

Profile Override

Override settings for built-in profiles:

[client.hpc.profile_overrides.kestrel]
default_account = "my_default_account"

Configuration Options

[client.hpc] Section:

| Option | Type | Description |
|--------|------|-------------|
| profile_overrides | table | Override settings for built-in profiles |
| custom_profiles | table | Define custom HPC profiles |

Profile override options:

| Option | Type | Description |
|--------|------|-------------|
| default_account | string | Default Slurm account for this profile |

Custom profile options:

| Option | Type | Required | Description |
|--------|------|----------|-------------|
| display_name | string | No | Human-readable name |
| description | string | No | Profile description |
| detect_env_var | string | No | Environment variable for detection (NAME=value) |
| detect_hostname | string | No | Regex pattern for hostname detection |
| default_account | string | No | Default Slurm account |
| partitions | array | Yes | List of partition configurations |

Partition options:

| Option | Type | Required | Description |
|--------|------|----------|-------------|
| name | string | Yes | Partition name |
| cpus_per_node | int | Yes | CPU cores per node |
| memory_mb | int | Yes | Memory per node in MB |
| max_walltime_secs | int | Yes | Maximum walltime in seconds |
| gpus_per_node | int | No | GPUs per node |
| gpu_type | string | No | GPU model (e.g., "H100") |
| shared | bool | No | Whether partition supports shared jobs |
| min_nodes | int | No | Minimum required nodes |
| requires_explicit_request | bool | No | Must be explicitly requested |

Resource Matching Algorithm

When generating schedulers, Torc uses the following algorithm to match resource requirements to partitions (an illustrative sketch appears after the list):

  1. Filter by resources: Partitions must satisfy:

    • CPUs >= required CPUs
    • Memory >= required memory
    • GPUs >= required GPUs (if specified)
    • Max walltime >= required runtime
  2. Exclude debug partitions: Unless no other partition matches

  3. Prefer best fit:

    • Partitions that exactly match resource needs
    • Non-shared partitions over shared
    • Shorter max walltime over longer
  4. Handle special requirements:

    • GPU jobs only match GPU partitions
    • Respect requires_explicit_request flag
    • Honor min_nodes constraints
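
The Python sketch below illustrates these rules; it is not Torc's actual implementation. The Partition fields mirror the partition options documented above, while the match_partition helper, the name-based detection of debug partitions, and the tie-breaking order are assumptions made for illustration (min_nodes handling is omitted).

# A minimal sketch of the selection rules above, not Torc's actual code.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Partition:
    name: str
    cpus_per_node: int
    memory_mb: int
    max_walltime_secs: int
    gpus_per_node: int = 0
    shared: bool = False
    requires_explicit_request: bool = False

def match_partition(partitions: List[Partition], cpus: int, memory_mb: int,
                    walltime_secs: int, gpus: int = 0) -> Optional[Partition]:
    def satisfies(p: Partition) -> bool:
        return (p.cpus_per_node >= cpus
                and p.memory_mb >= memory_mb
                and p.max_walltime_secs >= walltime_secs
                and p.gpus_per_node >= gpus
                # Partitions that must be explicitly requested are never auto-selected.
                and not p.requires_explicit_request)

    candidates = [p for p in partitions if satisfies(p)]
    # Exclude debug partitions unless nothing else matches
    # (detecting them by name is an assumption for this sketch).
    non_debug = [p for p in candidates if "debug" not in p.name]
    if non_debug:
        candidates = non_debug
    # Prefer the tightest fit (least unused CPU/memory), then non-shared
    # partitions, then the shortest maximum walltime.
    candidates.sort(key=lambda p: (p.cpus_per_node - cpus,
                                   p.memory_mb - memory_mb,
                                   p.shared,
                                   p.max_walltime_secs))
    return candidates[0] if candidates else None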

Generated Scheduler Format

Example generated Slurm scheduler:

slurm_schedulers:
  - name: medium_scheduler
    account: myproject
    nodes: 1
    mem: 64g
    walltime: 04:00:00
    gres: null
    partition: null  # Let Slurm choose based on resources

Corresponding workflow action:

actions:
  - trigger_type: on_workflow_start
    action_type: schedule_nodes
    scheduler: medium_scheduler
    scheduler_type: slurm
    num_allocations: 1

Runtime Format Parsing

Resource requirements use ISO 8601 duration format for runtime:

| Format | Example | Meaning |
|--------|---------|---------|
| PTnH | PT4H | 4 hours |
| PTnM | PT30M | 30 minutes |
| PTnS | PT3600S | 3600 seconds |
| PTnHnM | PT2H30M | 2 hours 30 minutes |
| PnDTnH | P1DT12H | 1 day 12 hours |

Generated walltime uses HH:MM:SS format (e.g., 04:00:00).
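
As an illustration only (not Torc's internal parser), the Python sketch below converts the ISO 8601 duration forms from the table above into HH:MM:SS walltime strings; the walltime_from_iso8601 helper and its regular expression are assumptions made for this example.

# Illustrative conversion of ISO 8601 durations (PnDTnHnMnS) to HH:MM:SS walltimes.
# This mirrors the formats in the table above; it is not Torc's actual parser.
import re

_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)

def walltime_from_iso8601(duration: str) -> str:
    m = _DURATION_RE.match(duration)
    if not m or not any(m.groupdict().values()):
        raise ValueError(f"unsupported duration: {duration}")
    parts = {k: int(v or 0) for k, v in m.groupdict().items()}
    total = (parts["days"] * 86400 + parts["hours"] * 3600
             + parts["minutes"] * 60 + parts["seconds"])
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

# Examples from the table above:
# walltime_from_iso8601("PT4H")    -> "04:00:00"
# walltime_from_iso8601("PT2H30M") -> "02:30:00"
# walltime_from_iso8601("P1DT12H") -> "36:00:00"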


See Also