Quick Start (HPC)
This guide walks you through running your first Torc workflow on an HPC cluster with Slurm. Jobs are submitted to Slurm and run on compute nodes.
For local execution (testing, development, or non-HPC environments), see Quick Start (Local).
Prerequisites
- Access to an HPC cluster with Slurm
- A Slurm account/allocation for submitting jobs
- Torc installed (see Installation)
Start the Server
On the login node, start a Torc server with a local database:
torc-server run --database torc.db --completion-check-interval-secs 5
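If the server needs to stay up for the duration of your workflow (the completion-check interval above suggests it runs continuously), one way to keep it alive after you log out is to background it. This is a sketch only; tmux or screen work just as well:

```bash
# Keep the Torc server running after you disconnect (sketch only;
# tmux/screen are alternatives). Same flags as above; the log file name is arbitrary.
nohup torc-server run --database torc.db --completion-check-interval-secs 5 \
  > torc-server.log 2>&1 &
```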
Note: For larger deployments, your team may provide a shared Torc server. In that case, skip this step and set TORC_API_URL to the shared server's address.
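If you are using a shared server, pointing your environment at it might look like the following (the host and port below are placeholders, not real values):

```bash
# Hypothetical address; substitute the URL your team provides.
export TORC_API_URL=http://torc.example.cluster:8080
```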
Check Your HPC Profile
Torc includes built-in profiles for common HPC systems. Check if your system is detected:
torc hpc detect
If detected, you’ll see your HPC system name. To see available partitions:
torc hpc partitions <profile-name>
Note: If your HPC system isn’t detected, see Custom HPC Profile or request built-in support.
Create a Workflow with Resource Requirements
Save this as workflow.yaml:
```yaml
name: hpc_hello_world
description: Simple HPC workflow

resource_requirements:
  - name: small
    num_cpus: 4
    memory: 8g
    runtime: PT30M

jobs:
  - name: job1
    command: echo "Hello from compute node!" && hostname
    resource_requirements: small

  - name: job2
    command: echo "Hello again!" && hostname
    resource_requirements: small
    depends_on: [job1]
```
Key differences from local workflows (a local-style contrast is sketched after this list):
- resource_requirements: Define CPU, memory, and runtime needs; runtime values use ISO 8601 duration syntax (PT30M is 30 minutes)
- Jobs reference these requirements by name
- Torc matches requirements to appropriate Slurm partitions
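For comparison, here is a minimal sketch of the same two jobs in the local style, assuming (as the list above implies) that the resource_requirements section can simply be omitted when you are not targeting Slurm:

```yaml
# Hypothetical local-style equivalent: jobs and dependencies only,
# no resource_requirements section.
name: hello_world
description: Simple local workflow

jobs:
  - name: job1
    command: echo "Hello!" && hostname

  - name: job2
    command: echo "Hello again!" && hostname
    depends_on: [job1]
```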
Submit the Workflow
Submit with your Slurm account:
torc submit-slurm --account <your-account> workflow.yaml
Torc will:
- Detect your HPC system
- Match job requirements to appropriate partitions
- Generate Slurm scheduler configurations (a rough Slurm-level equivalent is sketched after this list)
- Create and submit the workflow
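For intuition, the small requirement (4 CPUs, 8g memory, PT30M runtime) maps roughly onto the resource flags you would otherwise hand to sbatch yourself. This is only an illustration of that mapping, not the configuration Torc generates:

```bash
# Rough Slurm-level equivalent of the "small" requirement; Torc builds and
# submits its own configuration, so you do not run this yourself.
# my_job.sh is a placeholder script name.
sbatch --account=<your-account> \
  --cpus-per-task=4 \
  --mem=8G \
  --time=00:30:00 \
  my_job.sh
```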
Monitor Progress
Check workflow status:
torc workflows list
torc jobs list <workflow-id>
Or use the interactive TUI:
torc tui
Check Slurm queue:
squeue -u $USER
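For a wider view than the default, standard Slurm options apply; for example (column widths are arbitrary):

```bash
# Job ID, partition, name, state, elapsed time, node count, and reason/nodelist.
squeue -u $USER -o "%.18i %.12P %.30j %.8T %.10M %.6D %R"

# Estimated start times for jobs still pending.
squeue -u $USER --start
```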
View Results
Once jobs complete:
torc results list <workflow-id>
Job output is stored in the output/ directory by default.
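A quick way to look at what was written to disk; the exact layout under output/ depends on your Torc version and configuration, so treat this as a sketch:

```bash
# List everything under the default output directory.
ls -R output/

# Show the most recently modified entries first.
ls -lt output/ | head
```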
Example: Multi-Stage Pipeline
A more realistic workflow with different resource requirements per stage:
```yaml
name: analysis_pipeline
description: Data processing pipeline

resource_requirements:
  - name: light
    num_cpus: 4
    memory: 8g
    runtime: PT30M

  - name: compute
    num_cpus: 32
    memory: 64g
    runtime: PT2H

  - name: gpu
    num_cpus: 8
    num_gpus: 1
    memory: 32g
    runtime: PT1H

jobs:
  - name: preprocess
    command: python preprocess.py
    resource_requirements: light

  - name: train
    command: python train.py
    resource_requirements: gpu
    depends_on: [preprocess]

  - name: evaluate
    command: python evaluate.py
    resource_requirements: compute
    depends_on: [train]
```
Torc stages resource allocation based on dependencies:
- preprocess resources are allocated at workflow start
- train resources are allocated when preprocess completes
- evaluate resources are allocated when train completes
This prevents wasting allocation time on resources that aren’t needed yet.
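As a rough worked example, assuming each job runs for its full requested runtime: had all three allocations been requested at workflow start, the train allocation (1 GPU, 8 CPUs) would sit idle during preprocess's 30 minutes and the evaluate allocation (32 CPUs) would sit idle for roughly 1.5 hours (preprocess plus train), wasting about half a GPU-hour and some 50 CPU-hours of allocation.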
Preview Before Submitting
For production workflows, preview the generated Slurm configuration first:
torc slurm generate --account <your-account> workflow.yaml
This shows what schedulers and actions Torc will create without submitting anything.
Next Steps
- Slurm Workflows — How Torc manages Slurm
- Resource Requirements — All resource options
- HPC Profiles — Managing HPC configurations
- Working with Slurm — Advanced Slurm configuration
- Debugging Slurm Workflows — Troubleshooting