
# Eagle Job Partitions and Scheduling Policies

Learn about job partitions and policies for scheduling jobs on Eagle.

## Partitions

Eagle nodes are associated with one or more partitions. Each partition is associated with one or more job characteristics, which include run time, per-node memory requirements, per-node local scratch disk requirements, and whether graphics processing units (GPUs) are needed.
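As a quick reference, the partitions and their configured limits can also be inspected from a login node with `sinfo`; the format string below is only an illustrative choice of output columns:

```bash
# Show partition name, time limit, node count, per-node memory (MB), and node features
sinfo -o "%12P %10l %6D %8m %f"
```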

Jobs will be automatically routed to the appropriate partitions by Slurm based on node quantity, walltime, hardware features, and other aspects specified in the submission. Jobs will have access to the largest number of nodes, and thus the shortest wait times, if a partition is not specified during job submission.
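For example, a submission that specifies only its resource requirements (no `--partition` flag) is routed automatically; the allocation handle and script name below are placeholders:

```bash
# Slurm routes this job based on the requested walltime and node count;
# <allocation> and job.sh are placeholders for your own account and batch script.
sbatch --account=<allocation> --time=4:00:00 --nodes=2 job.sh
```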

The following table summarizes the partitions on Eagle.

| Partition Name | Description | Limits | Placement Condition |
| -------------- | ----------- | ------ | ------------------- |
| debug | Nodes dedicated to developing and troubleshooting jobs. Debug nodes with each of the non-standard hardware configurations are available. The node-type distribution is:<br>- 4 GPU nodes<br>- 2 Bigmem nodes<br>- 7 standard nodes<br>- 13 total nodes | 1 job with a max of 2 nodes per user<br>01:00:00 max walltime | -p debug<br>or<br>--partition=debug |
| short | Nodes that prefer jobs with walltimes <= 4 hours | No partition limit.<br>No limit per user. | --time <= 4:00:00<br>--mem <= 85248 (1800 nodes)<br>--mem <= 180224 (720 nodes) |
| standard | Nodes that prefer jobs with walltimes <= 2 days | 2100 nodes total<br>1050 nodes per user | --time <= 2-00<br>--mem <= 85248 (1800 nodes)<br>--mem <= 180224 (720 nodes) |
| long | Nodes that prefer jobs with walltimes > 2 days<br>Maximum walltime of any job is 10 days | 525 nodes total<br>262 nodes per user | --time <= 10-00<br>--mem <= 85248 (1800 nodes)<br>--mem <= 180224 (720 nodes) |
| bigmem | Nodes that have 768 GB of RAM | 90 nodes total<br>45 nodes per user | --mem > 180224 |
| bigscratch | Nodes that each have larger /tmp/scratch mounts (24 TB SSD) for per-node large-data tasks | 20 nodes total<br>10 nodes per user | --tmp > 1500000 |
| gpu | Nodes with dual NVIDIA Tesla V100 PCIe 16 GB Computational Accelerators for GPU-based software | 20 nodes total<br>10 nodes per user<br>2 GPUs per node | --gres=gpu:1 (1 per node)<br>--gres=gpu:2 (2 per node)<br>--time <= 2 days |
| gpul | Nodes with dual NVIDIA Tesla V100 PCIe 16 GB Computational Accelerators for GPU-based software | 8 nodes total<br>2 nodes per user<br>2 GPUs per node | --gres=gpu:1 (1 per node)<br>--gres=gpu:2 (2 per node)<br>--time > 2 days |

Use the options listed above with the srun, sbatch, or salloc command, or in your job script, to specify the resources your job requires. More details regarding these commands and how to write an sbatch script are available in the Slurm Job Scheduler section.
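As an illustrative sketch (not an official template), the batch script below requests one GPU node for two hours; the allocation handle, job name, and application command are placeholders, and the directives correspond to the placement conditions in the table above:

```bash
#!/bin/bash
#SBATCH --account=<allocation>   # placeholder: your project allocation handle
#SBATCH --job-name=gpu_example   # placeholder job name
#SBATCH --nodes=1
#SBATCH --time=02:00:00          # walltime <= 2 days, so the job can route to the gpu partition
#SBATCH --gres=gpu:2             # request both V100 GPUs on the node

srun ./my_gpu_application        # placeholder for your GPU-enabled executable
```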

## Job Scheduling Policies

The system configuration page lists the four categories of Eagle nodes based on their hardware features. No single user can have jobs running on more than half of the nodes in any hardware category. For example, the maximum number of data analysis and visualization (DAV) nodes that a single user's jobs can use is 25.

Also learn how jobs are prioritized.