# How to Select Compute Nodes
Selecting the right compute nodes is critical for Spark cluster performance, especially for jobs that perform large shuffles (joins, aggregations, sorting).
## Storage Speed Requirements
Spark shuffle operations write intermediate data to local storage. If this storage is slow, shuffles become a bottleneck. The minimum requirement is approximately 500 MB/s sequential write speed. Drives that can deliver 2 GB/s are ideal.
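To gauge whether a candidate path meets the ~500 MB/s target, a quick sequential-write test with `dd` is usually enough. This is a rough sketch; the `/tmp/shuffle_bench` path and 256 MB size are arbitrary example choices:

```shell
# Write 256 MB sequentially and report the throughput.
# conv=fdatasync flushes data to the device before dd reports a rate,
# so the number reflects the storage, not the page cache.
dd if=/dev/zero of=/tmp/shuffle_bench bs=1M count=256 conv=fdatasync
rm -f /tmp/shuffle_bench
```

Point it at the directory you intend to use for shuffle storage; tmpfs paths such as `/dev/shm` will report memory speeds rather than disk speeds.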
## Storage Options (Best to Worst)

| Storage Type | Speed | Capacity | Notes |
|---|---|---|---|
| NVMe SSD | 2-7 GB/s | 1-10 TB | Best choice for shuffle-heavy workloads |
| RAM disk (`/dev/shm`) | 10+ GB/s | Limited by RAM | Excellent speed but limited capacity |
| HPC shared filesystem (Lustre, GPFS) | 500 MB/s - 2 GB/s | Unlimited | Shared bandwidth; slower than local SSD |
| Spinning disk (HDD) | 100-200 MB/s | Large | Too slow for shuffle-heavy workloads |
## Recommendations by Workload
### Shuffle-Heavy Workloads (Joins, Group-Bys)

- Prefer nodes with local NVMe storage
- Configure sparkctl to use the NVMe path for shuffle storage
- Allocate more nodes if shuffle data exceeds local SSD capacity
### Read-Heavy Workloads (Scans, Filters)

- Storage speed is less critical
- Focus on memory capacity and network bandwidth
- Shared filesystem storage is often acceptable
### Small Data / Prototyping

- RAM disk (`/dev/shm`) works well
- Many HPC compute nodes have `/dev/shm` available (typically half of RAM)
- Fastest option but limited by memory
## Configuring Shuffle Storage Location
On systems where sparkctl knows how to identify local storage (such as NREL's Kestrel cluster), tell it to use that storage for shuffle writes:

```shell
$ sparkctl configure --local-storage
```
Or, for a RAM disk (or another custom location accessible on each compute node):

```shell
$ sparkctl configure --spark-scratch /dev/shm
```
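Before launching a cluster, it can be worth verifying that the scratch path exists and is writable on a compute node. A minimal sketch, assuming the same `/dev/shm` path as the example above:

```shell
# Hypothetical sanity check for the shuffle scratch location
scratch=/dev/shm
if [ -d "$scratch" ] && [ -w "$scratch" ]; then
    echo "scratch OK: $scratch"
else
    echo "scratch missing or not writable: $scratch" >&2
    exit 1
fi
```

Run this inside an interactive job, since login nodes may be configured differently from compute nodes.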
## Capacity Planning
If your shuffle data exceeds local storage capacity, you have two options:
1. **Add more nodes**: distributes shuffle data across more local SSDs.
2. **Use a shared filesystem**: configure `--spark-scratch` to point to Lustre/GPFS (slower but unlimited capacity).
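For the first option, the node count is a back-of-the-envelope ceiling division of total shuffle size by per-node local capacity. A sketch, where the 12 TB shuffle estimate and 1.6 TB of per-node NVMe are made-up example numbers:

```shell
# Estimate how many nodes are needed to hold shuffle data on local SSDs.
# shuffle_tb and per_node_tb are hypothetical example values.
shuffle_tb=12
per_node_tb=1.6
awk -v s="$shuffle_tb" -v p="$per_node_tb" \
    'BEGIN { n = s / p; printf "%d\n", (n == int(n)) ? n : int(n) + 1 }'
```

With these numbers the estimate is 8 nodes; leave some headroom, since Spark also needs local space for spill files.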
## Checking Node Storage
On most HPCs, you can check available local storage during an interactive session:
```shell
$ lsblk -d -o NAME,SIZE,TYPE   # list local block devices (e.g., NVMe drives)
$ df -h /dev/shm               # capacity of the RAM-backed tmpfs
```
Consult your HPC documentation for node-specific storage configurations.
## Network Considerations

Shuffle operations also transfer data between nodes. Ensure that your nodes:

- Have a high-bandwidth, low-latency interconnect
- Sit in the same rack or network segment, if possible
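There is no single portable command for checking interconnect capability, but on Linux nodes you can at least list the interfaces and the link speed each driver reports. A rough sketch; virtual interfaces such as `lo` report no speed:

```shell
# Report each network interface and its link speed in Mb/s,
# as exposed by the driver under /sys/class/net.
for iface in /sys/class/net/*; do
    name=$(basename "$iface")
    speed=$(cat "$iface/speed" 2>/dev/null || echo "n/a")
    echo "$name: $speed Mb/s"
done
```

On systems with InfiniBand, `ibstat` (if installed) gives more detail about the fabric.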