
Running Multiple Sub-Jobs with One Job Script#

If your workload consists of serial or modestly parallel programs, you can run multiple instances of your program at the same time using different processor cores on a single node. This makes better use of your allocation by putting cores to work that would otherwise sit idle.

Example#

For illustration, we use a simple C code to calculate pi. The source code and instructions for building that program are provided below:

Sample Program#

Copy and paste the following into a terminal window that's connected to the cluster. The here-document command cat << eof > pi.c streams everything up to the closing eof line into a file called pi.c.

cat << eof > pi.c
#include <stdio.h>

// pi.c: A sample C code calculating pi

int main(void) {
  double x, h, sum = 0;
  int i, N;

  printf("Input number of iterations: ");
  scanf("%d", &N);
  h = 1.0 / (double) N;

  for (i = 0; i < N; i++) {
    x = h * ((double) i + 0.5);
    sum += 4.0 * h / (1.0 + x * x);
  }

  printf("\nN=%d, PI=%.15f\n", N, sum);
  return 0;
}

eof
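A side note on the here-document: because the delimiter eof is unquoted, the shell would expand any $variables appearing in the pasted text. pi.c contains no $, so the form above works as-is, but quoting the delimiter is the safer habit. A minimal sketch:

```shell
# Quoting the delimiter ('eof') tells the shell to copy the here-document
# verbatim, without expanding $variables inside it.
cat << 'eof' > demo.txt
the shell leaves $HOME and $N untouched here
eof
cat demo.txt
```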

Compile the Code#

This example uses the Intel C compiler. Load the module, compile pi.c, and run the resulting executable (it will prompt for the number of iterations):

$ module purge
$ module load intel-mpi
$ icc -O2 pi.c -o pi_test
$ ./pi_test

A sample batch script that runs 8 copies of the pi_test program on a node with 24 processor cores is given below. The script creates 8 subdirectories, launches one copy in each (in the background), and waits for all 8 to complete before finishing.

Copy and paste the following into a text file#

Place that batch file into one of your directories on the cluster, alongside the pi_test executable. Make sure to change the account to a project handle you belong to.

#!/bin/bash
## Required Parameters   ##############################################
#SBATCH --time 10:00               # WALLTIME limit of 10 minutes

## Note: a leading double ## makes Slurm ignore an #SBATCH directive
#SBATCH -A <handle>                # Account (replace with your project handle)

#SBATCH -n 8                       # ask for 8 tasks   
#SBATCH -N 1                       # ask for 1 node
## Optional Parameters   ##############################################
#SBATCH --job-name wait_test       # name to display in queue
#SBATCH --output std.out
#SBATCH --error std.err

JOBNAME=$SLURM_JOB_NAME            # re-use the job-name specified above

# Run 1 job per task
N_JOB=$SLURM_NTASKS                # create as many jobs as tasks

for((i=1;i<=$N_JOB;i++))
do
  mkdir $JOBNAME.run$i             # Make a subdirectory for each job
  cd $JOBNAME.run$i || exit 1      # Go to the job directory
  echo 10*10^$i | bc > input       # Make input files
  time ../pi_test < input > log &  # Run your executable, note the "&"
  cd ..
done

# Wait for all background jobs to finish
wait

echo
echo "All done. Checking results:"
grep "PI" $JOBNAME.*/log
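The heart of the script is the background/wait pattern: each sub-job is launched with a trailing &, so the loop does not block, and wait then pauses the script until every background process has exited. In isolation the pattern looks like this (a minimal sketch, with sleep standing in for pi_test):

```shell
# Launch three sub-jobs concurrently, then wait for all of them.
for i in 1 2 3; do
  sleep $i &            # the trailing & puts the command in the background
done
wait                    # blocks until every background job has exited
echo "all sub-jobs complete"
```

Without the trailing &, the sub-jobs would run one after another; without wait, the script (and therefore the Slurm job) could end while sub-jobs are still running.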

Submit the Batch Script#

Use the following Slurm sbatch command to submit the script. Once the job completes, check std.out and the per-job log files to confirm the results.

$ sbatch -A <project-handle> <batch_file>