ExaWind#
ExaWind is a suite of applications that simulate wind turbines and wind farms on accelerated systems. The applications include AMR-Wind, Nalu-Wind, TIOGA, and OpenFAST. AMR-Wind is a massively parallel, block-structured adaptive-mesh, incompressible flow solver for wind turbine and wind farm simulations. Nalu-Wind is a generalized, unstructured, massively parallel, incompressible flow solver for wind turbine and wind farm simulations. TIOGA is a library for overset grid assembly on parallel distributed systems. OpenFAST is a multi-physics, multi-fidelity tool for simulating the coupled dynamic response of wind turbines.
Building ExaWind#
We recommend installing ExaWind packages, either coupled or standalone, via ExaWind-manager. While ExaWind-manager is recommended, CMake can be used as an alternative for installing the necessary packages. Instructions for building ExaWind packages with ExaWind-manager and AMR-Wind/OpenFAST with CMake are given below.
Building ExaWind using ExaWind-manager on Kestrel-CPU#
The following examples demonstrate how to use ExaWind-manager to build common ExaWind applications on Kestrel. The build requires a compute node with at least 36 cores and the Intel or GNU compilers. To avoid space and speed issues, clone ExaWind-manager to scratch, not your home directory; then activate it, create and activate a Spack environment, and finally concretize and build. When creating a Spack environment, you can enable (+) or disable (~) variants and adjust versions (@) for the main application and its dependencies (^). The first example builds ExaWind from the master branch without GPU functionality, with AMR-Wind and Nalu-Wind as dependencies. The second example builds AMR-Wind coupled to OpenFAST, with OpenFAST taken from its develop branch. The final two examples build the released AMR-Wind and Nalu-Wind standalone.
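As a quick reference, the sketch below annotates the parts of a Spack spec used with ExaWind-manager; the names and variants are illustrative (taken from the examples that follow), and available variants can differ between package versions.
# Anatomy of a Spack spec (illustrative):
#   exawind@master        -> package and version/branch (@)
#   ~cuda                 -> disable a variant (~)
#   ^amr-wind@main+hypre  -> constrain a dependency (^) and enable its variants (+)
#   %oneapi               -> select the compiler (%)
$ spack manager create-env --name my-env --spec 'exawind@master~cuda ^amr-wind@main+hypre %oneapi'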
Building ExaWind
$ salloc --time=01:00:00 --account=<project account> --partition=shared --nodes=1 --ntasks-per-node=36
# Intel
$ module load PrgEnv-intel
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name exawind-cpu --spec 'exawind@master~amr_wind_gpu~cuda~gpu-aware-mpi~nalu_wind_gpu ^amr-wind@main~cuda~gpu-aware-mpi+hypre+mpi+netcdf+shared ^nalu-wind@master~cuda~fftw~gpu-aware-mpi+hypre+shared ^tioga@develop %oneapi'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/exawind-cpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building Coupled AMR-Wind and OpenFAST
$ salloc --time=01:00:00 --account=<project account> --partition=shared --nodes=1 --ntasks-per-node=52
# Intel
$ module load PrgEnv-intel
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name amrwind-openfast-cpu --spec 'amr-wind+hypre+netcdf+openfast ^openfast@develop+openmp+rosco %oneapi'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/amrwind-openfast-cpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building AMR-Wind
$ salloc --time=01:00:00 --account=<project account> --partition=shared --nodes=1 --ntasks-per-node=52
# Intel
$ module load PrgEnv-intel
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name amrwind-cpu --spec 'amr-wind+hypre+netcdf %oneapi'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/amrwind-cpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building Nalu-Wind
$ salloc --time=01:00:00 --account=<project account> --partition=shared --nodes=1 --ntasks-per-node=52
# Intel
$ module load PrgEnv-intel
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name naluwind-cpu --spec 'nalu-wind+hypre+netcdf %oneapi'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/naluwind-cpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building AMR-Wind and OpenFAST using CMake on Kestrel-CPU#
This section describes how to install AMR-Wind and the coupled AMR-Wind/OpenFAST using the provided CMake scripts. AMR-Wind can be installed by following the instructions here, and you can clone your desired version of AMR-Wind from here. Once cloned, cd into the AMR-Wind directory and create a build directory, as sketched below.
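A minimal setup sketch, assuming the public Exawind/amr-wind repository and a placeholder tag or branch of your choice:
# Clone AMR-Wind and create a build directory (replace <tag-or-branch> with your desired version)
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/amr-wind.git
$ cd amr-wind
$ git checkout <tag-or-branch>
$ mkdir build && cd build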
On a Kestrel CPU node, build AMR-Wind for CPUs by executing the following script from within the build directory:
Sample job script: Building AMR-Wind using cmake on Kestrel-CPU
#!/bin/bash
module purge
module load PrgEnv-intel
module load netcdf/4.9.2-intel-oneapi-mpi-intel
module load netlib-scalapack/2.2.0-gcc
export LD_LIBRARY_PATH=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel:$LD_LIBRARY_PATH
export LD_PRELOAD=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpi_intel.so.12:/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpifort_intel.so.12
export MPICH_VERSION_DISPLAY=1
export MPICH_ENV_DISPLAY=1
export MPICH_OFI_CXI_COUNTER_REPORT=2
export FI_MR_CACHE_MONITOR=memhooks
export FI_CXI_RX_MATCH_MODE=software
export MPICH_SMP_SINGLE_COPY_MODE=NONE
echo $LD_LIBRARY_PATH |tr ':' '\n'
module list
cmake .. \
-DCMAKE_C_COMPILER=mpicc \
-DCMAKE_CXX_COMPILER=mpicxx \
-DMPI_Fortran_COMPILER=mpifort \
-DCMAKE_Fortran_COMPILER=ifx \
-DAMR_WIND_ENABLE_CUDA:BOOL=OFF \
-DAMR_WIND_ENABLE_MPI:BOOL=ON \
-DAMR_WIND_ENABLE_OPENMP:BOOL=OFF \
-DAMR_WIND_TEST_WITH_FCOMPARE:BOOL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DAMR_WIND_ENABLE_NETCDF:BOOL=ON \
-DAMR_WIND_ENABLE_HYPRE:BOOL=OFF \
-DAMR_WIND_ENABLE_MASA:BOOL=OFF \
-DAMR_WIND_ENABLE_TESTS:BOOL=ON \
-DAMR_WIND_ENABLE_ALL_WARNINGS:BOOL=ON \
-DBUILD_SHARED_LIBS:BOOL=ON \
-DCMAKE_INSTALL_PREFIX:PATH=${PWD}/install
nice make -j32
make install
If coupling to OpenFAST is needed, an additional flag must be passed to cmake. A complete example is given below.
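The coupled build script assumes an existing OpenFAST installation at openfastpath. If you do not already have one, the sketch below shows one way to obtain it using only generic CMake options; consult the OpenFAST documentation for the full option set and recommended versions.
# Minimal OpenFAST build sketch (adjust the branch/tag and options to your needs)
$ git clone https://github.com/OpenFAST/openfast.git
$ cd openfast && mkdir build && cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=${PWD}/install
$ nice make -j32 && make install
# Use ${PWD}/install as openfastpath in the coupled script below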
Sample job script: Building AMR-Wind coupled to OpenFAST using cmake on Kestrel-CPU
#!/bin/bash
openfastpath=/full/path/to/your/openfast/build/install
module purge
module load PrgEnv-intel
module load netcdf/4.9.2-intel-oneapi-mpi-intel
module load netlib-scalapack/2.2.0-gcc
export LD_LIBRARY_PATH=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel:$LD_LIBRARY_PATH
export LD_PRELOAD=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpi_intel.so.12:/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpifort_intel.so.12
export MPICH_VERSION_DISPLAY=1
export MPICH_ENV_DISPLAY=1
export MPICH_OFI_CXI_COUNTER_REPORT=2
export FI_MR_CACHE_MONITOR=memhooks
export FI_CXI_RX_MATCH_MODE=software
export MPICH_SMP_SINGLE_COPY_MODE=NONE
echo $LD_LIBRARY_PATH |tr ':' '\n'
module list
cmake .. \
-DCMAKE_C_COMPILER=mpicc \
-DCMAKE_CXX_COMPILER=mpicxx \
-DMPI_Fortran_COMPILER=mpifort \
-DCMAKE_Fortran_COMPILER=ifx \
-DAMR_WIND_ENABLE_CUDA:BOOL=OFF \
-DAMR_WIND_ENABLE_MPI:BOOL=ON \
-DAMR_WIND_ENABLE_OPENMP:BOOL=OFF \
-DAMR_WIND_TEST_WITH_FCOMPARE:BOOL=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DAMR_WIND_ENABLE_NETCDF:BOOL=ON \
-DAMR_WIND_ENABLE_HYPRE:BOOL=OFF \
-DAMR_WIND_ENABLE_MASA:BOOL=OFF \
-DAMR_WIND_ENABLE_TESTS:BOOL=ON \
-DAMR_WIND_ENABLE_ALL_WARNINGS:BOOL=ON \
-DBUILD_SHARED_LIBS:BOOL=ON \
-DOpenFAST_ROOT:PATH=${openfastpath} \
-DCMAKE_INSTALL_PREFIX:PATH=${PWD}/install
nice make -j32
make install
Building ExaWind with ExaWind-manager on Kestrel-GPU#
Building ExaWind applications on GPUs follows a process similar to the CPU builds described above. The first example builds the released ExaWind GPU version together with its dependencies (AMR-Wind and Nalu-Wind), while the second and third examples build the released AMR-Wind and Nalu-Wind independently.
Building ExaWind on GPUs
$ salloc --time=02:00:00 --account=<project account> --partition=gpu-h100 --gres=gpu:h100:1 --nodes=1 --ntasks-per-node=52
# GNU
$ module load PrgEnv-gnu
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cuda
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name exawind-gpu --spec 'exawind+cuda+gpu-aware-mpi+amr_wind_gpu+nalu_wind_gpu cuda_arch=90 %gcc'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/exawind-gpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building AMR-Wind on GPUs
$ salloc --time=02:00:00 --account=<project account> --partition=gpu-h100 --gres=gpu:h100:1 --nodes=1 --ntasks-per-node=52
# GNU
$ module load PrgEnv-gnu
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cuda
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name amrwind-gpu --spec 'amr-wind+cuda+gpu-aware-mpi+hypre+netcdf+hdf5 cuda_arch=90 %gcc'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/amrwind-gpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building Nalu-Wind on GPUs
$ salloc --time=02:00:00 --account=<project account> --partition=gpu-h100 --gres=gpu:h100:1 --nodes=1 --ntasks-per-node=52
# GNU
$ module load PrgEnv-gnu
$ module load cray-mpich/8.1.28
$ module load cray-libsci/23.12.5
$ module load cuda
$ module load cray-python
# clone ExaWind-manager
$ cd /scratch/${USER}
$ git clone --recursive https://github.com/Exawind/exawind-manager.git
$ cd exawind-manager
# Activate exawind-manager
$ export EXAWIND_MANAGER=`pwd`
$ source ${EXAWIND_MANAGER}/start.sh && spack-start
# Create Spack environment and change the software versions if needed
$ mkdir environments
$ cd environments
$ spack manager create-env --name naluwind-gpu --spec 'nalu-wind+cuda+gpu-aware-mpi+hypre cuda_arch=90 %gcc'
# Activate the environment
$ spack env activate -d ${EXAWIND_MANAGER}/environments/naluwind-gpu
# concretize specs and dependencies
$ spack concretize -f
# Build software
$ spack -d install
Building AMR-Wind Using CMake on Kestrel-GPU#
On a Kestrel GPU node, build AMR-Wind for GPUs by executing the following script from within the build directory:
Sample job script: Building AMR-Wind using cmake on GPUs
#!/bin/bash
module purge
module load binutils
module load PrgEnv-nvhpc
module load cray-libsci/22.12.1.1
module load cmake/3.27.9
module load cray-python
module load netcdf-fortran/4.6.1-oneapi
module load craype-x86-genoa
module load craype-accel-nvidia90
export MPICH_GPU_SUPPORT_ENABLED=1
export CUDAFLAGS="-L/nopt/nrel/apps/gpu_stack/libraries-gcc/06-24/linux-rhel8-zen4/gcc-12.3.0/hdf5-1.14.3-zoremvtiklvvkbtr43olrq3x546pflxe/lib -I/nopt/nrel/apps/gpu_stack/libraries-gcc/06-24/linux-rhel8-zen4/gcc-12.3.0/hdf5-1.14.3-zoremvtiklvvkbtr43olrq3x546pflxe/include -lhdf5 -lhdf5_hl -I${MPICH_DIR}/include -L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_nvidia90} ${PE_MPICH_GTL_LIBS_nvidia90}"
export CXXFLAGS="-L/nopt/nrel/apps/gpu_stack/libraries-gcc/06-24/linux-rhel8-zen4/gcc-12.3.0/hdf5-1.14.3-zoremvtiklvvkbtr43olrq3x546pflxe/lib -I/nopt/nrel/apps/gpu_stack/libraries-gcc/06-24/linux-rhel8-zen4/gcc-12.3.0/hdf5-1.14.3-zoremvtiklvvkbtr43olrq3x546pflxe/include -lhdf5 -lhdf5_hl -I${MPICH_DIR}/include -L${MPICH_DIR}/lib -lmpi ${PE_MPICH_GTL_DIR_nvidia90} ${PE_MPICH_GTL_LIBS_nvidia90}"
module list
cmake .. \
-DAMR_WIND_ENABLE_CUDA=ON \
-DAMR_WIND_ENABLE_TINY_PROFILE:BOOL=ON \
-DAMReX_CUDA_ERROR_CAPTURE_THIS:BOOL=ON \
-DCMAKE_CUDA_SEPARABLE_COMPILATION:BOOL=ON \
-DCMAKE_CXX_COMPILER:STRING=CC \
-DCMAKE_C_COMPILER:STRING=cc \
-DMPI_CXX_COMPILER=/opt/cray/pe/mpich/8.1.28/ofi/nvidia/23.3/bin/mpicxx \
-DMPI_C_COMPILER=/opt/cray/pe/mpich/8.1.28/ofi/nvidia/23.3/bin/mpicc \
-DMPI_Fortran_COMPILER=/opt/cray/pe/mpich/8.1.28/ofi/nvidia/23.3/bin/mpifort \
-DAMReX_DIFFERENT_COMPILER=ON \
-DCMAKE_CUDA_ARCHITECTURES=90 \
-DAMR_WIND_ENABLE_OPENFAST:BOOL=OFF \
-DAMR_WIND_ENABLE_NETCDF:BOOL=ON \
-DAMR_WIND_ENABLE_HDF5:BOOL=ON \
-DAMR_WIND_ENABLE_MPI:BOOL=ON \
-DCMAKE_BUILD_TYPE=Release \
-DAMR_WIND_ENABLE_HYPRE:BOOL=OFF \
-DAMR_WIND_ENABLE_MASA:BOOL=OFF \
-DAMR_WIND_ENABLE_TESTS:BOOL=ON \
-DCMAKE_INSTALL_PREFIX:PATH=./install
make -j32 amr_wind
You should now have a working build of AMR-Wind. At runtime, make sure to load the same modules that were used to build, as discussed below.
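As a quick sanity check before submitting jobs, you can reload the build-time modules and confirm that the executable resolves all of its shared libraries; the module list below is a sketch mirroring the GPU build script above and should be adapted to whichever build you performed.
# Reload the build-time modules (GPU build example), then check the binary from the build directory
module purge
module load binutils PrgEnv-nvhpc cray-libsci/22.12.1.1 cmake/3.27.9 cray-python
module load netcdf-fortran/4.6.1-oneapi craype-x86-genoa craype-accel-nvidia90
ldd ./amr_wind | grep -i "not found" || echo "all shared libraries resolved"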
Running ExaWind#
Running ExaWind on Kestrel-CPU#
ExaWind applications running on more than 8 nodes perform better on the hbw ("high bandwidth") partition than on the short, standard, or long partitions. We strongly recommend submitting multi-node jobs to the hbw partition for the best performance and to save AUs. Our benchmark studies show ExaWind applications perform best on the hbw partition with 72 MPI ranks per node. The following example scripts illustrate these recommendations.
Note
Single-node jobs are not allowed on hbw; they should instead be submitted to short, standard, or long.
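For reference, a minimal single-node job header might look like the sketch below; the partition, time, and account values are placeholders.
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --partition=short        # single-node jobs: short, standard, or long (not hbw)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=96     # the stall-library scripts below use 96 ranks per node on these partitions
#SBATCH --time=1:00:00
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0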
Sample job script: Running ExaWind with the hbw partition
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=72
#SBATCH --time=1:00:00
#SBATCH --partition=hbw
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-intel
module load cray-python
module list
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/exawind-cpu
spack load exawind
export MPICH_OFI_NIC_POLICY=NUMA
# Adjust the ratio of total MPI ranks for AMR-Wind and Nalu-Wind as needed by a job
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) \
    --distribution=block:block --cpu_bind=rank_ldom exawind --awind $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES / 4)) \
    --nwind $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES * 3 / 4)) <input-name>.yaml
Sample job script: Running AMR-Wind using the hbw partition
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=72
#SBATCH --time=1:00:00
#SBATCH --partition=hbw
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-intel
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cray-python
module list
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/amrwind-cpu
spack load amr-wind
export MPICH_OFI_NIC_POLICY=NUMA
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom amr_wind <input-name>.inp
Sample job script: Running Nalu-Wind using the hbw partition
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=72
#SBATCH --time=1:00:00
#SBATCH --partition=hbw
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-intel
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cray-python
module list
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/naluwind-cpu
spack load nalu-wind
export MPICH_OFI_NIC_POLICY=NUMA
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom naluX <input-name>.yaml
Our benchmark studies suggest that using the stall library with the hbw partition can further improve ExaWind application performance. The stall library is also highly recommended for ExaWind applications running on 8 nodes or fewer, each with 96 cores, on the short, standard, or long partitions. Tuning the MPICH_OFI_CQ_STALL_USECS parameter is key to achieving the best performance. The following sample scripts demonstrate these recommendations.
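One practical way to tune MPICH_OFI_CQ_STALL_USECS is a short sweep over the suggested values on a representative case, comparing the resulting wall times. The sketch below assumes a hypothetical job script named run_case.sh that uses the exported value rather than hardcoding its own:
# Hypothetical sweep over the suggested stall values; run_case.sh is your own job script
for usecs in 1 3 6 9 12 16 20 24; do
    sbatch --job-name=stall_${usecs} --export=ALL,MPICH_OFI_CQ_STALL_USECS=${usecs} run_case.sh
done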
Sample job script: Running ExaWind using the stall library
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --partition=<partition-name> # hbw, short, standard or long
#SBATCH --nodes=<nodes> # >=16 nodes for hbw or <=8 nodes for short, standard or long
#SBATCH --ntasks-per-node=<cores> # 72 cores for hbw or 96 cores for short, standard or long
#SBATCH --time=1:00:00
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-intel
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cray-python
module list
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/exawind-cpu
spack load exawind
export LD_PRELOAD=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpi_intel.so.12:/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpifort_intel.so.12
export MPICH_OFI_CQ_STALL=1
# Find an optimal value from this list [1,3,6,9,12,16,20,24]
export MPICH_OFI_CQ_STALL_USECS=12
export MPICH_OFI_CQ_MIN_PPN_PER_NIC=26
export MPICH_OFI_NIC_POLICY=NUMA
# Adjust the ratio of total MPI ranks for AMR-Wind and Nalu-Wind as needed by a job
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) \
    --distribution=block:block --cpu_bind=rank_ldom exawind --awind $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES / 4)) \
    --nwind $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES * 3 / 4)) <input-name>.yaml
Sample job script: Running AMR-Wind using the stall library
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --partition=<partition-name> # hbw, short, standard or long
#SBATCH --nodes=<nodes> # >=16 nodes for hbw or <=8 nodes for short, standard or long
#SBATCH --ntasks-per-node=<cores> # 72 cores for hbw or 96 cores for short, standard or long
#SBATCH --time=1:00:00
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-intel
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cray-python
module list
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/amrwind-cpu
spack load amr-wind
export LD_PRELOAD=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpi_intel.so.12:/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpifort_intel.so.12
export MPICH_OFI_CQ_STALL=1
# Find an optimal value from this list [1,3,6,9,12,16,20,24]
export MPICH_OFI_CQ_STALL_USECS=12
export MPICH_OFI_CQ_MIN_PPN_PER_NIC=26
export MPICH_OFI_NIC_POLICY=NUMA
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom amr_wind <input-name>.inp
Sample job script: Running Nalu-Wind using the stall library
#!/bin/bash
#SBATCH --job-name=<job-name>
#SBATCH --partition=<partition-name> # hbw, short, standard or long
#SBATCH --nodes=<nodes> # >=16 nodes for hbw or <=8 nodes for short, standard or long
#SBATCH --ntasks-per-node=<cores> # 72 cores for hbw or 96 cores for short, standard or long
#SBATCH --time=1:00:00
#SBATCH --account=<account-name>
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-intel
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cray-python
module list
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/naluwind-cpu
spack load nalu-wind
export LD_PRELOAD=/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpi_intel.so.12:/nopt/nrel/apps/cray-mpich-stall/libs_mpich_nrel_intel/libmpifort_intel.so.12
export MPICH_OFI_CQ_STALL=1
# Find an optimal value from this list [1,3,6,9,12,16,20,24]
export MPICH_OFI_CQ_STALL_USECS=12
export MPICH_OFI_CQ_MIN_PPN_PER_NIC=26
export MPICH_OFI_NIC_POLICY=NUMA
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom naluX <input-name>.yaml
Running ExaWind on Kestrel-GPU#
Running ExaWind applications on GPUs generally yields the best performance. The following scripts illustrate how to submit jobs to the gpu-h100 partition.
Sample job script: Running ExaWind on GPU nodes
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --account=<user-account>
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus=h100:4
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-gnu
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cuda
module load cray-python
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/exawind-gpu
spack load exawind
export MPICH_OFI_NIC_POLICY=NUMA
# Adjust the ratio of total MPI ranks for AMR-Wind and Nalu-Wind as needed by a job
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom \
    exawind --nwind $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES * 3 / 4)) --awind $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES / 4)) <input-name>.yaml
wait
Sample job script: Running AMR-Wind on GPU nodes
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --account=<user-account> # Replace with your HPC account
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus=h100:4
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-gnu
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cuda
module load cray-python
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/amrwind-gpu
spack load amr-wind
export MPICH_OFI_NIC_POLICY=NUMA
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom amr_wind <input-name>.inp
Sample job script: Running Nalu-Wind on GPU nodes
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --account=<user-account>
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --gpus=h100:4
#SBATCH --exclusive
#SBATCH --mem=0
module load PrgEnv-gnu
module load cray-mpich/8.1.28
module load cray-libsci/23.12.5
module load cuda
module load cray-python
export EXAWIND_MANAGER=/scratch/${USER}/exawind-manager
source ${EXAWIND_MANAGER}/start.sh && spack-start
spack env activate -d ${EXAWIND_MANAGER}/environments/naluwind-gpu
spack load nalu-wind
export MPICH_OFI_NIC_POLICY=NUMA
srun -N $SLURM_JOB_NUM_NODES -n $(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES)) --distribution=block:block --cpu_bind=rank_ldom naluX <input-name>.yaml