# Start a Spark Cluster

This page assumes that you have allocated compute nodes via Slurm.

1. Activate the Python environment that contains sparkctl.

   ```console
   $ module load python
   $ source ~/python-envs/sparkctl/bin/activate
   ```

2. Configure the Spark cluster. The sparkctl code will detect the compute nodes based on Slurm environment variables.

   ```console
   $ sparkctl configure
   ```

3. Optionally, inspect the Spark configuration in `./conf`.

4. Start the cluster.

   ```console
   $ sparkctl start
   ```

5. Set the `SPARK_CONF_DIR` environment variable. This ensures that your application uses the Spark settings created in step 2. Instructions will be printed to the console. By default, the value is

   ```console
   $ export SPARK_CONF_DIR=$(pwd)/conf
   ```

6. Set the `JAVA_HOME` environment variable to the same Java installation used by Spark. The path should be in your `~/.sparkctl.toml` configuration file.

   ```console
   $ export JAVA_HOME=/datasets/images/apache_spark/jdk-21.0.7
   ```

7. Run your application (see the example sketch after this list).

8. Shut down the cluster.

   ```console
   $ sparkctl stop
   ```
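
For step 7, here is a minimal sketch of an application you could run against the cluster. It assumes that `pyspark` is installed in the same Python environment as sparkctl, that `SPARK_CONF_DIR` and `JAVA_HOME` are exported as shown above, and that the generated `conf/spark-defaults.conf` sets the cluster's master URL; the file name, application name, and example computation are illustrative only.

```python
"""Illustrative smoke-test application for step 7 (a sketch, not part of sparkctl).

Assumes pyspark is available in the active environment and that SPARK_CONF_DIR
points at the configuration generated by `sparkctl configure`, so the session
builder picks up the cluster's master URL from spark-defaults.conf.
"""
from pyspark.sql import SparkSession


def main() -> None:
    # No hard-coded master URL: getOrCreate() reads spark-defaults.conf
    # from the directory named by SPARK_CONF_DIR.
    spark = SparkSession.builder.appName("sparkctl-smoke-test").getOrCreate()

    # Trivial job to confirm that the executors are reachable.
    df = spark.range(1_000_000)
    print(f"Row count: {df.count()}")

    spark.stop()


if __name__ == "__main__":
    main()
```

Saved as, say, `my_app.py`, this can be launched with `python my_app.py`; if the Spark binaries are on your `PATH`, `spark-submit my_app.py` should work as well, since both read the same `SPARK_CONF_DIR`. Remember to run `sparkctl stop` afterwards (step 8).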