How Tos

Contents:

- Getting Started
  - Installation
  - Deploy sparkctl in an HPC environment
    - URLs
    - sparkctl configuration file
- Configuration
  - How to Select Compute Nodes
    - Storage Speed Requirements
    - Storage Options (Best to Worst)
    - Recommendations by Workload
    - Configuring Shuffle Storage Location
    - Capacity Planning
    - Checking Node Storage
    - Network Considerations
  - Heterogeneous Slurm jobs
    - Interactive job
    - Batch job
  - How to handle compute node failures
  - How to use a custom spark-default.conf file
  - How to Set a Custom Spark Log Level
    - Reducing Log Verbosity
    - Available Log Levels
    - Recommendations
    - Example Output Comparison
    - Per-Logger Configuration
- Execution
  - Start a Spark Cluster
  - How to run Spark jobs in Python
    - Interactive session with pyspark
    - Jupyter notebook
    - Script execution with spark-submit
  - How to monitor Spark resource utilization
  - Managed execution
- Applications
  - How to configure a Hive metastore
  - Visualize Data with Tableau
    - Architecture Concepts
    - Compute Node Instructions
    - Client-side Instructions
    - Persistent metastore and spark-warehouse
- Debugging
  - Spark web UI
  - Log files
  - Spark shuffle partitions
  - Slow local storage
  - Executors are spilling to disk
  - Data skew
  - Performance monitoring
    - Automated resource monitoring
  - Spark tuning resources