Validating ML results using Tensorboard#
Tensorboard provides visualization and tooling needed for machine learning, deep learning, and reinforcement learning experimentation:
- Tracking and visualizing metrics such as loss and accuracy.
- Visualizing the model graph (ops and layers).
- Viewing histograms of weights, biases, or other tensors as they change over time.
- Projecting embeddings to a lower dimensional space.
- Displaying images, text, and audio data.
- Profiling TensorFlow programs.
For RL it is useful to visualize metrics such as:
- Mean, min, and max reward values.
- Episodes/iteration.
- Estimated Q-values.
- Algorithm-specific metrics (e.g. entropy for PPO).
To visualize results from Tensorboard, first cd to the directory where your results reside. E.g., if you ran experiments using ray, then do the following:
cd ~/ray_results/
There are three main methods for activating Tensorboard:
- If you included Tensorboard installation in an Anaconda environment, simply activate it:
module purge conda activate <your_environment> - You can also install Tensorboard in userspace using
pip install:pip install tensorboard --user - Or, install using container images:
ml singularity-container singularity pull docker://tensorflow/tensorflow singularity run tensorflow_latest.sif
Then, initialize Tensorboard using a pre-specified port number of your choosing (e.g. 6006, 8008):
tensorboard --logdir=. --port 6006 --bind_all
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.5.0 at http://localhost:6006/ (Press CTRL+C to quit)
ssh -NfL 6006:localhost:6006 $USER@el1.hpc.nrel.gov
http://localhost:6006/) in a browser, where all the aforementioned plots will be shown.