
Using TensorBoard for Visualization


TensorBoard is a suite of visualization tools that makes it easier to understand and debug deep learning programs. For example, it allows viewing the model graph, plotting various scalar values as the training progresses, and visualizing the embeddings.

The CNTK ProgressPrinter class in Python now supports output in the native TensorBoard format, enabling rich visualization capabilities for CNTK jobs. At present, ProgressPrinter can be used to (a minimal sketch follows this list):

  • Record model graph.
  • Record arbitrary scalar values during training.
  • Automatically record the values of a loss function and error rate during training.
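
Putting these together, here is a minimal sketch of the overall flow. It assumes a CNTK model (my_model) and a Trainer (trainer) have already been created; each step is described in detail below.

# Minimal sketch; my_model, trainer and the training loop itself are assumed
# to exist already. In the CNTK 2.0 betas, ProgressPrinter lives in cntk.utils.
from cntk.utils import ProgressPrinter

# Passing model=my_model records the model graph for later visualization.
progress_printer = ProgressPrinter(freq=10, tensorboard_log_dir='log',
                                   model=my_model)

# After each trainer.train_minibatch(...) call: record the loss and metric.
progress_printer.update_with_trainer(trainer, with_metric=True)

# At any point: record an arbitrary scalar value (name, value, sample count).
progress_printer.update_value('my_scalar', 0.5, 100)

# After training: flush outstanding records and close the output files.
progress_printer.end_progress_print()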

[Figure: CNTK model graph as displayed in TensorBoard.]

[Figure: Loss and error rate logged from CNTK and displayed in TensorBoard.]

The [Examples/Tensorboard/SimpleMNIST.py](https://github.com/Microsoft/CNTK/blob/master/Examples/Tensorboard/SimpleMNIST.py) script provides an example of how to generate output in the TensorBoard format.

First, you need to create an instance of the ProgressPrinter class. The following constructor arguments are relevant for TensorBoard output:

  • tensorboard_log_dir – a directory where the output files will be created.
  • freq – how frequently to log to the output files. E.g. a value of 2 will cause every second call to the update method to write to disk.
  • model – a CNTK model to visualize.

For example, the line below instantiates a ProgressPrinter that will create files in the 'log' directory and write to disk on every 10th call to update. This ProgressPrinter will also persist my_model's graph, so that it can be visualized later.

progress_printer = ProgressPrinter(freq=10, tensorboard_log_dir='log', model=my_model)
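
If you only want to log scalar values and do not need the graph, the model argument can presumably be omitted, since it is optional in the constructor:

progress_printer = ProgressPrinter(freq=10, tensorboard_log_dir='log')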

In your training loop, you can then call ProgressPrinter's update_with_trainer() method after training on each minibatch. This will automatically record the average values of the loss function and error rate as the training progresses. Similarly, you can call the epoch_summary() method after each epoch to record the average loss and error over that epoch. Finally, you can use the update_value() method to record any arbitrary scalar value that you want to plot in TensorBoard.

After the end of your training loop, you should call the end_progress_print() method to make sure that any outstanding records are immediately persisted to disk and any open files are closed.

Your training loop could look something like this:

from cntk.ops import reduce_mean

t = 0  # total number of samples processed so far
for epoch in range(max_epochs):
    epoch_end = (epoch+1) * epoch_size
    while t < epoch_end:
        data = reader.next_minibatch(min(minibatch_size, epoch_end-t),
                                     input_map=input_map)
        trainer.train_minibatch(data)
        t += trainer.previous_minibatch_sample_count
        progress_printer.update_with_trainer(trainer, with_metric=True)

        # Log the mean of each parameter tensor to confirm that the parameters
        # are indeed changing. Don't do this too often, so as not to spend too
        # much time computing the means.
        if t % 10000 == 0:
            for p in my_model.parameters:
                progress_printer.update_value(p.uid + "_mean",
                                              reduce_mean(p).eval(), t)

    # Record the per-epoch averages of the loss and metric.
    progress_printer.epoch_summary(with_metric=True)

progress_printer.end_progress_print()

TensorBoard is not part of the CNTK package and needs to be installed separately.
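
Depending on your environment and TensorBoard version, it ships either as a standalone pip package or bundled with TensorFlow, so one of the following commands should install it:

    pip install tensorboard
    pip install tensorflow

Once TensorBoard is installed and your training job is started, you can launch TensorBoard to monitor its progress by running the following command: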

    tensorboard --logdir=log

(assuming the command is run from the script's working directory) and navigate to http://localhost:6006/ in your favorite web browser.
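
The log directory can also be given as an absolute path, and TensorBoard's standard --port option lets you pick a different port if 6006 is already taken (the path below is a placeholder):

    tensorboard --logdir=/path/to/log --port=8008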
