16 · Inspect the Training Results


When trainer.fit() finishes, it returns a Result object.

This object contains:

  • Final metrics → the most recent values reported from the training loop (e.g., loss at the last epoch), accessible with result.metrics.

  • Checkpoint → a reference to the latest saved checkpoint, including its path in cluster storage (accessible with result.checkpoint).

  • Metrics dataframe → a history of all reported metrics across epochs (accessible with result.metrics_dataframe).

  • Best checkpoints → the checkpoints Ray Train retained, each paired with the metrics reported when it was saved (accessible with result.best_checkpoints).

In the output of the cell below, you can see:

  • The final reported loss at epoch 1.

  • The location where checkpoints are stored (/mnt/cluster_storage/training/distributed-mnist-resnet18/...).

  • A list of best checkpoints with their corresponding metrics.

This makes it easy to both analyze training performance and restore the trained model later for inference.

# 16. Show the training results  

result  # contains metrics, checkpoints, and run history
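The fields listed above can also be read programmatically instead of only inspecting the printed Result. The following is a minimal sketch, assuming the object exposes metrics, checkpoint (with a path attribute), and best_checkpoints as a list of (checkpoint, metrics) pairs. The helper name summarize_result, the stand-in object, and all values in it are hypothetical placeholders, not output from this run.

```python
from types import SimpleNamespace


def summarize_result(result, metric="loss"):
    """Summarize a Result-like object; a lower `metric` is treated as better."""
    summary = {
        "final_metrics": dict(result.metrics),
        "latest_checkpoint_path": result.checkpoint.path,
        "best_checkpoint_path": None,
        "best_metric_value": None,
    }
    # best_checkpoints is assumed to be a list of (checkpoint, metrics_dict) pairs.
    pairs = result.best_checkpoints or []
    if pairs:
        best_ckpt, best_metrics = min(pairs, key=lambda pair: pair[1][metric])
        summary["best_checkpoint_path"] = best_ckpt.path
        summary["best_metric_value"] = best_metrics[metric]
    return summary


# Stand-in for a real Result, so the sketch runs without a cluster.
# Paths and loss values are made-up placeholders.
ckpt_0 = SimpleNamespace(path="/tmp/example/checkpoint_0")
ckpt_1 = SimpleNamespace(path="/tmp/example/checkpoint_1")
demo_result = SimpleNamespace(
    metrics={"loss": 0.25, "epoch": 1},
    checkpoint=ckpt_1,
    best_checkpoints=[(ckpt_0, {"loss": 0.5}), (ckpt_1, {"loss": 0.25})],
)
print(summarize_result(demo_result))
```

With a real run, passing the object returned by trainer.fit() in place of demo_result should work the same way, provided the training loop reports a metric named loss.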

17 · View Metrics as a DataFrame

The Result object also includes a metrics_dataframe, which stores the full history of metrics reported during training.

  • Each row corresponds to one reporting step (here, each epoch).

  • The columns show the metrics you logged in the training loop (e.g., loss, epoch).

  • This makes it easy to plot learning curves or further analyze training progress.

In the example below, you can see the training loss steadily decreasing across two epochs.

# 17. Display the full metrics history as a pandas DataFrame

result.metrics_dataframe
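Beyond eyeballing the table, the metric history can be checked in code. The sketch below is one possible approach, assuming each row carries a loss column. The helper names metric_deltas and is_decreasing are hypothetical, and the demo rows are placeholder values rather than results from this run. It works on plain row dictionaries so it runs without pandas.

```python
def metric_deltas(records, metric="loss"):
    """Return the change in `metric` between consecutive reporting steps."""
    values = [row[metric] for row in records]
    return [later - earlier for earlier, later in zip(values, values[1:])]


def is_decreasing(records, metric="loss"):
    """True if `metric` drops at every step (trivially True for a single row)."""
    return all(delta < 0 for delta in metric_deltas(records, metric))


# Placeholder rows standing in for the real metrics history.
demo_records = [{"epoch": 0, "loss": 0.5}, {"epoch": 1, "loss": 0.25}]
print(metric_deltas(demo_records), is_decreasing(demo_records))
```

To apply it to the real history, convert the DataFrame to row dictionaries first, for example metric_deltas(result.metrics_dataframe.to_dict("records")).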

To learn more, see the Ray Train documentation on inspecting training results.