
Deep Dive into Experiment Tracking

# Install dependencies if running in Google Colab
try:
    import google.colab
    !pip install mlflow scikit-learn pandas matplotlib
except ImportError:
    pass


Tracking is more than just logging a few accuracy scores. In this lesson, we cover Autologging, Artifacts, and Nested Runs for complex experiments.

1. The Power of Autologging

Writing mlflow.log_param for every variable is tedious. MLflow’s autolog() feature automatically captures parameters, metrics, and models for popular libraries (Scikit-Learn, TensorFlow, PyTorch, XGBoost, etc.).

import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Enable autologging for scikit-learn
mlflow.sklearn.autolog()

db = load_iris()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)

with mlflow.start_run(run_name="Autologging_Example"):
    clf = RandomForestClassifier(n_estimators=100, max_depth=5)
    clf.fit(X_train, y_train)
    # No need to log metrics manually! Autolog handles it.

2. Manual Logging: Artifacts and Dataframes

Sometimes you need to log custom items like feature importance plots or CSV summaries.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

with mlflow.start_run(run_name="Custom_Artifacts"):
    # 1. Log a simple text file
    with open("features.txt", "w") as f:
        f.write("sepal length, sepal width, petal length, petal width")
    mlflow.log_artifact("features.txt")
    
    # 2. Log a Matplotlib figure directly (log_figure also accepts Plotly figures)
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [10, 20, 30])
    ax.set_title("Training Progress")
    mlflow.log_figure(fig, "plots/progress_chart.png")
    
    # 3. Log a Dictionary as JSON
    config = {"batch_size": 32, "optimizer": "adam", "layers": [64, 32]}
    mlflow.log_dict(config, "configs/hyperparams.json")

3. Organizing with Nested Runs

When doing Hyperparameter Tuning, you don’t want 100 individual runs cluttering your UI. You want one parent run with many children.

def train_model(n_estimators):
    with mlflow.start_run(run_name=f"Estimators_{n_estimators}", nested=True):
        # ... training logic ...
        mlflow.log_param("n_estimators", n_estimators)
        mlflow.log_metric("score", np.random.random())

with mlflow.start_run(run_name="Grid_Search_Parent"):
    for n in [10, 50, 100]:
        train_model(n)

4. Best Practice: Tags

Tags are searchable metadata. Use them to label runs with ‘production_candidate’, ‘baseline’, or the name of the data scientist.

# Tags attach to the currently active run (or start one implicitly if none exists)
mlflow.set_tag("release.version", "2.1.0")
mlflow.set_tag("model_type", "ensemble")