Track and visualize machine learning experiments with Weights & Biases (W&B). Log metrics, hyperparameters, datasets, and model artifacts for ML training runs.
This skill guides you through integrating W&B experiment tracking into Python machine learning projects. It covers library installation, authentication, run initialization, metric and artifact logging, and integration with popular ML frameworks.
When a user requests W&B integration or experiment tracking:
1. **Verify the project context:**
- Check if this is a Python ML project
- Identify the ML framework being used (PyTorch, TensorFlow, scikit-learn, etc.)
- Locate training scripts or notebooks
2. **Install wandb:**
- Add `wandb` to requirements.txt or install with `pip install wandb`
- For conda environments, use `conda install -c conda-forge wandb`
3. **Set up authentication:**
- Inform the user they need a W&B account (https://wandb.ai/login)
- Add authentication code: `wandb.login()` or instruct user to run `wandb login` CLI command
- Suggest storing API key as environment variable `WANDB_API_KEY`
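The authentication step above can be sketched as follows; `"your-api-key-here"` is a placeholder, not a real key:

```python
import os

# Store the API key in the environment so wandb.login() and
# wandb.init() authenticate without prompting.
# "your-api-key-here" is a placeholder; copy the real key from
# https://wandb.ai/authorize (or set the variable in your shell).
os.environ.setdefault("WANDB_API_KEY", "your-api-key-here")

# With the variable set, authentication is non-interactive:
# import wandb
# wandb.login()
```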
4. **Initialize W&B run:**
- Add `wandb.init()` at the start of training code
- Set project name: `wandb.init(project="project-name")`
- Pass hyperparameters as config: `wandb.init(project="project-name", config={"lr": 0.001, "epochs": 10})`
- Use context manager for automatic cleanup: `with wandb.init(...) as run:`
5. **Log metrics and artifacts:**
- Add `wandb.log({"metric_name": value})` calls to log training metrics
- Log at appropriate intervals (per step, per epoch)
- For models: `wandb.log_artifact(model_path, type="model")`
- For datasets: `wandb.log_artifact(dataset_path, type="dataset")`
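For more control than the one-line `wandb.log_artifact(path, ...)` form, the calls above can be wrapped in a small helper using the `wandb.Artifact` API; the function name and the default artifact name `"model-weights"` are illustrative:

```python
def log_model_artifact(run, model_path, name="model-weights"):
    """Attach a model file to `run` as a versioned W&B artifact.

    `run` is an active run returned by wandb.init(); `name` is
    illustrative -- pick something meaningful per project.
    """
    import wandb  # imported locally so the sketch stands alone

    artifact = wandb.Artifact(name=name, type="model")
    artifact.add_file(model_path)   # stage the file for upload
    run.log_artifact(artifact)      # upload and version it
```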
6. **Framework-specific integration:**
- **PyTorch**: Log manually in the training loop (PyTorch Lightning users can pass `WandbLogger` to the `Trainer`)
- **TensorFlow/Keras**: Use `WandbCallback` in `model.fit()`
- **Hugging Face**: Set `report_to="wandb"` in TrainingArguments
- **scikit-learn**: Log manually after training
- Check W&B docs for framework-specific patterns
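For scikit-learn, where no callback hook exists, manual logging after training might look like the sketch below; the function name, metric key, and model choice are illustrative, and `run` is an active `wandb.init()` run:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def train_and_log(run, X_train, y_train, X_val, y_val):
    """Train a scikit-learn model and log its validation accuracy."""
    model = RandomForestClassifier(n_estimators=100)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_val, model.predict(X_val))
    run.log({"val/accuracy": acc})  # one-shot log after training
    return model
```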
7. **Add best practices:**
- Name runs descriptively or let W&B auto-generate
- Group related runs with `wandb.init(group="experiment-name")`
- Add tags: `wandb.init(tags=["baseline", "cnn"])`
- System metrics (CPU, GPU, memory) are logged automatically with default settings
- Call `wandb.finish()` or use context manager to mark run complete
8. **Handle common scenarios:**
- **Sweeps**: For hyperparameter tuning, create sweep config and use `wandb.agent()`
- **Distributed training**: Log from the main (rank-0) process only, or give per-process runs a shared `group` name
- **Offline mode**: Set `WANDB_MODE=offline` in environments without internet access; upload later with `wandb sync`
- **Private hosting**: Configure `WANDB_BASE_URL` for self-managed instances
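A sweep setup can be sketched as a configuration dictionary; the metric and parameter names here are illustrative and should match whatever the training script actually logs:

```python
# Sweep definition: Bayesian search minimizing validation loss.
# "val/loss", "learning_rate", and "batch_size" are illustrative names.
sweep_config = {
    "method": "bayes",  # also: "grid" or "random"
    "metric": {"name": "val/loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.01},
        "batch_size": {"values": [16, 32, 64]},
    },
}

# Register the sweep and launch an agent that calls your train() function:
# import wandb
# sweep_id = wandb.sweep(sweep_config, project="image-classification")
# wandb.agent(sweep_id, function=train, count=20)
```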
```python
import wandb
import torch

# Hyperparameters to track with the run
config = {
    "learning_rate": 0.001,
    "epochs": 10,
    "batch_size": 32,
    "architecture": "ResNet50",
}

with wandb.init(project="image-classification", config=config) as run:
    # Access config values through the run
    lr = run.config.learning_rate
    epochs = run.config.epochs

    # create_model, train_step, validate, train_loader, and val_loader
    # are placeholders for your own model and data pipeline
    model = create_model()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    for epoch in range(epochs):
        for batch_idx, (data, target) in enumerate(train_loader):
            loss = train_step(model, data, target, optimizer)

            # Log per-batch training metrics
            run.log({
                "train/loss": loss,
                "train/epoch": epoch,
                "train/batch": batch_idx,
            })

        # Log per-epoch validation metrics
        val_loss, val_acc = validate(model, val_loader)
        run.log({
            "val/loss": val_loss,
            "val/accuracy": val_acc,
            "epoch": epoch,
        })

    # Save model weights and upload the file with the run
    torch.save(model.state_dict(), "model.pth")
    run.save("model.pth")
```