At a glance
- **Workload:** Train ResNet on CIFAR-10 across a 2×2 hyperparameter grid (learning rates × batch sizes) using the GitHub Actions matrix strategy
- **Runner:** `gpu=t4, cpu=4, ram=16` × 4 parallel jobs ($0.004/min each)
- **Estimated cost:** ~$0.55 per full sweep (~30 min wall-clock thanks to parallelism)
This page shows how to use a GitHub Actions matrix to fan out hyperparameter combinations across parallel machine.dev GPU runners, then aggregate the results in a single comparison job.
Use Case Overview
Why might you want to use parallel hyperparameter tuning?
- Find optimal model configurations more efficiently by testing multiple parameter sets simultaneously
- Reduce the total time needed for hyperparameter search
- Systematically compare model performance across different configurations
- Automate the process of identifying the best-performing models
How It Works
The Parallel Hyperparameter Tuning workflow uses GitHub Actions’ matrix strategy to run multiple training jobs concurrently. Each job trains a ResNet model on the CIFAR-10 dataset with a different combination of hyperparameters. The workflow is defined in GitHub Actions and can be triggered on-demand.
The tuning process:
- Defines a matrix of hyperparameter combinations to explore
- Launches multiple GPU-powered training jobs concurrently, one for each combination
- Saves performance metrics from each training run as artifacts
- Aggregates and compares results across all runs
- Generates a comprehensive comparison report
Workflow Implementation
Parallel hyperparameter tuning is implemented as a GitHub Actions workflow that runs multiple jobs in parallel. Here’s the workflow definition:
```yaml
name: ResNet Hyperparameter Tuning

on:
  workflow_dispatch:

jobs:
  hyperparameter_tuning:
    name: Hyperparameter Tuning
    runs-on:
      - machine
      - gpu=t4
      - cpu=4
      - ram=16
      - architecture=x64
    timeout-minutes: 30
    strategy:
      fail-fast: false
      matrix:
        learning_rate: [0.001, 0.0005]
        batch_size: [32, 64]
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install uv
        uses: astral-sh/setup-uv@v5

      - name: Install dependencies
        run: |
          uv venv .venv --python=3.10
          source .venv/bin/activate
          uv pip install -r requirements.txt
          deactivate

      - name: Train and Evaluate ResNet
        env:
          LEARNING_RATE: ${{ matrix.learning_rate }}
          BATCH_SIZE: ${{ matrix.batch_size }}
        run: |
          source .venv/bin/activate
          python train.py
          deactivate

      - name: Upload metrics artifact
        uses: actions/upload-artifact@v4
        with:
          name: metrics-${{ matrix.learning_rate }}-${{ matrix.batch_size }}
          path: metrics_*.json

  compare_tuning:
    needs: hyperparameter_tuning
    name: Compare Tuning Performance
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install uv
        uses: astral-sh/setup-uv@v5

      - name: Install dependencies
        run: |
          uv venv .venv --python=3.10
          source .venv/bin/activate
          uv pip install -r requirements.txt
          deactivate

      - name: Download all metrics
        uses: actions/download-artifact@v4
        with:
          path: metrics

      - name: Compare Metrics
        run: |
          source .venv/bin/activate
          python compare_metrics.py
          deactivate

      - name: Upload comparison results
        uses: actions/upload-artifact@v4
        with:
          name: comparison-results
          path: model_comparison.csv
```
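The training script itself lives in the template repository; the workflow only assumes it reads `LEARNING_RATE` and `BATCH_SIZE` from the environment and writes a file matching the `metrics_*.json` upload glob. A minimal sketch of that contract (the training stub and metric names here are illustrative placeholders, not the repo’s actual code):

```python
# train.py -- minimal sketch of the contract the workflow assumes:
# read hyperparameters from env vars, train, write metrics_*.json.
# The real script trains ResNet on CIFAR-10; the stub and metric
# names below are illustrative placeholders.
import json
import os


def train_and_evaluate(lr: float, batch_size: int) -> dict:
    # Placeholder for the actual ResNet/CIFAR-10 training loop.
    # Return whatever metrics you want to compare across runs.
    return {"val_accuracy": 0.0, "val_loss": 0.0}


def main() -> None:
    lr = float(os.environ["LEARNING_RATE"])
    batch_size = int(os.environ["BATCH_SIZE"])

    metrics = train_and_evaluate(lr, batch_size)
    metrics.update({"learning_rate": lr, "batch_size": batch_size})

    # The file name must match the workflow's upload glob: metrics_*.json
    out = f"metrics_lr{lr}_bs{batch_size}.json"
    with open(out, "w") as f:
        json.dump(metrics, f, indent=2)


if __name__ == "__main__":
    main()
```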
Key Features
The power of this implementation comes from several key features:
- **Matrix Strategy**: The workflow defines a matrix of hyperparameters, automatically creating a separate job for each combination. In this example, we’re exploring two learning rates (0.001, 0.0005) and two batch sizes (32, 64), resulting in 4 concurrent training jobs.
- **Parallel Execution**: Each hyperparameter combination runs as a separate job on its own GPU runner, allowing multiple experiments to run simultaneously rather than sequentially.
- **Metrics Collection**: Each training job produces performance metrics that are saved as artifacts with names that indicate the hyperparameter values used.
- **Automated Comparison**: After all training jobs complete, a separate job downloads all metrics and generates a comparison report, making it easy to identify the best configuration.
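For the comparison side, here is a minimal sketch of what a `compare_metrics.py` could look like. It assumes `download-artifact@v4` with only a `path` unpacks every artifact into its own subdirectory under `metrics/` (its documented behavior when no `name` is given), and that each JSON file carries the fields from the training sketch above; ranking by `val_accuracy` is an assumption:

```python
# compare_metrics.py -- minimal sketch, not the template repo's exact script.
# Assumes actions/download-artifact@v4 unpacked each artifact into
# metrics/<artifact-name>/metrics_*.json, and that each file contains
# the fields written by the train.py sketch (val_accuracy is assumed).
import csv
import json
from pathlib import Path


def main() -> None:
    runs = []
    for path in Path("metrics").rglob("metrics_*.json"):
        with open(path) as f:
            runs.append(json.load(f))

    if not runs:
        raise SystemExit("No metrics files found under metrics/")

    # Rank configurations by validation accuracy, best first.
    runs.sort(key=lambda r: r.get("val_accuracy", 0.0), reverse=True)

    with open("model_comparison.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=runs[0].keys())
        writer.writeheader()
        writer.writerows(runs)

    best = runs[0]
    print(f"Best config: lr={best['learning_rate']}, "
          f"batch_size={best['batch_size']}, "
          f"val_accuracy={best['val_accuracy']}")


if __name__ == "__main__":
    main()
```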
Using machine.dev GPU Runners
This hyperparameter tuning process leverages machine.dev GPU runners to provide the necessary computing power for efficient model training. The workflow is configured to use:
- T4 GPU: An entry-level ML GPU with 16GB VRAM, well-suited for training moderate-sized models
- Configurable resources: CPU, RAM, and architecture specifications optimized for each training job
The parallel nature of this approach means that you can complete a hyperparameter search in a fraction of the time it would take to run sequentially, even when using the same hardware resources per job.
Best Practices
- Choose parameters wisely: Select hyperparameters that have the most impact on model performance
- Start with a broad search: Begin with a wide range of values, then refine with narrower ranges around promising values
- Consider resource allocation: Adjust CPU/RAM requirements based on your specific model and dataset needs
- Set appropriate timeouts: Ensure your workflow timeout is sufficient for all jobs to complete
- Use `fail-fast: false`: this ensures all combinations are evaluated even if some fail, giving you a complete picture
Getting Started
To run the Parallel Hyperparameter Tuning workflow:
1. Use the MachineDotDev/parallel-hyperparameter-tuning repository as a template
2. Navigate to the Actions tab in your repository
3. Select the “ResNet Hyperparameter Tuning” workflow
4. Click “Run workflow” to start the tuning process
5. Wait for all jobs to complete
6. Download the `comparison-results` artifact to identify the best hyperparameter configuration
Customizing the Workflow
You can easily adapt this workflow for your own models and hyperparameters:
- Modify the matrix in the workflow file to include your specific hyperparameters (see the sketch after this list)
- Update the training script (train.py) to work with your model and dataset
- Adjust the metrics collection to capture the performance indicators most relevant to your task
- Customize the comparison script (compare_metrics.py) to generate insights tailored to your needs
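As a sketch of what an extended matrix could look like (the `optimizer` and `weight_decay` keys here are illustrative, and `train.py` would need to read the corresponding environment variables):

```yaml
# Illustrative matrix extension: 2 x 2 x 2 x 2 = 16 parallel jobs.
# train.py would need to read OPTIMIZER and WEIGHT_DECAY env vars too.
strategy:
  fail-fast: false
  matrix:
    learning_rate: [0.001, 0.0005]
    batch_size: [32, 64]
    optimizer: [adam, sgd]
    weight_decay: [0.0, 0.0001]
```

Every key you add multiplies the number of concurrent jobs, so a fully crossed matrix grows quickly; keep that in mind when budgeting runner minutes.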
How to adapt this
- More hyperparameters: add `optimizer`, `weight_decay`, `dropout`, etc. to the matrix — every combination spawns its own runner
- Larger sweep: a matrix with 5×5×5 = 125 combinations is fine; each combination runs on its own runner concurrently
- Larger model: bump to `gpu=l4` or `gpu=a10g` if the T4’s 16 GB VRAM is too tight
- Use spot: add `tenancy=spot` to cut costs by 70–90% (the example currently uses on-demand)
Next steps
- Working repo — fork or use as a template
- Cost Optimization — spot pricing for sweep economics
- LLM Supervised Fine-Tuning — apply the same matrix technique to LLM fine-tunes