This page is a quick reference for the GitHub Actions YAML syntax for machine.dev runners. For the full per-selector reference see Configuration options.
The basics
Pack every selector into a single label string, led by machine and separated by /. Every job needs the leading machine sentinel and exactly one runner type selector (cpu=N or gpu=TYPE):
jobs:
  cpu-job:
    runs-on: machine/cpu=8 # CPU job
  gpu-job:
    runs-on: machine/gpu=l4 # GPU job
Add more selectors with /: machine/gpu=l4/tenancy=spot/regions=eu-south-2. To keep a runner from being grabbed by another job, thread the run id with id=: machine/id=${{ github.run_id }}/gpu=l4. See Configuration → Label format for the full rules, the runs-on.com migration map, and the legacy list-form note.
How machine.dev reads your requirements
machine.dev doesn’t inject environment variables into your job. Instead, the selectors you pack into runs-on: are how machine.dev knows what to provision. The Machine Provisioner reads gpu=l4, tenancy=spot, regions=eu-south-2, etc., launches a matching AWS instance, and the runner inside that instance dials home to GitHub and registers as an ephemeral self-hosted runner.
The label is your source of truth. Inside the job itself, the environment is whatever the runner image provides. You can run nvidia-smi, nproc, lscpu, uname -m, and similar commands to inspect what you got.
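As a minimal sketch of that inspection, a first step can print what the runner reports about itself. The job name inspect is hypothetical, and nvidia-smi is assumed to be available only on gpu= runners (see Common issues).

```yaml
jobs:
  inspect: # hypothetical job name
    runs-on: machine/gpu=l4
    steps:
      - name: Show provisioned hardware
        run: |
          nvidia-smi   # GPU model, driver version, memory
          nproc        # number of CPU cores
          lscpu        # CPU details
          uname -m     # architecture, e.g. x86_64 or aarch64
```

Running this once against a new label is a cheap way to confirm the selectors produced the instance you expected before committing to a long job.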
All available selectors
See Configuration options for the complete table. The summary:
- machine: required, always the leading segment
- id=${{ github.run_id }}: recommended; pins the runner to this job so it can’t be stolen
- cpu=N or gpu=TYPE: required, exactly one
- architecture=arm64: optional, CPU runners only (T4G GPU is always ARM)
- cpu=N and ram=N: optional, only with gpu=
- tenancy=spot: optional, default on_demand
- regions=...: optional, comma-separated AWS region codes
- disk_size=N, disk_iops=N, disk_throughput=N: optional, EBS gp3 storage tuning
- metrics=true|false, metrics_interval=N: optional, metrics control
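The Examples section does not exercise every selector, so here is a rough sketch of a few combinations it skips. The job names are hypothetical, the numeric values are placeholders, the unit for ram= is not stated on this page, and eu-west-1 is only an illustrative second region; check Configuration options for the accepted values.

```yaml
jobs:
  arm-build: # hypothetical job name
    # CPU runner on ARM; architecture= applies to CPU runners only
    runs-on: machine/cpu=8/architecture=arm64
    steps:
      - run: uname -m
  sized-gpu: # hypothetical job name
    # cpu= and ram= are accepted alongside gpu= (values are illustrative)
    runs-on: machine/gpu=l4/cpu=8/ram=32
    steps:
      - run: nproc
  multi-region: # hypothetical job name
    # regions= takes a comma-separated list of AWS region codes
    runs-on: machine/gpu=l4/regions=eu-south-2,eu-west-1
    steps:
      - run: nvidia-smi
```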
Examples
Spot CPU build
jobs:
  build:
    runs-on: machine/cpu=16/tenancy=spot
    steps:
      - uses: actions/checkout@v4
      - run: make -j$(nproc)
EU-only GPU job
jobs:
  train:
    runs-on: machine/gpu=l4/regions=eu-south-2
    steps:
      - uses: actions/checkout@v4
      - run: python train.py
Matrix across GPU types
Matrix jobs share a label set, which is exactly when job-stealing bites. Make the id per-job-unique so each matrix leg gets its own runner:
jobs:
  bench:
    strategy:
      fail-fast: false
      matrix:
        gpu: [t4g, t4, l4, a10g, l40s]
    runs-on: machine/id=${{ github.run_id }}-${{ matrix.gpu }}/gpu=${{ matrix.gpu }}/tenancy=spot
    steps:
      - uses: actions/checkout@v4
      - run: ./bench.sh ${{ matrix.gpu }}
Mixing GitHub-hosted and machine.dev runners
You can run lightweight jobs on the standard ubuntu-latest runner and switch to a machine.dev runner for the heavy work:
jobs:
  lint:
    runs-on: ubuntu-latest # GitHub-hosted, free for public repos
    steps:
      - uses: actions/checkout@v4
      - run: ./lint.sh
  train:
    needs: lint
    runs-on: machine/gpu=a10g/tenancy=spot
    steps:
      - uses: actions/checkout@v4
      - run: python train.py
Combining a CPU build job with a GPU train job
jobs:
  build:
    runs-on: machine/id=${{ github.run_id }}-build/cpu=16/tenancy=spot
    steps:
      - uses: actions/checkout@v4
      - run: make -j$(nproc)
      - uses: actions/upload-artifact@v4
        with:
          name: build
          path: dist/
  train:
    needs: build
    runs-on: machine/id=${{ github.run_id }}-train/gpu=l4/tenancy=spot
    steps:
      - uses: actions/checkout@v4 # assumes train.py lives in the repository
      - uses: actions/download-artifact@v4
        with:
          name: build
          path: dist/ # restore the build output to the same location
      - run: python train.py
Custom storage and metrics
jobs:
  big-train:
    runs-on: machine/gpu=l40s/tenancy=spot/disk_size=500/disk_iops=10000/disk_throughput=750/metrics_interval=10
    # disk_size=500 → 500 GB root volume
    # disk_iops=10000 → 10,000 IOPS
    # disk_throughput=750 → 750 MB/s throughput
    # metrics_interval=10 → metrics every 10 seconds
    steps:
      - uses: actions/checkout@v4
      - run: python train.py
Job timeouts
Self-hosted runners (including machine.dev) inherit GitHub’s job-level timeout limits:
- Maximum job runtime: 5 days (GitHub’s hard limit for self-hosted runners)
- Maximum workflow runtime: 35 days (including queue time)
For cost safety on long-running jobs, set a tighter timeout in the workflow:
jobs:
  train:
    runs-on: machine/gpu=l4
    timeout-minutes: 60 # Fail after 1 hour
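GitHub Actions also accepts timeout-minutes on an individual step, which can bound one expensive command without lowering the ceiling for the whole job. A sketch, with both durations picked arbitrarily:

```yaml
jobs:
  train:
    runs-on: machine/gpu=l4
    timeout-minutes: 120 # whole-job ceiling
    steps:
      - uses: actions/checkout@v4
      - run: python train.py
        timeout-minutes: 90 # this step alone fails after 90 minutes
```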
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Job stays queued | Self-hosted runners not enabled for the repo | Enable self-hosted runners |
| "No runner matching the specified labels" | Typo in gpu= or unsupported region | Check Configuration and Regions |
| Spot interruption mid-job | Normal: spot can be reclaimed at any time | Use tenancy=on_demand or add checkpointing |
| nvidia-smi: command not found | Used cpu= instead of gpu= | Change runs-on to gpu=t4g (or another GPU type) |
Related
- Configuration options: every label, every default
- Workflow setup: patterns for real workflows
- Cost Optimization: spot, checkpointing, right-sizing
- GitHub Actions Documentation: generic GH Actions reference