GitHub Actions Syntax — machine.dev docs

This page is a quick reference for the GitHub Actions YAML syntax used with machine.dev runners. For the full per-label reference, see Configuration options.

The basics

Every machine.dev job needs the machine label and exactly one runner type label, either cpu=N for a CPU runner or gpu=TYPE for a GPU runner:

jobs:
  cpu-job:
    runs-on: [machine, cpu=8]            # CPU job

  gpu-job:
    runs-on: [machine, gpu=l4]           # GPU job

How machine.dev reads your requirements

machine.dev doesn’t inject environment variables into your job. Instead, the labels you put in runs-on: are how machine.dev knows what to provision. The Machine Provisioner reads gpu=l4, tenancy=spot, regions=eu-south-2, etc., launches a matching AWS instance, and the runner inside that instance dials home to GitHub and registers as an ephemeral self-hosted runner.

The labels are your source of truth. Inside the job itself, the environment is whatever the runner image provides. To inspect what you got, run commands such as nvidia-smi, nproc, lscpu, or uname -m in a step.
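
As a minimal sketch of that inspection, the job below prints what was provisioned before doing any real work. The job name inspect and the step name are hypothetical, and the output depends on the runner image.

```yaml
jobs:
  inspect:                          # hypothetical job name
    runs-on: [machine, gpu=l4]
    steps:
      - name: Show what was provisioned
        run: |
          uname -m      # CPU architecture of the instance
          nproc         # number of CPUs visible to the job
          lscpu         # detailed CPU information
          nvidia-smi    # GPU model, driver and memory (GPU runners only)
```

On a cpu= runner, drop the nvidia-smi line, since the command is only expected to exist on GPU runners.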

All available labels

See Configuration options for the complete table. The summary:

machine — required, always first
cpu=N or gpu=TYPE — required, exactly one
architecture=arm64 — optional, CPU runners only (T4G GPU is always ARM)
cpu=N and ram=N — optional, only with gpu=
tenancy=spot — optional, default on_demand
regions=... — optional, comma-separated AWS region codes
disk_size=N, disk_iops=N, disk_throughput=N — optional, EBS gp3 storage tuning
metrics=true|false, metrics_interval=N — optional, metrics control
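
The sketch below combines several of these labels on an ARM CPU runner. The job name is hypothetical, and the second region code is illustrative only; check Regions for the codes machine.dev actually supports.

```yaml
jobs:
  arm-build:                            # hypothetical job name
    runs-on:
      - machine                         # required, always first
      - cpu=8                           # runner type: CPU
      - architecture=arm64              # CPU runners only
      - tenancy=spot
      - regions=eu-south-2,eu-west-1    # second region is illustrative only
      - metrics=false
    steps:
      - run: uname -m                   # report the CPU architecture
```

The block list form is used here on purpose. In the inline form [machine, ...], YAML treats every comma as an item separator, so an unquoted comma-separated regions= value would be split into separate labels. Use the block form, or quote the label, whenever a value contains a comma.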

Examples

Spot CPU build

jobs:
  build:
    runs-on: [machine, cpu=16, tenancy=spot]
    steps:
      - uses: actions/checkout@v4
      - run: make -j$(nproc)

EU-only GPU job

jobs:
  train:
    runs-on:
      - machine
      - gpu=l4
      - regions=eu-south-2
    steps:
      - uses: actions/checkout@v4
      - run: python train.py

Matrix across GPU types

jobs:
  bench:
    strategy:
      fail-fast: false
      matrix:
        gpu: [t4g, t4, l4, a10g, l40s]
    runs-on:
      - machine
      - "gpu=${{ matrix.gpu }}"
      - tenancy=spot
    steps:
      - uses: actions/checkout@v4
      - run: ./bench.sh ${{ matrix.gpu }}

Mixing GitHub-hosted and machine.dev runners

Runners are chosen per job, so you can run lightweight jobs such as linting on the standard ubuntu-latest runner and use a machine.dev runner only for the heavy work:

jobs:
  lint:
    runs-on: ubuntu-latest          # GitHub-hosted, free for public repos
    steps:
      - uses: actions/checkout@v4
      - run: ./lint.sh

  train:
    needs: lint
    runs-on: [machine, gpu=a10g, tenancy=spot]
    steps:
      - uses: actions/checkout@v4
      - run: python train.py

Combining a CPU build job with a GPU train job

jobs:
  build:
    runs-on: [machine, cpu=16, tenancy=spot]
    steps:
      - uses: actions/checkout@v4
      - run: make -j$(nproc)
      - uses: actions/upload-artifact@v4
        with:
          name: build
          path: dist/

  train:
    needs: build
    runs-on: [machine, gpu=l4, tenancy=spot]
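    steps:
      - uses: actions/checkout@v4          # assumes train.py lives in the repository
      - uses: actions/download-artifact@v4
        with:
          name: build
          path: dist/                      # restore the artifact to the path it was uploaded from
      - run: python train.py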

Custom storage and metrics

jobs:
  big-train:
    runs-on:
      - machine
      - gpu=l40s
      - tenancy=spot
      - disk_size=500          # 500 GB root volume
      - disk_iops=10000        # 10,000 IOPS
      - disk_throughput=750    # 750 MB/s throughput
      - metrics_interval=10    # Metrics every 10 seconds
    steps:
      - uses: actions/checkout@v4
      - run: python train.py

Job timeouts

Self-hosted runners, including machine.dev runners, are subject to GitHub’s usage limits for self-hosted runners:

Maximum job execution time: 5 days (GitHub’s hard limit for a single job on a self-hosted runner)
Maximum workflow run time: 35 days (including queue time)

If you set nothing, GitHub’s default timeout-minutes of 360 (6 hours) applies. For cost safety on long-running jobs, set an explicit, tighter timeout on the job:

jobs:
  train:
    runs-on: [machine, gpu=l4]
    timeout-minutes: 60   # GitHub cancels the job after 1 hour
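
GitHub Actions also accepts timeout-minutes on individual steps, which can help when only one step is at risk of running long. A minimal sketch, with a hypothetical step name and arbitrary limits:

```yaml
jobs:
  train:
    runs-on: [machine, gpu=l4]
    timeout-minutes: 120            # upper bound for the whole job
    steps:
      - uses: actions/checkout@v4
      - name: Train                 # hypothetical step name
        run: python train.py
        timeout-minutes: 90         # stop this step before the job limit is reached
```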

Common issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| Job stays queued | Self-hosted runners not enabled for the repo | Enable self-hosted runners |
| "No runner matching the specified labels" | Typo in `gpu=` or unsupported region | Check Configuration and Regions |
| Spot interruption mid-job | Normal: spot can be reclaimed at any time | Use `tenancy=on_demand` or add checkpointing |
| `nvidia-smi: command not found` | Used `cpu=` instead of `gpu=` | Change `runs-on` to `gpu=t4g` (or another GPU type) |
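
One way to soften spot interruptions without giving up spot pricing is to fall back to on-demand only when the spot job fails. The sketch below uses the standard failure() status check. It assumes that a reclaimed spot instance surfaces in GitHub as a failed job, which this page does not confirm, and both job names are hypothetical.

```yaml
jobs:
  train-spot:                       # hypothetical job name
    runs-on: [machine, gpu=l4, tenancy=spot]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - run: python train.py

  train-on-demand:                  # hypothetical job name
    needs: train-spot
    if: ${{ failure() }}            # run only when an upstream job failed
    runs-on: [machine, gpu=l4, tenancy=on_demand]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - run: python train.py
```

The fallback also runs when train.py fails for ordinary reasons, and the workflow run still reports the failed spot job, so treat this as a starting point rather than a replacement for checkpointing.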

Configuration options — every label, every default
Workflow setup — patterns for real workflows
Cost Optimization — spot, checkpointing, right-sizing
GitHub Actions Documentation — generic GH Actions reference