At a glance
Workload: Run DETR object detection on a batch of COCO 2017 images.
Runner: gpu=t4, cpu=4, ram=16, tenancy=spot ($0.004/min).
Estimated cost: ~$0.05 per run (~10 min on a few hundred images).
This page shows how to run batch object detection on machine.dev GPU runners: the kind of pipeline you’d use to label a dataset, generate annotations, or analyse a video frame by frame.
When to use this
A few reasons you might run GPU-accelerated batch object detection in CI:
- Process large image collections without hand-rolling a server
- Pull objects, people, vehicles, or other entities out of a corpus
- Generate metadata and annotations for a CV training set
- Build analytics from visual content with no manual review
How it works
The pipeline uses the DETR model (facebook/detr-resnet-50) via Hugging Face Transformers. It runs as a GitHub Actions workflow you trigger on demand with input parameters.
The job:
- Sets up Python and CUDA on a GPU runner
- Downloads images from the COCO 2017 dataset
- Runs DETR over each image
- Writes annotations and a per-image CSV
- Uploads the CSV as a GitHub Actions artifact
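The template's object_detection.py is not reproduced on this page. As a rough sketch, the steps above could look something like the following, assuming the script follows the usage shown on the facebook/detr-resnet-50 model card. The function names, the 0.9 score threshold, the CSV column names, and the use of revision="no_timm" are assumptions for illustration, not the template's actual code.

```python
import csv
from pathlib import Path

# Hypothetical column layout; the template's real CSV may differ.
FIELDS = ["image", "label", "score", "x_min", "y_min", "x_max", "y_max"]


def detect(image_paths, threshold=0.9):
    """Run DETR over each image and return one dict per detection."""
    # Heavy imports live inside the function so write_csv works without them.
    import torch
    from PIL import Image
    from transformers import DetrForObjectDetection, DetrImageProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model_id = "facebook/detr-resnet-50"
    # revision="no_timm" follows the model card and avoids the timm dependency.
    processor = DetrImageProcessor.from_pretrained(model_id, revision="no_timm")
    model = DetrForObjectDetection.from_pretrained(model_id, revision="no_timm")
    model.to(device).eval()

    rows = []
    for path in image_paths:
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt").to(device)
        with torch.no_grad():
            outputs = model(**inputs)
        # PIL reports (width, height); post-processing expects (height, width).
        target_sizes = torch.tensor([image.size[::-1]], device=device)
        result = processor.post_process_object_detection(
            outputs, target_sizes=target_sizes, threshold=threshold
        )[0]
        for score, label, box in zip(
            result["scores"], result["labels"], result["boxes"]
        ):
            x_min, y_min, x_max, y_max = (round(v, 1) for v in box.tolist())
            rows.append({
                "image": str(path),
                "label": model.config.id2label[label.item()],
                "score": round(score.item(), 3),
                "x_min": x_min, "y_min": y_min,
                "x_max": x_max, "y_max": y_max,
            })
    return rows


def write_csv(rows, out_path="detection_output/detections.csv"):
    """Write detection rows to the path the workflow uploads as an artifact."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

With a local folder of JPEGs (the folder name here is hypothetical), a run would be `write_csv(detect(sorted(Path("images").glob("*.jpg"))))`.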
Workflow
name: Batch Object Detection
on:
  workflow_dispatch:
    inputs:
      tenancy:
        type: choice
        required: false
        description: 'The tenancy of the machine'
        default: 'spot'
        options:
          - 'spot'
          - 'on_demand'
jobs:
  detect_objects:
    name: Detect Objects
    runs-on: machine/gpu=t4/cpu=4/ram=16/architecture=x64/tenancy=${{ inputs.tenancy }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Install Dependencies
        run: |
          pip install -r requirements.txt
      - name: Run Object Detection
        run: python3 object_detection.py
      - name: Upload Detection Results CSV
        uses: actions/upload-artifact@v4
        with:
          name: detection-results-csv
          path: detection_output/detections.csv
Runner config
The default runner used here:
- T4 GPU: 16 GB VRAM, plenty for DETR-resnet-50
- Spot tenancy: ~70-90% cheaper than on-demand
- Configurable CPU, RAM, and architecture
The T4 is fast enough for batch CV pipelines. For heavier models or larger images, step up to L4 or L40S.
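To make the cost trade-off concrete, here is a small back-of-the-envelope sketch using the spot rates quoted on this page ($0.004/min for the T4, $0.006/min for the L4). It assumes cost scales linearly with runtime and, for comparison only, that both GPUs take the same 10 minutes; in practice a faster GPU may finish sooner. The function name is hypothetical.

```python
# Spot rates quoted on this page, in USD per minute.
SPOT_RATE_PER_MIN = {"t4": 0.004, "l4": 0.006}


def estimate_cost(gpu, minutes):
    """Estimate run cost as rate x minutes (assumes linear per-minute cost)."""
    return round(SPOT_RATE_PER_MIN[gpu] * minutes, 4)


for gpu in SPOT_RATE_PER_MIN:
    print(f"{gpu}: ~${estimate_cost(gpu, 10):.2f} for a 10-minute run")
```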
Why CI
A few things you get from running this in GitHub Actions rather than a long-lived box:
- GPU acceleration without a dedicated server
- Datasets fetched on demand, no persistent disk
- Pipeline runs end to end with no manual review needed
- Output is a CSV artifact, easy to attach to a PR or download
- The whole repo is a template you can fork
Getting started
- Use MachineDotDev/gpu-batch-object-detection as a template
- Open the Actions tab in your repository
- Pick the “Batch Object Detection” workflow
- Click “Run workflow” and pick spot or on-demand tenancy
- Run it and wait
- Download the detection-results-csv artifact to inspect the detections
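Once the artifact is downloaded, a quick way to sanity-check it is to count detections per class. The sketch below uses only the standard library and assumes the CSV has a column named label; that column name is a guess, so adjust it to match the file's actual header.

```python
import csv
from collections import Counter


def label_counts(csv_path, label_column="label"):
    """Count how many detections each class has in the results CSV."""
    with open(csv_path, newline="") as fh:
        return Counter(row[label_column] for row in csv.DictReader(fh))
```

For example, `label_counts("detections.csv").most_common(5)` lists the five most frequent classes.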
Tips
- Spot tenancy is fine for batch CV jobs. Worst case you re-run.
- Tune batch size to fit GPU memory comfortably
- Pre-filter inputs so you don’t waste minutes on images you don’t care about
- For very large datasets, chunk the input and parallelise across multiple workflow runs
- Print progress every N images so a stalled run is easy to spot
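The batching and progress tips can be sketched with the standard library alone. The helper names, the default batch size of 8, and the logging interval of 50 are illustrative assumptions; handle_batch stands in for whatever runs the model on one batch.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def process_all(paths, handle_batch, batch_size=8, log_every=50):
    """Process paths in batches, printing progress roughly every log_every images."""
    done = 0
    next_log = log_every
    for batch in chunked(paths, batch_size):
        handle_batch(batch)
        done += len(batch)
        if done >= next_log:
            # flush=True so the line shows up in the Actions log right away.
            print(f"processed {done}/{len(paths)} images", flush=True)
            next_log = (done // log_every + 1) * log_every
    return done
```

The same chunked helper can split a large input list into pieces for separate workflow runs, with each run receiving one chunk.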
How to adapt this
- Larger images or heavier model: swap gpu=t4 for gpu=l4 (24 GB VRAM, $0.006/min spot)
- Different model: change the model ID in the Python script. Any Hugging Face object-detection model works.
- Run on a schedule: add an on: schedule trigger with cron: '0 2 * * *' to process new images nightly
- Trigger from a webhook: add on: repository_dispatch and POST to GitHub when new data lands
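For the webhook option, the POST goes to GitHub's REST endpoint for repository dispatch events. The sketch below only builds the request, using the standard library. The event type new-images, the function name, and the payload contents are hypothetical, and the token needs permission to create dispatch events on the repository.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, token, event_type="new-images", payload=None):
    """Build a POST request that fires a repository_dispatch event."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps(
        {"event_type": event_type, "client_payload": payload or {}}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it is `urllib.request.urlopen(build_dispatch_request(...))`; GitHub answers with 204 No Content on success. One thing to watch: inputs.tenancy is empty for runs that are not started through workflow_dispatch, so for scheduled or webhook-triggered runs the runs-on label likely needs a fallback such as ${{ inputs.tenancy || 'spot' }}.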
Next steps
- Working repo: fork or use as a template
- CPU vs GPU: picking GPU size for CV workloads
- Parallel Hyperparameter Tuning: fan the same workflow out across multiple model variants