Batch Object Detection — machine.dev docs

At a glance

Workload: Run DETR object detection on a batch of COCO 2017 images
Runner: gpu=t4, cpu=4, ram=16, tenancy=spot ($0.004/min)
Estimated cost: ~$0.05 per run (~10 min on a few hundred images)

This page shows how to run batch object detection on machine.dev GPU runners — the kind of pipeline you’d use to label a dataset, generate annotations, or analyze a video frame-by-frame.

When to use this

GPU-accelerated batch object detection is a good fit when you need to:

Process large collections of images efficiently
Extract and analyze objects, people, vehicles, or other entities
Generate metadata and annotations for computer vision datasets
Create analytics from visual content without manual intervention

How GPU-Powered Detection Works in CI/CD

The GPU Batch Object Detection workflow uses the DETR model (facebook/detr-resnet-50) with Hugging Face Transformers to detect and annotate objects in images. The workflow is defined in GitHub Actions and is triggered manually, with the runner tenancy as its one configurable input.

The detection process:

Sets up the necessary environment with GPU support
Downloads images from the COCO 2017 dataset
Processes the images using the DETR object detection model
Generates annotated results and a CSV file with detection details
Uploads the CSV file as a GitHub Actions artifact
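The detection and CSV steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the repository's actual object_detection.py: the function names, the CSV column names, and the 0.9 score threshold are assumptions made for illustration. The Transformers calls (DetrImageProcessor, DetrForObjectDetection, post_process_object_detection) follow the usage shown on the facebook/detr-resnet-50 model card.

```python
import csv
from pathlib import Path

MODEL_ID = "facebook/detr-resnet-50"  # model named on this page

# Hypothetical CSV schema; the real script may use different columns.
FIELDS = ["image", "label", "score", "x_min", "y_min", "x_max", "y_max"]


def to_rows(filename, scores, labels, boxes, id2label):
    """Flatten one image's detections into CSV-ready dicts."""
    rows = []
    for score, label, box in zip(scores, labels, boxes):
        x_min, y_min, x_max, y_max = (round(float(v), 1) for v in box)
        rows.append({
            "image": filename,
            "label": id2label[int(label)],
            "score": round(float(score), 3),
            "x_min": x_min, "y_min": y_min, "x_max": x_max, "y_max": y_max,
        })
    return rows


def detect(image_paths, threshold=0.9):
    """Run DETR on each image path and return CSV-ready rows."""
    # Heavy imports live here so the helpers work without torch installed.
    import torch
    from PIL import Image
    from transformers import DetrForObjectDetection, DetrImageProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # revision="no_timm" (from the model card) avoids the timm dependency.
    processor = DetrImageProcessor.from_pretrained(MODEL_ID, revision="no_timm")
    model = DetrForObjectDetection.from_pretrained(MODEL_ID, revision="no_timm")
    model = model.to(device).eval()

    rows = []
    for path in image_paths:
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt").to(device)
        with torch.no_grad():
            outputs = model(**inputs)
        # target_sizes is (height, width); PIL's size is (width, height).
        sizes = torch.tensor([image.size[::-1]])
        result = processor.post_process_object_detection(
            outputs, target_sizes=sizes, threshold=threshold
        )[0]
        rows += to_rows(
            Path(path).name,
            result["scores"].tolist(),
            result["labels"].tolist(),
            result["boxes"].tolist(),
            model.config.id2label,
        )
    return rows


def write_csv(rows, out_path):
    """Write detection rows to a CSV file, creating parent folders."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Under these assumptions, write_csv(detect(paths), "detection_output/detections.csv") would produce the file that the workflow's upload step expects, where paths is a list of downloaded image files.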

GitHub Actions Workflow for GPU Object Detection

The workflow is triggered manually (workflow_dispatch) and takes a single input, tenancy. Here’s the workflow definition:

name: Batch Object Detection

on:
  workflow_dispatch:
    inputs:
      tenancy:
        type: choice
        required: false
        description: 'The tenancy of the machine'
        default: 'spot'
        options:
          - 'spot'
          - 'on_demand'

jobs:
  detect_objects:
    name: Detect Objects
    runs-on:
      - machine
      - gpu=t4
      - cpu=4
      - ram=16
      - architecture=x64
      - tenancy=${{ inputs.tenancy }}
    steps:

      - uses: actions/checkout@v4

      - name: Set up Python 3.10
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install Dependencies
        run: |
          pip install -r requirements.txt

      - name: Run Object Detection
        run: python3 object_detection.py

      - name: Upload Detection Results CSV
        uses: actions/upload-artifact@v4
        with:
          name: detection-results-csv
          path: detection_output/detections.csv

Using machine.dev GPU Runners

This object detection process leverages machine.dev GPU runners to provide the necessary computing power for efficient image processing. The workflow is configured to use:

T4 GPU: An entry-level ML GPU with 16 GB VRAM, well-suited for computer vision tasks
Spot tenancy by default: Lowers cost for batch work; on_demand can be selected when the workflow is run
Configurable resources: CPU, RAM, and architecture are set through the runs-on labels (cpu=4, ram=16, architecture=x64)

For batch object detection, the T4 is typically much faster than CPU-only inference. For larger workloads or heavier models, change the gpu label in runs-on to a more powerful GPU such as L4 or L40S.
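Before running a long batch, it can help to confirm that the script actually sees the runner's GPU. The check below is a small sketch that assumes the script uses PyTorch; the function name describe_device is hypothetical.

```python
def describe_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is nothing to accelerate.
        return "cpu"
    if torch.cuda.is_available():
        # On the runner configured above this should report the T4.
        print(f"GPU: {torch.cuda.get_device_name(0)}")
        return "cuda"
    print("No CUDA device visible; falling back to CPU")
    return "cpu"
```

Calling describe_device() at the start of the script and failing fast when it returns "cpu" avoids paying for GPU minutes while inference silently runs on the CPU.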

Benefits of GPU-Accelerated Computer Vision in GitHub Actions

GPU Acceleration: Efficiently perform object detection using GPUs via machine.dev
Seamless Data Integration: Automatically fetch and process images from datasets
Automated Detection Pipeline: Detect, annotate, and export results without manual intervention
Results in CSV: Generate a structured CSV artifact with detection details for easy review
Easy Deployment: Use the repository as a GitHub template to quickly start your own GPU-accelerated workflows

Getting Started

To run the GPU Batch Object Detection workflow:

Use the MachineDotDev/gpu-batch-object-detection repository as a template
Navigate to the Actions tab in your repository
Select the “Batch Object Detection” workflow
Click “Run workflow” and choose the tenancy (spot or on_demand)
Start the run and wait for it to finish
Download the detection-results-csv artifact to view the detection details
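Once the artifact is downloaded, a few lines of Python are enough for a first look at the results. This sketch assumes the CSV has label and score columns; those column names are hypothetical, so adjust them to match the header of the real file.

```python
import csv
from collections import Counter


def count_labels(csv_path, min_score=0.0):
    """Count detections per label, ignoring rows below min_score."""
    counts = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # "score" and "label" are assumed column names.
            if float(row["score"]) >= min_score:
                counts[row["label"]] += 1
    return counts
```

For example, count_labels("detections.csv", min_score=0.9).most_common(5) would list the five most frequent high-confidence labels.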

Best Practices

Use spot instances for non-time-critical batch processing to optimize costs
Adjust batch sizes to match your GPU memory capacity
Pre-filter your input data to reduce processing time on irrelevant images
Consider dataset chunking for extremely large collections of images
Implement progress tracking for long-running batch jobs
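Batching, chunking, and progress tracking can share one small loop. The sketch below is illustrative only: chunked, run_in_batches, and the process_batch callback are hypothetical names, and the right batch size depends on your image sizes and GPU memory.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    paths: Sequence[T],
    batch_size: int,
    process_batch: Callable[[Sequence[T]], List[R]],
) -> List[R]:
    """Process paths batch by batch and log progress after each batch."""
    results: List[R] = []
    done = 0
    for batch in chunked(paths, batch_size):
        results.extend(process_batch(batch))
        done += len(batch)
        # flush=True makes the line appear promptly in the Actions log.
        print(f"processed {done}/{len(paths)} images", flush=True)
    return results
```

Here process_batch stands in for whatever runs the model on a list of images. If a batch runs out of GPU memory, lowering batch_size is usually the first thing to try.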

How to adapt this

Larger images / heavier model: swap gpu=t4 for gpu=l4 (24 GB VRAM, $0.006/min spot)
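Different model: change the model ID in the Python script. Most Hugging Face object detection models should work, though some need a different processor class or post-processing step
Run on a schedule: add a schedule trigger under on: (schedule: - cron: '0 2 * * *') to process new images nightly at 02:00 UTC. Scheduled runs have no inputs, so give the tenancy label a fallback such as tenancy=${{ inputs.tenancy || 'spot' }}
Trigger from a webhook: add on: repository_dispatch and POST to the repository's dispatches endpoint in the GitHub REST API when new data lands. The same tenancy fallback applies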
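For the webhook option, the POST goes to GitHub's repository dispatch endpoint (POST /repos/{owner}/{repo}/dispatches). The sketch below builds that request with the standard library. The function name, the new-images event type, and the payload are assumptions made for illustration, and the workflow would need a matching on: repository_dispatch trigger.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, token, event_type, payload=None):
    """Build (but do not send) a repository_dispatch request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = {"event_type": event_type, "client_payload": payload or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            # The token needs permission to write to the repository.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending it with urllib.request.urlopen(build_dispatch_request("my-org", "my-repo", token, "new-images")) should return HTTP 204 with an empty body when GitHub accepts the event. The owner and repository names here are placeholders.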

Next steps

Working repo — fork or use as a template
CPU vs GPU — picking GPU size for CV workloads
Parallel Hyperparameter Tuning — fan out the same workflow across multiple model variants