GitHub Self-Hosted Runners on AWS: Pull vs Push for On-Demand Scaling

GitHub-hosted runners are convenient, but they get expensive fast. Once your team ships dozens of workflows a day — or your builds take 20+ minutes — the per-minute billing stacks up. The alternative: run your own runners on AWS and only pay for the compute you actually use.

The catch is how you spin runners up. There are two fundamentally different approaches: pull (runners poll GitHub for jobs) and push (GitHub events trigger runner creation). Each has real trade-offs in latency, complexity, and cost. This post walks through both so you can pick the right one for your setup.

Background: How GitHub Self-Hosted Runners Work

Before comparing patterns, it helps to know what a self-hosted runner actually does:

A runner process registers itself with GitHub using a registration token.
It opens a long-poll connection to api.github.com and waits for jobs.
When a job is dispatched, the runner claims it, executes the steps, and reports results.
After the job finishes, an ephemeral runner exits; a persistent runner loops back to waiting.

The key detail: GitHub does not push jobs to runners — runners pull them. This means even "push-based" architectures in this post refer to how you create the runner process, not how jobs are dispatched.
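
To make the pull semantics concrete, here is a toy model of the lifecycle described above. It is only a sketch: `run_runner` and the in-memory queue are hypothetical stand-ins for the real runner binary and GitHub's job queue, and nothing here talks to GitHub.

```python
from collections import deque


def run_runner(queue, ephemeral):
    """Toy model of one runner process (not the real runner binary).

    The runner pulls jobs from the queue; nothing is pushed to it.
    An ephemeral runner exits after one job, a persistent one keeps looping.
    """
    completed = []
    while queue:
        job = queue.popleft()      # the runner claims the next queued job
        completed.append(job)      # "execute" it and report the result
        if ephemeral:
            break                  # ephemeral runners exit after a single job
    return completed


jobs = deque(["build", "test", "deploy"])
print(run_runner(jobs, ephemeral=True))    # ['build']
print(run_runner(jobs, ephemeral=False))   # ['test', 'deploy']
```

The ephemeral call leaves two jobs in the queue, which is why both architectures in this post need something that keeps creating new runner processes.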

Section 1 — Pull-Based: Runners Poll GitHub for Jobs

How It Works

In the pull model, you pre-provision a pool of runner processes on AWS. Each runner is always-on (or warmed up ahead of time) and continuously long-polls GitHub for available jobs. As soon as a workflow triggers, one of the idle runners picks it up immediately.

AWS Implementation

The most common pull setup uses an Auto Scaling Group (ASG) of EC2 instances, each running the GitHub Actions runner binary.

1. Launch Template (User Data)

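#!/bin/bash
set -eu

# Install the runner under ec2-user; config.sh refuses to run as root by default
RUNNER_DIR=/home/ec2-user/actions-runner
mkdir -p "$RUNNER_DIR" && cd "$RUNNER_DIR"
curl -o runner.tar.gz -L https://github.com/actions/runner/releases/download/v2.317.0/actions-runner-linux-x64-2.317.0.tar.gz
tar xzf runner.tar.gz
chown -R ec2-user:ec2-user "$RUNNER_DIR"

# Fetch registration token from SSM (stored by your bootstrap Lambda).
# Registration tokens expire after 1 hour, so the Lambda must keep this parameter fresh.
TOKEN=$(aws ssm get-parameter --name /github/runner-token --with-decryption --query Parameter.Value --output text)

# Register as ephemeral runner (user data runs as root, so drop to ec2-user)
sudo -u ec2-user ./config.sh \
  --url https://github.com/YOUR_ORG \
  --token "$TOKEN" \
  --ephemeral \
  --unattended \
  --labels aws,on-demand,production

# Run and exit when job completes (ephemeral)
sudo -u ec2-user ./run.sh || true

# Power off after the job so the ASG replaces this instance with a fresh runner
shutdown -h now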
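
The user data script above reads its token from SSM but the bootstrap Lambda that writes it is not shown. A minimal sketch of that step might look like the following. The function name `refresh_runner_token` and the injected `http_post` and `ssm` arguments are assumptions made for testability; in a real Lambda they would be `requests.post` and `boto3.client("ssm")`.

```python
def refresh_runner_token(http_post, ssm, org, pat,
                         parameter_name="/github/runner-token"):
    """Request a fresh registration token and store it in SSM.

    http_post and ssm are injected so the function can be exercised
    without network access (hypothetical design, not from the original).
    """
    resp = http_post(
        f"https://api.github.com/orgs/{org}/actions/runners/registration-token",
        headers={
            "Authorization": f"Bearer {pat}",
            "Accept": "application/vnd.github+json",
        },
        timeout=10,
    )
    resp.raise_for_status()
    token = resp.json()["token"]

    # SecureString matches the --with-decryption flag in the user data script.
    ssm.put_parameter(
        Name=parameter_name,
        Value=token,
        Type="SecureString",
        Overwrite=True,
    )
    return token
```

Because registration tokens expire after an hour, this would need to run on a schedule shorter than that (every 30 minutes, say) so that instances launched by the ASG always find a valid token.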

2. Auto Scaling Policy

Scale the ASG based on a custom CloudWatch metric, the number of idle runners, published by a lightweight polling Lambda that runs on a schedule:

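import boto3
import requests  # not bundled with the Lambda runtime; package it or use a layer

def get_secret(name):
    # Read a plaintext secret from AWS Secrets Manager
    sm = boto3.client("secretsmanager")
    return sm.get_secret_value(SecretId=name)["SecretString"]

def lambda_handler(event, context):
    token = get_secret("github-pat")
    headers = {"Authorization": f"Bearer {token}"}

    # List the org's self-hosted runners (first page only; paginate beyond 100)
    resp = requests.get(
        "https://api.github.com/orgs/YOUR_ORG/actions/runners",
        headers=headers,
        params={"per_page": 100},
        timeout=10,
    )
    resp.raise_for_status()
    runners = resp.json().get("runners", [])
    busy = sum(1 for r in runners if r["status"] == "online" and r["busy"])
    idle = sum(1 for r in runners if r["status"] == "online" and not r["busy"])
    print(f"busy={busy} idle={idle}")  # shows up in CloudWatch Logs for debugging

    # Publish the idle count as the scaling metric
    cw = boto3.client("cloudwatch")
    cw.put_metric_data(
        Namespace="GitHub/Runners",
        MetricData=[{"MetricName": "IdleRunners", "Value": idle, "Unit": "Count"}]
    )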

Then wire a scale-in policy on IdleRunners > desired_buffer and scale-out on IdleRunners < 1.
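
The two thresholds can be read as a small decision rule. The sketch below only illustrates the logic; in practice the comparison is done by CloudWatch alarms attached to ASG scaling policies, and `scaling_decision` is a hypothetical name.

```python
def scaling_decision(idle_runners, desired_buffer):
    """Return 'scale_out', 'scale_in', or 'hold' for the thresholds above."""
    if idle_runners < 1:
        return "scale_out"      # no spare runner: the next job would have to wait
    if idle_runners > desired_buffer:
        return "scale_in"       # more idle runners than the buffer allows
    return "hold"


print(scaling_decision(0, desired_buffer=2))   # scale_out
print(scaling_decision(2, desired_buffer=2))   # hold
print(scaling_decision(5, desired_buffer=2))   # scale_in
```

The gap between the two thresholds acts as a dead band, which keeps the ASG from flapping when the idle count hovers around the buffer size.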

Pull Model Trade-offs

Aspect	Detail
Latency	Near-zero — idle runners claim jobs in seconds
Cost	Higher idle cost; you pay for runners waiting for work
Complexity	Low — just an ASG + User Data script
Best for	High-throughput teams with frequent, unpredictable job bursts
Risk	Idle runners accumulate cost on quiet nights/weekends

Pull is the right choice when your team ships code constantly and cold-start latency (even 60 seconds) would break developer flow. The idle cost is worth the instant feedback.

Pull Model — Cost Breakdown

The dominant cost driver is idle EC2 time. Runners sitting in your ASG waiting for jobs still generate an hourly bill.

Example: team running 500 jobs/month, avg 10 min each

Resource	On-Demand	EC2 Spot (~70% off)
2× t3.large idle 24/7 (2 vCPU, 8 GB)	$0.0832/hr × 2 × 720 hr = $119.81/mo	~$0.025/hr × 2 × 720 hr = $36/mo
CloudWatch custom metric (idle runner count)	~$0.30/mo	~$0.30/mo
Polling Lambda (e.g. once a minute, ~43,200 invocations/mo)	Free tier	Free tier
Total	~$120/mo	~$36/mo

Tip: Using EC2 Spot for your ASG is the single biggest lever. With Spot, idle runners cost ~$0.025/hr (t3.large us-east-1) vs $0.0832/hr on-demand. Just configure a mixed-instance policy with a fallback On-Demand minimum of 1 so you're never left with zero runners.

Monthly cost sensitivity (Spot ASG, t3.large):
  Min pool size 1 runner  → ~$18/mo idle
  Min pool size 2 runners → ~$36/mo idle
  Min pool size 4 runners → ~$72/mo idle

Scale in aggressively on nights and weekends (e.g., a scheduled action that drops the minimum to 0 outside working hours) and idle cost can drop by 60–70%.
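
The figures in this section all come from one multiplication, which is easy to rerun with your own numbers. The helper below is a sketch: `idle_pool_cost` is a hypothetical name, the rates are the ones quoted above, and the 50-hour working week is an assumption chosen to illustrate the scheduled scale-in.

```python
HOURS_PER_MONTH = 720  # 30-day month, as in the table above


def idle_pool_cost(pool_size, hourly_rate, active_fraction=1.0):
    """Monthly cost of keeping pool_size runners up for a fraction of the month."""
    return pool_size * hourly_rate * HOURS_PER_MONTH * active_fraction


on_demand = idle_pool_cost(2, 0.0832)   # about 119.81
spot = idle_pool_cost(2, 0.025)         # about 36.00

# Assumption: the pool is only up 10 h/day on 5 weekdays (50 of 168 weekly hours).
spot_work_hours = idle_pool_cost(2, 0.025, active_fraction=50 / 168)  # about 10.71

print(f"{on_demand:.2f} {spot:.2f} {spot_work_hours:.2f}")
```

Under that assumption the pool is up for roughly 30% of the week, which is where a saving of about 70% comes from.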

Section 2 — Push-Based: Events Trigger Runner Creation

How It Works

In the push model, there are no pre-warmed runners. Instead, a GitHub webhook fires when a workflow job enters the queued state. That event triggers AWS infrastructure to spin up a fresh runner just in time for that job. Once the job finishes, the runner is destroyed.

AWS Implementation

1. GitHub Webhook → API Gateway → Lambda

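In your GitHub org settings, create a webhook pointing to an API Gateway URL and subscribe it to workflow_job events. GitHub delivers every workflow_job action (queued, in_progress, completed, and so on), so the Lambda filters for queued itself.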

# orchestrator Lambda
import hashlib
import hmac
import json

import boto3
import requests  # not bundled with the Lambda runtime; package it or use a layer

def get_secret(name):
    # Read a plaintext secret from AWS Secrets Manager
    sm = boto3.client("secretsmanager")
    return sm.get_secret_value(SecretId=name)["SecretString"]

def lambda_handler(event, context):
    # Validate webhook signature (header-name casing varies by API Gateway type)
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    sig = headers.get("x-hub-signature-256", "")
    body = event["body"].encode()
    secret = get_secret("github-webhook-secret")
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return {"statusCode": 401, "body": "Unauthorized"}

    payload = json.loads(body)
    if payload.get("action") != "queued":
        return {"statusCode": 200, "body": "Ignored"}

    # Get a fresh runner registration token
    pat = get_secret("github-pat")
    resp = requests.post(
        "https://api.github.com/orgs/YOUR_ORG/actions/runners/registration-token",
        headers={"Authorization": f"Bearer {pat}"},
        timeout=10,
    )
    resp.raise_for_status()
    reg_token = resp.json()["token"]

    # Launch an ECS Fargate task with the token as an environment variable
    ecs = boto3.client("ecs")
    ecs.run_task(
        cluster="github-runners",
        taskDefinition="github-runner",
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-abc123"],
                "securityGroups": ["sg-abc123"],
                "assignPublicIp": "ENABLED"
            }
        },
        overrides={
            "containerOverrides": [{
                "name": "runner",
                "environment": [
                    {"name": "RUNNER_TOKEN", "value": reg_token},
                    {"name": "GITHUB_URL", "value": "https://github.com/YOUR_ORG"},
                    {"name": "RUNNER_LABELS", "value": "aws,fargate,on-demand"}
                ]
            }]
        }
    )
    return {"statusCode": 200, "body": "Runner launching"}
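
The signature check is the part of the orchestrator most worth testing in isolation, since a mistake there lets anyone launch tasks in your account. A minimal sketch, using only the standard library, could look like this. The function name `verify_signature` and the sample secret are illustrative assumptions.

```python
import hashlib
import hmac


def verify_signature(secret, body, signature_header):
    """Return True if signature_header matches the HMAC-SHA256 of body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels
    return hmac.compare_digest(expected, signature_header or "")


body = b'{"action": "queued"}'
good = "sha256=" + hmac.new(b"s3cret", body, hashlib.sha256).hexdigest()
print(verify_signature(b"s3cret", body, good))          # True
print(verify_signature(b"s3cret", body + b" ", good))   # False
print(verify_signature(b"s3cret", body, None))          # False
```

The HMAC must be computed over the raw request bytes exactly as GitHub sent them; re-serializing the parsed JSON before hashing will generally produce a different digest.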

2. Runner Container (Dockerfile)

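FROM ubuntu:22.04

RUN apt-get update && apt-get install -y curl jq libicu70 ca-certificates \
  && rm -rf /var/lib/apt/lists/*

# config.sh refuses to run as root by default, so create an unprivileged user
RUN useradd -m runner

WORKDIR /runner
RUN curl -o runner.tar.gz -L \
    https://github.com/actions/runner/releases/download/v2.317.0/actions-runner-linux-x64-2.317.0.tar.gz \
  && tar xzf runner.tar.gz && rm runner.tar.gz

COPY entrypoint.sh .
RUN chmod +x entrypoint.sh && chown -R runner:runner /runner

USER runner
ENTRYPOINT ["./entrypoint.sh"]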

#!/bin/bash
# entrypoint.sh — register, run one job, deregister
./config.sh \
  --url "$GITHUB_URL" \
  --token "$RUNNER_TOKEN" \
  --ephemeral \
  --unattended \
  --labels "$RUNNER_LABELS"

./run.sh   # exits after one job (ephemeral flag)

3. ECS Task Definition (key settings)

{
  "family": "github-runner",
  "cpu": "2048",
  "memory": "4096",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "containerDefinitions": [{
    "name": "runner",
    "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/github-runner:latest",
    "essential": true,
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/github-runner",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "runner"
      }
    }
  }]
}

Push Model Trade-offs

Aspect	Detail
Latency	30–90 seconds cold start (ECS task launch + runner registration)
Cost	Near-zero idle cost — you pay only for actual job runtime
Complexity	Higher — webhook, Lambda, ECS task definition, IAM roles
Best for	Teams with sporadic workflows or long gaps between builds
Risk	Cold-start delay can frustrate developers on interactive PRs

Push is the right choice when cost is a priority and your team can tolerate a ~60 second ramp-up. Perfect for scheduled jobs, nightly builds, or release pipelines where a minute doesn't matter.

Push Model — Cost Breakdown

With push, you pay only for the seconds a runner task is actually running. All supporting infrastructure is effectively free at typical job volumes.

ECS Fargate pricing (us-east-1, task: 2 vCPU / 4 GB)

Component	Rate	Notes
vCPU	$0.04048/vCPU/hr	Per-second billing
Memory	$0.004445/GB/hr	Per-second billing
2 vCPU + 4 GB per task	~$0.099/hr	= ~$0.00165/min
Fargate Spot	~$0.030/hr	~70% off, best-effort

Example: same 500 jobs/month, avg 10 min each

Resource	Standard Fargate	Fargate Spot
500 jobs × 10 min runner time	500 × 10 × $0.00165 = $8.25/mo	500 × 10 × $0.00050 = $2.50/mo
API Gateway (500 webhook calls)	~$0.00 (free tier)	~$0.00
Orchestrator Lambda (500 invocations)	~$0.00 (free tier)	~$0.00
ECR image storage (~500 MB)	~$0.05/mo	~$0.05/mo
Total	~$8.30/mo	~$2.55/mo

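Tip: Use Fargate Spot for runner tasks. CI jobs are typically retriable, so the interruption risk is manageable, but a job whose runner is reclaimed mid-run is reported as failed and has to be re-run, so plan for a manual or scripted retry. Fargate Spot gives a two-minute warning before reclaiming a task; set a stopTimeout of 120 seconds (the Fargate maximum) to give in-flight jobs as much time as possible to finish.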

Monthly cost sensitivity (Fargate Spot, 2 vCPU / 4 GB):
  100 jobs × 10 min  →  ~$0.50/mo
  500 jobs × 10 min  →  ~$2.50/mo
  2000 jobs × 10 min →  ~$10.00/mo
  2000 jobs × 20 min →  ~$20.00/mo

Cost scales linearly with actual job runtime — there is no idle cost whatsoever.
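
The same arithmetic can be wrapped in a small helper for trying other task sizes or job volumes. This is a sketch: `fargate_monthly_cost` is a hypothetical name, the rates are the us-east-1 figures from the table above, and `discount=0.7` is only an approximation of Fargate Spot pricing.

```python
VCPU_PER_HOUR = 0.04048   # USD, rate from the table above
GB_PER_HOUR = 0.004445    # USD, rate from the table above


def fargate_monthly_cost(jobs, minutes_per_job, vcpu=2, memory_gb=4, discount=0.0):
    """Monthly Fargate cost for a job volume; discount=0.7 approximates Spot."""
    hourly = vcpu * VCPU_PER_HOUR + memory_gb * GB_PER_HOUR
    hours = jobs * minutes_per_job / 60
    return hours * hourly * (1 - discount)


print(f"{fargate_monthly_cost(500, 10):.2f}")                 # 8.23
print(f"{fargate_monthly_cost(500, 10, discount=0.7):.2f}")   # 2.47
```

The results come out a few cents below the table because the table rounds the per-minute rate up to $0.00165 before multiplying.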

Cost Comparison: Self-Hosted vs GitHub-Hosted

Before committing to either pattern, it's worth knowing how much you'd save over GitHub-hosted runners.

GitHub-hosted runner pricing (as of 2025)

Runner type	Price per minute
`ubuntu-latest` (2-core)	$0.008/min
`ubuntu-latest` (4-core)	$0.016/min
`ubuntu-latest` (8-core)	$0.032/min

Head-to-head: 500 jobs/month × 10 min avg (2-core equivalent)

Option	Monthly Cost	Notes
GitHub-hosted (`ubuntu-latest`)	$40.00	$0.008 × 5,000 min
Pull — EC2 On-Demand (2× t3.large)	~$120.00	Cheaper only at very high volume
Pull — EC2 Spot (2× t3.large)	~$36.00	~10% cheaper than GitHub-hosted
Push — ECS Fargate Standard	~$8.30	79% cheaper than GitHub-hosted
Push — ECS Fargate Spot	~$2.55	94% cheaper than GitHub-hosted

At 2,000 jobs/month × 10 min avg (a busy team)

Option	Monthly Cost	Savings vs GitHub-hosted
GitHub-hosted	$160.00	baseline
Pull — EC2 Spot	~$36.00	~$124/mo saved
Push — Fargate Spot	~$10.00	~$150/mo saved

The pull model only beats GitHub-hosted runners on cost once your job volume is high enough to justify the idle ASG. At low volume, push (Fargate Spot) dominates every alternative.
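
The break-even point for the pull model follows directly from the numbers above: a fixed monthly pool cost divided by what GitHub-hosted runners would charge per job. The helper name `break_even_jobs` is hypothetical, and the calculation deliberately ignores whether the pool has enough capacity for that many jobs.

```python
def break_even_jobs(fixed_monthly_cost, hosted_rate_per_min, minutes_per_job):
    """Jobs per month above which a fixed-cost pool beats per-minute billing."""
    return fixed_monthly_cost / (hosted_rate_per_min * minutes_per_job)


# 2x t3.large Spot pool (~$36/mo) vs 2-core GitHub-hosted at $0.008/min
print(round(break_even_jobs(36.0, 0.008, 10)))     # 450
# Same pool on On-Demand (~$119.81/mo)
print(round(break_even_jobs(119.81, 0.008, 10)))   # 1498
```

So at 10-minute jobs the Spot pool pays for itself at roughly 450 jobs a month, while the On-Demand pool needs about 1,500.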

Additional AWS Costs to Budget

These apply to both models but are typically small:

Service	Cost	When it matters
NAT Gateway	$0.045/hr + $0.045/GB data	Required for private subnet runners egressing to GitHub
ECR (push model)	$0.10/GB/mo stored	One runner image ~500 MB = $0.05/mo
Secrets Manager	$0.40/secret/mo	2–3 secrets (PAT, webhook secret) = ~$1/mo
CloudWatch Logs	$0.50/GB ingested	Scales with job output verbosity

NAT Gateway is typically the only meaningful surprise — if you run runners in private subnets (recommended), budget ~$33/mo for the gateway plus data transfer.

Choosing Between Pull and Push

Criteria	Pull (ASG Pool)	Push (Webhook + ECS)
Job frequency	High (many/hour)	Low (few/day or scheduled)
Latency tolerance	Low (< 10s)	High (60–90s OK)
Cost priority	Secondary	Primary
Setup complexity	Low	Medium–High
Spot/Fargate support	EC2 Spot ASG	Fargate Spot

Security Considerations for Both Models

Regardless of which pattern you choose, apply these controls:

Ephemeral runners only — never reuse a runner across jobs; use --ephemeral flag always.
No long-lived registration tokens — fetch a fresh token per runner via the GitHub API; tokens expire in 1 hour.
IAM least-privilege — runners need only the AWS permissions their jobs require. Use instance profiles (EC2) or task roles (ECS), not static credentials.
Private subnets — runners don't need inbound traffic; place them in private subnets with NAT Gateway egress.
Webhook secret validation — always verify X-Hub-Signature-256 in your push-model Lambda before launching anything.
Runner labels — use labels to route jobs to the right runner type; prevent untrusted forks from targeting self-hosted runners.
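
Label routing also matters on the orchestrator side: workflow_job events arrive for every job in the org, including jobs aimed at GitHub-hosted runners, so it helps to check the requested labels before launching anything. The sketch below assumes the webhook payload carries the job's labels under `workflow_job.labels`; `should_launch_runner` is a hypothetical helper, and the caller is expected to pass in every label its runners will actually register with.

```python
def should_launch_runner(payload, runner_labels):
    """Decide whether a workflow_job webhook should start one of our runners."""
    if payload.get("action") != "queued":
        return False
    requested = set(payload.get("workflow_job", {}).get("labels", []))
    # Self-hosted runners always carry the built-in "self-hosted" label.
    offered = set(runner_labels) | {"self-hosted"}
    # Launch only if our runner would carry every label the job asks for.
    return bool(requested) and requested <= offered


ours = ["aws", "fargate", "on-demand"]
event = {"action": "queued", "workflow_job": {"labels": ["self-hosted", "aws"]}}
hosted = {"action": "queued", "workflow_job": {"labels": ["ubuntu-latest"]}}
print(should_launch_runner(event, ours))    # True
print(should_launch_runner(hosted, ours))   # False
```

Without a check like this, a push-based setup would start (and bill for) a Fargate task for jobs that no self-hosted runner will ever pick up.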

Summary

GitHub self-hosted runners on AWS are a straightforward way to cut CI costs and gain control over your build environment. The pull model gives you instant job pickup at the cost of idle compute. The push model eliminates idle cost but adds cold-start latency and webhook plumbing.

For most teams, the right answer is simple:

Ship frequently? → Pull model with an ASG.
Ship occasionally or on a schedule? → Push model with ECS Fargate.
Both? → A small idle pool (pull) plus push-based overflow handles burst without burning money on nights and weekends.
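
For the hybrid option, the overflow rule can be as simple as launching push-based runners only for jobs the idle pool cannot absorb. The following is a sketch under that assumption; `overflow_runners_needed` and the `max_overflow` cap are hypothetical and not part of either implementation above.

```python
def overflow_runners_needed(queued_jobs, idle_runners, max_overflow):
    """Number of push-launched runners to add on top of the idle pool."""
    shortfall = max(0, queued_jobs - idle_runners)
    return min(shortfall, max_overflow)   # cap the burst to bound cost


print(overflow_runners_needed(queued_jobs=1, idle_runners=2, max_overflow=10))    # 0
print(overflow_runners_needed(queued_jobs=5, idle_runners=2, max_overflow=10))    # 3
print(overflow_runners_needed(queued_jobs=50, idle_runners=2, max_overflow=10))   # 10
```

The cap is a design choice: it trades a longer queue during extreme bursts for a predictable upper bound on concurrent Fargate tasks.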

Either way, keep runners ephemeral, validate your webhooks, and apply least-privilege IAM — and you'll have a CI setup that scales cleanly without the GitHub-hosted bill.
