GitHub Self-Hosted Runners on AWS: Pull vs Push for On-Demand Scaling
Cut your CI costs by running GitHub Actions on AWS only when you need them. Compare pull-based (polling) and push-based (event-driven) architectures for spinning up on-demand self-hosted runners.
GitHub-hosted runners are convenient, but they get expensive fast. Once your team ships dozens of workflows a day — or your builds take 20+ minutes — the per-minute billing stacks up. The alternative: run your own runners on AWS and only pay for the compute you actually use.
The catch is how you spin runners up. There are two fundamentally different approaches: pull (runners poll GitHub for jobs) and push (GitHub events trigger runner creation). Each has real trade-offs in latency, complexity, and cost. This post walks through both so you can pick the right one for your setup.
Background: How GitHub Self-Hosted Runners Work
Before comparing patterns, it helps to know what a self-hosted runner actually does:
- A runner process registers itself with GitHub using a registration token.
- It opens a long-poll connection to api.github.com and waits for jobs.
- When a job is dispatched, the runner claims it, executes the steps, and reports results.
- After the job finishes, an ephemeral runner exits; a persistent runner loops back to waiting.
The key detail: GitHub does not push jobs to runners — runners pull them. This means even "push-based" architectures in this post refer to how you create the runner process, not how jobs are dispatched.
Section 1 — Pull-Based: Runners Poll GitHub for Jobs
How It Works
In the pull model, you pre-provision a pool of runner processes on AWS. Each runner is always-on (or warmed up ahead of time) and continuously long-polls GitHub for available jobs. As soon as a workflow triggers, one of the idle runners picks it up immediately.
AWS Implementation
The most common pull setup uses an Auto Scaling Group (ASG) of EC2 instances, each running the GitHub Actions runner binary.
1. Launch Template (User Data)
#!/bin/bash
set -euo pipefail
# Install runner
mkdir -p /home/ec2-user/actions-runner && cd /home/ec2-user/actions-runner
curl -o runner.tar.gz -L https://github.com/actions/runner/releases/download/v2.317.0/actions-runner-linux-x64-2.317.0.tar.gz
tar xzf runner.tar.gz
chown -R ec2-user:ec2-user /home/ec2-user/actions-runner
# Fetch registration token from SSM (stored by your bootstrap Lambda; tokens expire after 1 hour, so it must be refreshed)
TOKEN=$(aws ssm get-parameter --name /github/runner-token --with-decryption --query Parameter.Value --output text)
# Register as ephemeral runner.
# User data runs as root and config.sh refuses to run as root by default, so run it as ec2-user.
sudo -u ec2-user ./config.sh \
  --url https://github.com/YOUR_ORG \
  --token "$TOKEN" \
  --ephemeral \
  --unattended \
  --labels aws,on-demand,production
# Run and exit when job completes (ephemeral)
sudo -u ec2-user ./run.sh
2. Auto Scaling Policy
Scale the ASG on a custom CloudWatch metric, the number of idle runners, published by a lightweight polling Lambda:
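import boto3
import requests  # not in the Lambda runtime by default; bundle it or use a layer

def get_secret(name):
    # Read a plaintext secret from AWS Secrets Manager
    sm = boto3.client("secretsmanager")
    return sm.get_secret_value(SecretId=name)["SecretString"]

def lambda_handler(event, context):
    token = get_secret("github-pat")
    headers = {"Authorization": f"Bearer {token}"}
    # List the org's self-hosted runners (first page only; paginate beyond 100 runners)
    resp = requests.get(
        "https://api.github.com/orgs/YOUR_ORG/actions/runners",
        headers=headers,
        params={"per_page": 100},
        timeout=10,
    )
    resp.raise_for_status()
    runners = resp.json().get("runners", [])
    # Idle = online but not currently running a job
    idle = sum(1 for r in runners if r["status"] == "online" and not r["busy"])
    # Publish metric
    cw = boto3.client("cloudwatch")
    cw.put_metric_data(
        Namespace="GitHub/Runners",
        MetricData=[{"MetricName": "IdleRunners", "Value": idle, "Unit": "Count"}]
    )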
Then wire a scale-in policy on IdleRunners > desired_buffer and scale-out on IdleRunners < 1.
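The two thresholds can be read as a single decision rule. The sketch below is only an illustration of that rule as a pure function: the name `desired_capacity` and its parameters are hypothetical, and a real ASG would apply the same logic through CloudWatch alarms and scaling policies rather than application code.

```python
def desired_capacity(current, idle, buffer=1, min_size=0, max_size=10):
    """Return the pool size implied by the IdleRunners thresholds.

    Scale out by one runner when nothing is idle; scale in by the surplus
    when more than `buffer` runners are idle. All names are illustrative.
    """
    if idle < 1:
        target = current + 1               # scale-out: IdleRunners < 1
    elif idle > buffer:
        target = current - (idle - buffer)  # scale-in: IdleRunners > buffer
    else:
        target = current                    # inside the buffer: no change
    # Never leave the configured ASG bounds
    return max(min_size, min(max_size, target))


if __name__ == "__main__":
    print(desired_capacity(current=2, idle=0))
    print(desired_capacity(current=5, idle=4, buffer=1))
```

When scaling in, a real pool would also need instance scale-in protection (or a similar guard) so the ASG does not terminate an instance that is in the middle of a job.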
Pull Model Trade-offs
| Aspect | Detail |
|---|---|
| Latency | Near-zero — idle runners claim jobs in seconds |
| Cost | Higher idle cost; you pay for runners waiting for work |
| Complexity | Low — just an ASG + User Data script |
| Best for | High-throughput teams with frequent, unpredictable job bursts |
| Risk | Idle runners accumulate cost on quiet nights/weekends |
Pull is the right choice when your team ships code constantly and cold-start latency (even 60 seconds) would break developer flow. The idle cost is worth the instant feedback.
Pull Model — Cost Breakdown
The dominant cost driver is idle EC2 time. Runners sitting in your ASG waiting for jobs still generate an hourly bill.
Example: team running 500 jobs/month, avg 10 min each
| Resource | On-Demand | EC2 Spot (~70% off) |
|---|---|---|
| 2× t3.large idle 24/7 (2 vCPU, 8 GB) | $0.0832/hr × 2 × 720 hr = $119.81/mo | ~$0.025/hr × 2 × 720 hr = $36/mo |
| CloudWatch custom metric (idle runner count) | ~$0.30/mo | ~$0.30/mo |
| Polling Lambda (500 invocations/mo) | Free tier | Free tier |
| Total | ~$120/mo | ~$36/mo |
Tip: Using EC2 Spot for your ASG is the single biggest lever. With Spot, idle runners cost ~$0.025/hr (t3.large us-east-1) vs $0.0832/hr on-demand. Just configure a mixed-instance policy with a fallback On-Demand minimum of 1 so you're never left with zero runners.
Monthly cost sensitivity (Spot ASG, t3.large):
Min pool size 1 runner → ~$18/mo idle
Min pool size 2 runners → ~$36/mo idle
Min pool size 4 runners → ~$72/mo idle
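The idle-cost figures above are just rate × pool size × hours. A minimal sketch of that arithmetic, assuming a 720-hour month and the t3.large rates quoted in this post (the function name is illustrative):

```python
HOURS_PER_MONTH = 720  # 30-day month, as used in the tables above


def pull_idle_cost(hourly_rate, pool_size, hours=HOURS_PER_MONTH):
    """Monthly cost of keeping `pool_size` runners up for `hours` hours."""
    return hourly_rate * pool_size * hours


if __name__ == "__main__":
    for size in (1, 2, 4):
        # Spot rate quoted in the post for t3.large in us-east-1
        print(size, round(pull_idle_cost(0.025, size), 2))
    # On-Demand rate quoted in the post
    print(round(pull_idle_cost(0.0832, 2), 2))
```

After rounding, this reproduces the $18, $36, and $72 Spot figures and the $119.81 On-Demand figure.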
Scale in aggressively on nights and weekends (e.g., a scheduled action that drops the minimum to 0 outside working hours) and idle cost can drop by 60–70%.
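To see where a 60–70% reduction comes from, compare working hours with the 168 hours in a week. The sketch below does that arithmetic and also builds keyword arguments for two `put_scheduled_update_group_action` calls. The ASG name, action names, and cron expressions are hypothetical, and `Recurrence` is evaluated in UTC unless a time zone is set, so treat this as a starting point rather than a drop-in configuration.

```python
HOURS_PER_WEEK = 7 * 24


def idle_cost_reduction(hours_per_day, days_per_week):
    """Fraction of idle cost avoided if the pool only runs during working hours."""
    active = hours_per_day * days_per_week
    return 1 - active / HOURS_PER_WEEK


def scheduled_actions(asg_name, min_size, start_cron, stop_cron):
    """Keyword arguments for two autoscaling put_scheduled_update_group_action calls."""
    return [
        {   # Morning: restore the minimum pool size
            "AutoScalingGroupName": asg_name,
            "ScheduledActionName": "workday-start",  # hypothetical name
            "Recurrence": start_cron,
            "MinSize": min_size,
        },
        {   # Evening: allow the pool to drain to zero
            "AutoScalingGroupName": asg_name,
            "ScheduledActionName": "workday-end",    # hypothetical name
            "Recurrence": stop_cron,
            "MinSize": 0,
            "DesiredCapacity": 0,
        },
    ]


if __name__ == "__main__":
    print(round(idle_cost_reduction(10, 5) * 100), "% saved with 10h x 5d")
    print(round(idle_cost_reduction(12, 5) * 100), "% saved with 12h x 5d")
    for action in scheduled_actions("github-runners-asg", 2,
                                    "0 8 * * 1-5", "0 18 * * 1-5"):
        print(action)
```

Each dictionary could be passed as `boto3.client("autoscaling").put_scheduled_update_group_action(**action)`.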
Section 2 — Push-Based: Events Trigger Runner Creation
How It Works
In the push model, there are no pre-warmed runners. Instead, a GitHub webhook fires when a workflow job enters the queued state. That event triggers AWS infrastructure to spin up a fresh runner just in time for that job. Once the job finishes, the runner is destroyed.
AWS Implementation
1. GitHub Webhook → API Gateway → Lambda
In your GitHub org settings, create a webhook pointing to an API Gateway URL, listening for workflow_job events with action queued.
# orchestrator Lambda
import hashlib
import hmac
import json

import boto3
import requests  # not in the Lambda runtime by default; bundle it or use a layer

def get_secret(name):
    # Read a plaintext secret from AWS Secrets Manager
    sm = boto3.client("secretsmanager")
    return sm.get_secret_value(SecretId=name)["SecretString"]

def lambda_handler(event, context):
    # Validate webhook signature (API Gateway may lowercase header names)
    headers = {k.lower(): v for k, v in event["headers"].items()}
    sig = headers.get("x-hub-signature-256", "")
    body = event["body"].encode()
    secret = get_secret("github-webhook-secret")
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return {"statusCode": 401, "body": "Unauthorized"}
    payload = json.loads(event["body"])
    if payload.get("action") != "queued":
        return {"statusCode": 200, "body": "Ignored"}
    # Get a fresh runner registration token
    pat = get_secret("github-pat")
    resp = requests.post(
        "https://api.github.com/orgs/YOUR_ORG/actions/runners/registration-token",
        headers={"Authorization": f"Bearer {pat}"},
        timeout=10,
    )
    resp.raise_for_status()
    reg_token = resp.json()["token"]
    # Launch an ECS Fargate task with the token as an environment variable
    ecs = boto3.client("ecs")
    ecs.run_task(
        cluster="github-runners",
        taskDefinition="github-runner",
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-abc123"],
                "securityGroups": ["sg-abc123"],
                "assignPublicIp": "ENABLED"
            }
        },
        overrides={
            "containerOverrides": [{
                "name": "runner",
                "environment": [
                    {"name": "RUNNER_TOKEN", "value": reg_token},
                    {"name": "GITHUB_URL", "value": "https://github.com/YOUR_ORG"},
                    {"name": "RUNNER_LABELS", "value": "aws,fargate,on-demand"}
                ]
            }]
        }
    )
    return {"statusCode": 200, "body": "Runner launching"}
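The handler above launches a task for every queued job, including jobs that target GitHub-hosted runners. One way to avoid that is to check the labels the job requested before launching anything. The sketch below assumes the `workflow_job` payload carries the requested labels in `workflow_job.labels`. The function name and the label set are illustrative, and the comparison is case-insensitive by choice.

```python
# Labels the Fargate runner will register with. The first three are the
# defaults a Linux x64 self-hosted runner normally gets; adjust to your setup.
RUNNER_LABELS = {"self-hosted", "linux", "x64", "aws", "fargate", "on-demand"}


def should_launch_runner(payload, runner_labels=RUNNER_LABELS):
    """Return True only for queued jobs whose labels this runner type can satisfy."""
    if payload.get("action") != "queued":
        return False
    requested = {label.lower()
                 for label in payload.get("workflow_job", {}).get("labels", [])}
    if not requested:
        return False
    available = {label.lower() for label in runner_labels}
    # Every requested label must be offered by the runner
    return requested <= available


if __name__ == "__main__":
    job = {"action": "queued",
           "workflow_job": {"labels": ["self-hosted", "aws", "fargate"]}}
    print(should_launch_runner(job))
```

Calling this right after the signature check lets the handler return early for jobs it should not serve.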
2. Runner Container (Dockerfile)
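FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl jq libicu70 \
    && rm -rf /var/lib/apt/lists/*
# config.sh refuses to run as root by default, so create an unprivileged user
RUN useradd -m runner
WORKDIR /runner
RUN curl -o runner.tar.gz -L \
    https://github.com/actions/runner/releases/download/v2.317.0/actions-runner-linux-x64-2.317.0.tar.gz \
    && tar xzf runner.tar.gz && rm runner.tar.gz \
    && chown -R runner:runner /runner
COPY entrypoint.sh .
RUN chmod +x entrypoint.sh
USER runner
ENTRYPOINT ["./entrypoint.sh"]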
#!/bin/bash
# entrypoint.sh: register, run one job, then exit (ephemeral runners deregister themselves)
set -euo pipefail
./config.sh \
  --url "$GITHUB_URL" \
  --token "$RUNNER_TOKEN" \
  --ephemeral \
  --unattended \
  --labels "$RUNNER_LABELS"
./run.sh # exits after one job (ephemeral flag)
3. ECS Task Definition (key settings)
{
  "family": "github-runner",
  "cpu": "2048",
  "memory": "4096",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "containerDefinitions": [{
    "name": "runner",
    "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/github-runner:latest",
    "essential": true,
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/github-runner",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "runner"
      }
    }
  }]
}
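Fargate accepts only certain `cpu`/`memory` pairs, so it can help to check a task definition before registering it. The sketch below hard-codes the combinations as I understand them from the AWS documentation (CPU units and MiB). Treat the table as an assumption and confirm it against the current docs. The function name is illustrative.

```python
import json

# Assumed valid Fargate sizes: CPU units -> allowed memory values in MiB
FARGATE_MEMORY = {
    256: (512, 1024, 2048),
    512: range(1024, 4096 + 1, 1024),
    1024: range(2048, 8192 + 1, 1024),
    2048: range(4096, 16384 + 1, 1024),
    4096: range(8192, 30720 + 1, 1024),
    8192: range(16384, 61440 + 1, 4096),
    16384: range(32768, 122880 + 1, 8192),
}


def is_valid_fargate_size(cpu, memory):
    """Check a task-level cpu/memory pair (strings or ints) against the table."""
    return int(memory) in FARGATE_MEMORY.get(int(cpu), ())


if __name__ == "__main__":
    task_def = json.loads('{"family": "github-runner", "cpu": "2048", "memory": "4096"}')
    print(is_valid_fargate_size(task_def["cpu"], task_def["memory"]))
```

The 2 vCPU / 4 GB size used in this post sits at the bottom of the 2048-CPU range, so heavier builds can raise memory up to 16 GB without changing the vCPU count.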
Push Model Trade-offs
| Aspect | Detail |
|---|---|
| Latency | 30–90 seconds cold start (ECS task launch + runner registration) |
| Cost | Near-zero idle cost — you pay only for actual job runtime |
| Complexity | Higher — webhook, Lambda, ECS task definition, IAM roles |
| Best for | Teams with sporadic workflows or long gaps between builds |
| Risk | Cold-start delay can frustrate developers on interactive PRs |
Push is the right choice when cost is a priority and your team can tolerate a ~60 second ramp-up. Perfect for scheduled jobs, nightly builds, or release pipelines where a minute doesn't matter.
Push Model — Cost Breakdown
With push, you pay only for the seconds a runner task is actually running. All supporting infrastructure is effectively free at typical job volumes.
ECS Fargate pricing (us-east-1, task: 2 vCPU / 4 GB)
| Component | Rate | Notes |
|---|---|---|
| vCPU | $0.04048/vCPU/hr | Per-second billing |
| Memory | $0.004445/GB/hr | Per-second billing |
| 2 vCPU + 4 GB per task | ~$0.099/hr | = ~$0.00165/min |
| Fargate Spot | ~$0.030/hr | ~70% off, best-effort |
Example: same 500 jobs/month, avg 10 min each
| Resource | Standard Fargate | Fargate Spot |
|---|---|---|
| 500 jobs × 10 min runner time | 500 × 10 × $0.00165 = $8.25/mo | 500 × 10 × $0.00050 = $2.50/mo |
| API Gateway (500 webhook calls) | ~$0.00 (free tier) | ~$0.00 |
| Orchestrator Lambda (500 invocations) | ~$0.00 (free tier) | ~$0.00 |
| ECR image storage (~500 MB) | ~$0.05/mo | ~$0.05/mo |
| Total | ~$8.30/mo | ~$2.55/mo |
Tip: Use Fargate Spot for runner tasks. A Spot interruption mid-job fails that job, and GitHub does not re-queue it automatically, so this works best when jobs are safe to re-run. Set a stopTimeout of 120 seconds (the Fargate maximum) so the runner can use the full two-minute interruption warning to shut down cleanly.
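Fargate Spot is selected per task through a capacity provider strategy rather than `launchType`, and the two cannot be combined in one `run_task` call. A minimal sketch of building the arguments either way is below. The function name is hypothetical, and it assumes the cluster already has the `FARGATE_SPOT` capacity provider associated.

```python
def run_task_kwargs(cluster, task_definition, subnets, security_groups,
                    use_spot=True):
    """Build keyword arguments for ecs.run_task, with or without Fargate Spot."""
    kwargs = {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
                "assignPublicIp": "ENABLED",
            }
        },
    }
    if use_spot:
        # launchType must be omitted when a capacity provider strategy is given
        kwargs["capacityProviderStrategy"] = [
            {"capacityProvider": "FARGATE_SPOT", "weight": 1}
        ]
    else:
        kwargs["launchType"] = "FARGATE"
    return kwargs


if __name__ == "__main__":
    print(run_task_kwargs("github-runners", "github-runner",
                          ["subnet-abc123"], ["sg-abc123"]))
```

The result can be passed as `ecs.run_task(**kwargs, overrides=...)` in the orchestrator Lambda.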
Monthly cost sensitivity (Fargate Spot, 2 vCPU / 4 GB):
100 jobs × 10 min → ~$0.50/mo
500 jobs × 10 min → ~$2.50/mo
2000 jobs × 10 min → ~$10.00/mo
2000 jobs × 20 min → ~$20.00/mo
Cost scales linearly with actual job runtime — there is no idle cost whatsoever.
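The Fargate figures can be reproduced from the per-vCPU and per-GB rates. This is a small sketch using the us-east-1 rates quoted above. Function names are illustrative, and the supporting services are left out because they round to zero at these volumes.

```python
VCPU_RATE = 0.04048    # $/vCPU/hr, as quoted in the pricing table
MEMORY_RATE = 0.004445  # $/GB/hr, as quoted in the pricing table


def task_hourly_rate(vcpu, memory_gb):
    """Hourly price of one Fargate task of the given size."""
    return vcpu * VCPU_RATE + memory_gb * MEMORY_RATE


def push_monthly_cost(jobs, minutes_per_job, hourly_rate):
    """Runner cost for a month: total job hours times the task's hourly rate."""
    return jobs * minutes_per_job / 60 * hourly_rate


if __name__ == "__main__":
    standard = task_hourly_rate(2, 4)
    print(round(standard, 5))
    print(round(push_monthly_cost(500, 10, standard), 2))
    print(round(push_monthly_cost(500, 10, 0.030), 2))  # Fargate Spot rate from the table
```

Using the unrounded hourly rate gives about $8.23 for the standard case. The table's $8.25 comes from rounding the per-minute rate to $0.00165 first.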
Cost Comparison: Self-Hosted vs GitHub-Hosted
Before committing to either pattern, it's worth knowing how much you'd save over GitHub-hosted runners.
GitHub-hosted runner pricing (as of 2025)
| Runner type | Price per minute |
|---|---|
| ubuntu-latest (2-core) | $0.008/min |
| ubuntu-latest (4-core) | $0.016/min |
| ubuntu-latest (8-core) | $0.032/min |
Head-to-head: 500 jobs/month × 10 min avg (2-core equivalent)
| Option | Monthly Cost | Notes |
|---|---|---|
| GitHub-hosted (ubuntu-latest) | $40.00 | $0.008 × 5,000 min |
| Pull — EC2 On-Demand (2× t3.large) | ~$120.00 | Cheaper only at very high volume |
| Pull — EC2 Spot (2× t3.large) | ~$36.00 | ~10% cheaper than GitHub-hosted |
| Push — ECS Fargate Standard | ~$8.30 | 79% cheaper than GitHub-hosted |
| Push — ECS Fargate Spot | ~$2.55 | 94% cheaper than GitHub-hosted |
At 2,000 jobs/month × 10 min avg (a busy team)
| Option | Monthly Cost | Savings vs GitHub-hosted |
|---|---|---|
| GitHub-hosted | $160.00 | baseline |
| Pull — EC2 Spot | ~$36.00 | ~$124/mo saved |
| Push — Fargate Spot | ~$10.00 | ~$150/mo saved |
The pull model only beats GitHub-hosted runners on cost once your job volume is high enough to justify the idle ASG. At low volume, push (Fargate Spot) dominates every alternative.
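The break-even point follows directly from the numbers above: divide the pool's fixed monthly cost by the GitHub-hosted per-minute rate. The sketch below assumes the pool size stays fixed as volume grows, which only holds while two runners can absorb the load. The function name is illustrative.

```python
def break_even_minutes(fixed_monthly_cost, hosted_rate_per_min=0.008):
    """Job minutes per month at which a fixed-cost pool matches GitHub-hosted pricing."""
    return fixed_monthly_cost / hosted_rate_per_min


if __name__ == "__main__":
    for label, cost in (("Spot pool", 36.0), ("On-Demand pool", 120.0)):
        minutes = break_even_minutes(cost)
        # Express the threshold in 10-minute jobs, as in the examples above
        print(label, round(minutes), "min =", round(minutes / 10), "jobs of 10 min")
```

With this post's figures, the Spot pool breaks even at roughly 4,500 job minutes (about 450 ten-minute jobs) and the On-Demand pool at roughly 15,000 (about 1,500 jobs). This is consistent with both comparison tables.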
Additional AWS Costs to Budget
These apply to both models but are typically small:
| Service | Cost | When it matters |
|---|---|---|
| NAT Gateway | $0.045/hr + $0.045/GB data | Required for private subnet runners egressing to GitHub |
| ECR (push model) | $0.10/GB/mo stored | One runner image ~500 MB = $0.05/mo |
| Secrets Manager | $0.40/secret/mo | 2–3 secrets (PAT, webhook secret) = ~$1/mo |
| CloudWatch Logs | $0.50/GB ingested | Scales with job output verbosity |
NAT Gateway is typically the only meaningful surprise — if you run runners in private subnets (recommended), budget ~$33/mo for the gateway plus data transfer.
Choosing Between Pull and Push
| Criteria | Pull (ASG Pool) | Push (Webhook + ECS) |
|---|---|---|
| Job frequency | High (many/hour) | Low (few/day or scheduled) |
| Latency tolerance | Low (< 10s) | High (60–90s OK) |
| Cost priority | Secondary | Primary |
| Setup complexity | Low | Medium–High |
| Spot/Fargate support | EC2 Spot ASG | Fargate Spot |
Security Considerations for Both Models
Regardless of which pattern you choose, apply these controls:
- Ephemeral runners only: never reuse a runner across jobs; always pass the --ephemeral flag.
- No long-lived registration tokens: fetch a fresh token per runner via the GitHub API; tokens expire in 1 hour.
- IAM least-privilege: runners need only the AWS permissions their jobs require. Use instance profiles (EC2) or task roles (ECS), not static credentials.
- Private subnets: runners don't need inbound traffic; place them in private subnets with NAT Gateway egress.
- Webhook secret validation: always verify X-Hub-Signature-256 in your push-model Lambda before launching anything.
- Runner labels: use labels to route jobs to the right runner type; prevent untrusted forks from targeting self-hosted runners.
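The webhook check is easier to test when it is isolated from the Lambda handler. The sketch below is one way to do that, assuming an API Gateway proxy event. It looks up the header case-insensitively and honors `isBase64Encoded`, since the HMAC must be computed over the exact bytes GitHub sent. Function names are illustrative.

```python
import base64
import hashlib
import hmac


def extract_signed_body(event):
    """Return (raw body bytes, signature header) from an API Gateway proxy event."""
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    raw = event.get("body") or ""
    body = base64.b64decode(raw) if event.get("isBase64Encoded") else raw.encode()
    return body, headers.get("x-hub-signature-256", "")


def verify_github_signature(secret, body, signature_header):
    """Constant-time comparison of the received and expected HMAC-SHA256."""
    expected = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest((signature_header or "").encode(), expected.encode())


if __name__ == "__main__":
    secret = "example-secret"  # placeholder; load the real one from Secrets Manager
    payload = '{"action": "queued"}'
    good_sig = "sha256=" + hmac.new(secret.encode(), payload.encode(),
                                    hashlib.sha256).hexdigest()
    event = {"headers": {"x-hub-signature-256": good_sig}, "body": payload}
    body, sig = extract_signed_body(event)
    print(verify_github_signature(secret, body, sig))
```

Comparing bytes rather than strings keeps `hmac.compare_digest` from raising on a header that contains non-ASCII characters.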
Summary
GitHub self-hosted runners on AWS are a straightforward way to cut CI costs and gain control over your build environment. The pull model gives you instant job pickup at the cost of idle compute. The push model eliminates idle cost but adds cold-start latency and webhook plumbing.
For most teams, the right answer is simple:
- Ship frequently? → Pull model with an ASG.
- Ship occasionally or on a schedule? → Push model with ECS Fargate.
- Both? → A small idle pool (pull) plus push-based overflow handles burst without burning money on nights and weekends.
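For the hybrid option, the overflow decision reduces to comparing queued jobs with idle pool runners. The sketch below is only an illustration of that idea: the function name and the `max_overflow` cap are hypothetical. It also assumes pool and overflow runners share the labels the jobs request, so either can pick up the work.

```python
def overflow_tasks_needed(queued_jobs, idle_pool_runners, max_overflow=20):
    """Number of push-model runners to launch beyond what the idle pool can absorb."""
    shortfall = queued_jobs - idle_pool_runners
    # Never negative, and capped so a burst cannot launch unbounded tasks
    return max(0, min(max_overflow, shortfall))


if __name__ == "__main__":
    print(overflow_tasks_needed(queued_jobs=1, idle_pool_runners=2))
    print(overflow_tasks_needed(queued_jobs=5, idle_pool_runners=2))
```

Because GitHub hands a job to whichever matching runner polls first, the counts can drift briefly. An occasional extra overflow runner is the price of not leaving a job waiting.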
Either way, keep runners ephemeral, validate your webhooks, and apply least-privilege IAM — and you'll have a CI setup that scales cleanly without the GitHub-hosted bill.