GitHub Actions Timeout, Permission Denied & Runner Offline: Complete Troubleshooting Guide
Fix GitHub Actions timeout, out-of-memory, permission denied, and runner offline errors with step-by-step commands and YAML fixes.
- Job timeouts are caused by hung processes, missing -y flags, or no explicit timeout-minutes set—the default is 6 hours for GitHub-hosted runners
- Out-of-memory kills produce exit code 137 or a silent SIGKILL; fix by raising NODE_OPTIONS/JVM heap, splitting jobs, or upgrading to a larger runner
- Permission denied errors stem from insufficient GITHUB_TOKEN scopes—add an explicit permissions block to the workflow or job
- Runner offline means the self-hosted runner service crashed, the registration token expired, or the host lost network connectivity to GitHub
- Enable ACTIONS_RUNNER_DEBUG=true and ACTIONS_STEP_DEBUG=true secrets for verbose logs when the root cause is unclear
| Method | When to Use | Time to Apply | Risk |
|---|---|---|---|
| Set explicit timeout-minutes | Job hangs indefinitely or exceeds 6 h default | < 2 min | Low — only cancels jobs sooner |
| Add npm ci / apt-get -y flags | Hanging on package install prompts | < 5 min | Low — idempotent change |
| Add actions/cache | Slow builds due to repeated dependency downloads | 15–30 min | Low — cache miss falls back to fresh install |
| Set NODE_OPTIONS / MAVEN_OPTS | Node or JVM OOM kill (exit 137) | < 5 min | Low — tuning only |
| Upgrade to larger GitHub-hosted runner | Legitimate memory ceiling on standard runners | < 5 min | Medium — cost increase |
| Add permissions block to workflow | Resource not accessible by integration | < 5 min | Low — additive change |
| Store PAT as secret, use in checkout | Cross-repo push or advanced scopes needed | 10–20 min | Medium — PAT must be rotated |
| Restart self-hosted runner service | Runner shows Offline in repo settings | 5 min | Low — service restart |
| Re-register self-hosted runner | Runner token expired or runner corrupted | 10 min | Low — old runner entry is removed |
| Enable tmate debug session | Cannot reproduce failure locally | 15 min | Low — SSH session is ephemeral |
Understanding GitHub Actions Failures: Timeout, OOM, Permission Denied & More
GitHub Actions workflows fail for a surprisingly small set of root causes. Learning to read the error signature quickly saves hours of trial-and-error. This guide maps each error message to a concrete fix.
Recognise the Error Before You Fix It
Each failure class has a distinct fingerprint in the job logs:
Timeout:
##[error]The job running on runner GitHub Actions X has exceeded the maximum execution time of 360 minutes.
Error: The operation was canceled.
Out of Memory (OOM):
Killed
Process completed with exit code 137.
or a sudden silent job cancellation with no error text—the Linux OOM killer sent SIGKILL.
Permission Denied (GITHUB_TOKEN):
Error: Resource not accessible by integration
remote: Permission to org/repo.git denied to github-actions[bot].
Error: HttpError: GitHub Actions is not permitted to create or approve pull requests.
Runner Offline:
No runners are available to run the requested job.
##[error]This request has been automatically failed because there are no available runners online to process the request.
General failure (catch-all):
Process completed with exit code 1.
##[error]The process '/usr/bin/git' failed with exit code 128.
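These signatures can be matched mechanically when triaging many runs. A minimal sketch, assuming you have saved the raw log to a file (`classify_failure` is a hypothetical helper name; the grep patterns are taken from the messages listed above):

```shell
#!/usr/bin/env bash
# Classify a failed run from its saved raw log by matching the error
# fingerprints above. Feed it a log saved via:
#   gh run view <RUN_ID> --log-failed > run.log
classify_failure() {
  local log="$1"
  if grep -q "exceeded the maximum execution time" "$log"; then
    echo "timeout"
  elif grep -q "exit code 137" "$log"; then
    echo "oom"
  elif grep -q "Resource not accessible by integration" "$log"; then
    echo "permission"
  elif grep -q "no available runners" "$log"; then
    echo "runner-offline"
  else
    echo "unknown"
  fi
}
```

Running `classify_failure run.log` prints one of the five failure classes, which you can then map to the matching fix section below.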
Step 1: Diagnose the Root Cause
Read the Raw Logs
The GitHub Actions UI collapses log lines. Always click gear icon → View raw logs on a failed run to see the untruncated output. For programmatic access:
gh run view <RUN_ID> --log-failed
Identify Which Timeout Layer Fired
GitHub Actions has two independent timeout controls:
- Job-level timeout-minutes: defaults to 360 minutes on GitHub-hosted runners; unlimited on self-hosted runners.
- Step-level timeout-minutes: no default; steps run until the job timeout kills them.
If you see the 360-minute message, the job-level default fired. If a specific step message appears, a step-level timeout was set. Either way, trace back to the last log line before cancellation—that is the hanging operation.
Confirm OOM vs Timeout
OOM exit code is 137 (128 + 9 = SIGKILL). Timeout cancellation typically shows the ##[error] timeout message. If the job vanishes with exit 137 and no timeout message, it is OOM.
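You can reproduce the 137 arithmetic locally; the child shell below kills itself with SIGKILL, which is exactly what the Linux OOM killer does to a build process:

```shell
# A process killed by SIGKILL (signal 9) exits with 128 + 9 = 137,
# the same code a runner reports after an OOM kill.
code=0
sh -c 'kill -9 $$' || code=$?   # child shell sends SIGKILL to itself
echo "exit code: $code"          # prints: exit code: 137
```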
Add a pre-flight memory check to any suspect job:
- name: Runner memory profile
run: free -h && df -h && nproc
GitHub-hosted runner memory limits:
- ubuntu-latest / ubuntu-22.04: 7 GB RAM, 2 vCPU, 14 GB SSD
- windows-latest: 16 GB RAM, 2 vCPU
- macos-latest: 14 GB RAM, 3 vCPU
Confirm Permission Scope
The GITHUB_TOKEN is minted per workflow run with the minimum scope unless you override it. The repository-level default is Read and write or Read-only depending on Settings → Actions → General → Workflow permissions. Check your current effective scope:
gh api /repos/{owner}/{repo}/actions/permissions
Check Runner Status
For self-hosted runners navigate to Settings → Actions → Runners (repo level) or Organization Settings → Actions → Runners. A runner showing Offline needs attention; Idle means the runner is connected and waiting for work, though a runner stuck at Idle while jobs queue usually indicates a label mismatch.
Step 2: Fix Timeout Errors
Set an Explicit Job Timeout
Never rely on the 6-hour default for short jobs. Setting a tight timeout catches regressions early:
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- name: Install dependencies
timeout-minutes: 5
run: npm ci
- name: Run tests
timeout-minutes: 12
run: npm test
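Inside a single `run:` script, the coreutils `timeout` command gives finer-grained control than step-level `timeout-minutes` and lets you log a clear message before the step fails. A sketch, with the hung install simulated by `sleep`:

```shell
# timeout kills the command when the deadline passes and exits with 124.
# `sleep 10` stands in for a command that hangs (e.g. an install waiting
# on an interactive prompt); in a real step you would wrap npm ci instead.
code=0
timeout 2 sleep 10 || code=$?
if [ "$code" -eq 124 ]; then
  echo "command hung: killed after 2s (exit $code)"
fi
```

Exit code 124 is `timeout`'s own signal that the deadline fired, so you can distinguish "hung" from "failed" in the step output.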
Eliminate Hanging Processes
Common causes of indefinite hangs:
- Interactive prompts: apt-get, pip, or brew waiting for confirmation. Fix: always pass -y / --yes / --non-interactive.
- Servers that never bind: a test suite starts a dev server that fails to listen. Fix: use wait-on or a health-check loop with a timeout.
- npm install vs npm ci: npm install can stall on registry issues. Fix: use npm ci, which is deterministic and faster.
- Deadlocked processes: a build tool waits on a lock file held by a previous cancelled run. Fix: add a cache-busting step or clean the workspace.
- name: Install system deps
run: sudo apt-get update && sudo apt-get install -y build-essential curl
- name: Install node deps
run: npm ci --prefer-offline
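The health-check loop mentioned above can be written once and reused across steps. A sketch in bash (`wait_for` is a hypothetical helper name, not a built-in):

```shell
# wait_for: poll a command until it succeeds or a deadline passes.
# Usage in a workflow step, e.g.:
#   wait_for 30 curl -sf http://localhost:3000/healthz
wait_for() {
  local timeout="$1"; shift
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      echo "wait_for: timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
}
```

Because it returns a non-zero status on timeout, the step fails fast with a clear message instead of hanging until the job timeout fires.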
Cache Dependencies to Prevent Slow Installs
- uses: actions/cache@v4
with:
path: |
~/.npm
~/.cache/pip
key: ${{ runner.os }}-deps-${{ hashFiles('**/package-lock.json', '**/requirements.txt') }}
restore-keys: |
${{ runner.os }}-deps-
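The cache key above is content-addressed: it changes only when the hashed lock files change, so edits to the lock file invalidate the cache while unrelated commits reuse it. The behaviour can be mimicked locally with `sha256sum` (this mirrors the idea behind `hashFiles()`, not its exact algorithm):

```shell
# Build a cache key from a lock file's hash; editing the file changes the key.
tmp=$(mktemp -d)
echo '{"lockfileVersion": 3}' > "$tmp/package-lock.json"
key_before="deps-$(sha256sum "$tmp/package-lock.json" | cut -c1-16)"
key_same="deps-$(sha256sum "$tmp/package-lock.json" | cut -c1-16)"
echo '{"extra": true}' >> "$tmp/package-lock.json"
key_after="deps-$(sha256sum "$tmp/package-lock.json" | cut -c1-16)"
[ "$key_before" = "$key_same" ] && [ "$key_before" != "$key_after" ] \
  && echo "key is stable until the lock file changes"
```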
Step 3: Fix Out of Memory (OOM) Errors
Raise the Process Memory Limit
For Node.js builds that exhaust the default V8 old-space heap (the default limit varies by Node version and available memory, so set it explicitly):
- name: Build frontend
run: npm run build
env:
NODE_OPTIONS: "--max-old-space-size=4096"
For JVM-based builds (Maven, Gradle):
- name: Maven package
run: mvn -B package --no-transfer-progress
env:
MAVEN_OPTS: "-Xmx4g -Xms512m -XX:+UseG1GC"
JAVA_TOOL_OPTIONS: "-Xmx4g"
Split Tests Across Matrix Shards
A single large Jest or Pytest run can exhaust memory. Shard across parallel jobs:
jobs:
test:
strategy:
matrix:
shard: [1, 2, 3, 4]
fail-fast: false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: npx jest --shard=${{ matrix.shard }}/4 --forceExit
Use a Larger Runner
For GitHub Team or Enterprise, larger hosted runners are available:
jobs:
heavy-build:
runs-on: ubuntu-latest-8-cores # 32 GB RAM, 8 vCPU
For self-hosted runners, provision a machine with sufficient RAM and register it with an appropriate label.
Step 4: Fix Permission Denied Errors
Add a Permissions Block
The most common fix. Add it at the workflow level or per-job:
# Workflow-level — applies to all jobs
permissions:
contents: write
pull-requests: write
packages: write
id-token: write # required for OIDC/cloud auth
issues: write
statuses: write
jobs:
deploy:
# Job-level override — more restrictive is safer
permissions:
contents: read
id-token: write
Fix Repository-Level Default Permissions
Navigate to Settings → Actions → General → Workflow permissions and select Read and write permissions if your workflows need to push commits or create releases. Also enable Allow GitHub Actions to create and approve pull requests if needed.
Use a PAT for Cross-Repository or Elevated Operations
For operations that exceed GITHUB_TOKEN capabilities (pushing to another repo, triggering workflows in a different organisation):
jobs:
cross-repo-push:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
token: ${{ secrets.AUTOMATION_PAT }}
repository: org/other-repo
- name: Commit and push
run: |
git config user.email "ci-bot@example.com"
git config user.name "CI Bot"
git add .
git commit -m "chore: automated update"
git push
Store the PAT as a repository or organisation secret. Rotate it on a schedule using gh secret set.
Step 5: Fix Runner Offline Errors
Restart Self-Hosted Runner Service
# Navigate to the runner installation directory
cd /opt/actions-runner
# Check the managed service status (svc.sh must run as root on Linux)
sudo ./svc.sh status
# Restart
sudo ./svc.sh stop && sudo ./svc.sh start
# For systemd-managed installation
sudo systemctl status "actions.runner.*"
sudo systemctl restart "actions.runner.OWNER-REPO.RUNNER-NAME.service"
# Inspect runner diagnostic logs
ls -lt _diag/ | head -5
tail -100 _diag/Runner_$(date +%Y%m%d)*.log
Re-Register an Expired Runner
Runner registration tokens are valid for 1 hour, so an old token cannot be reused. If the runner's local credentials are corrupted or the runner entry was removed from GitHub, remove and re-register it:
# Remove the old registration (get removal token from GitHub Settings UI)
./config.sh remove --token <REMOVE_TOKEN>
# Re-register with a fresh token from Settings → Actions → Runners → New self-hosted runner
./config.sh \
--url https://github.com/OWNER/REPO \
--token <NEW_REGISTRATION_TOKEN> \
--name my-runner \
--labels linux,x64,production \
--unattended
sudo ./svc.sh install
sudo ./svc.sh start
Ensure Network Connectivity
Self-hosted runners must reach these endpoints (allow outbound HTTPS/443):
- github.com
- api.github.com
- *.actions.githubusercontent.com
- objects.githubusercontent.com
- *.blob.core.windows.net (artifact storage)
Test connectivity from the runner host:
curl -v https://api.github.com/zen
curl -v https://pipelines.actions.githubusercontent.com/_apis/health
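The per-endpoint checks can be wrapped in a small loop so the whole allowlist is verified in one pass. A sketch (`check_https` and `report` are hypothetical helper names; the endpoint list mirrors the one above):

```shell
# check_https: return 0 if the host answers an HTTPS request within 10 s.
check_https() {
  curl -sf --max-time 10 -o /dev/null "https://$1"
}

# report: run a checker against each host and print one OK/FAIL line per host.
report() {
  local checker="$1"; shift
  local host
  for host in "$@"; do
    if "$checker" "$host"; then
      echo "OK   $host"
    else
      echo "FAIL $host"
    fi
  done
}

# From the runner host:
# report check_https github.com api.github.com objects.githubusercontent.com
```

Separating the checker from the reporting loop also makes the logic easy to test with a stub in place of a live network call.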
Step 6: Enable Debug Logging
For failures that are not obvious from the logs, enable runner and step debug output by adding these as repository secrets (not variables):
| Secret Name | Value |
|---|---|
| ACTIONS_RUNNER_DEBUG | true |
| ACTIONS_STEP_DEBUG | true |
This adds verbose output including environment variables, runner internals, and full step traces. Disable after debugging to avoid log noise and potential secret exposure.
For interactive debugging, drop a tmate session into a failed job:
- name: Interactive debug session on failure
uses: mxschmitt/action-tmate@v3
if: ${{ failure() }}
timeout-minutes: 10
with:
limit-access-to-actor: true
This opens an SSH tunnel directly into the runner, letting you inspect the filesystem and environment interactively.
Bonus: Self-Hosted Runner Diagnostic Script
The script below bundles the runner checks from Steps 1 and 5 into a single pass you can run on the runner host:
#!/usr/bin/env bash
# GitHub Actions self-hosted runner diagnostic script
# Run this on the runner host to diagnose common issues
set -euo pipefail
RUNNER_DIR="${1:-/opt/actions-runner}"
echo "=== Runner host diagnostics ==="
echo "Date: $(date -u)"
echo "Hostname: $(hostname)"
echo ""
echo "=== System resources ==="
free -h
echo ""
df -h /
echo ""
nproc
echo ""
echo "=== Runner service status ==="
if command -v systemctl &>/dev/null; then
systemctl list-units 'actions.runner*' --no-pager 2>/dev/null || echo "No systemd runner units found"
fi
if [[ -f "${RUNNER_DIR}/svc.sh" ]]; then
"${RUNNER_DIR}/svc.sh" status 2>&1 || true
fi
echo ""
echo "=== Network connectivity to GitHub ==="
curl -s --max-time 10 https://api.github.com/zen && echo " [OK] api.github.com" || echo " [FAIL] api.github.com unreachable"
curl -s --max-time 10 -o /dev/null -w "%{http_code}" https://github.com | grep -q "200\|301\|302" && echo " [OK] github.com" || echo " [FAIL] github.com unreachable"
echo ""
echo "=== Recent runner diagnostic logs ==="
if [[ -d "${RUNNER_DIR}/_diag" ]]; then
LATEST_LOG=$(ls -t "${RUNNER_DIR}/_diag"/Runner_*.log 2>/dev/null | head -1 || true)
if [[ -n "${LATEST_LOG}" ]]; then
echo "Log: ${LATEST_LOG}"
tail -50 "${LATEST_LOG}"
else
echo "No runner logs found in ${RUNNER_DIR}/_diag/"
fi
else
echo "Runner directory ${RUNNER_DIR} not found"
fi
echo ""
echo "=== Runner registration check ==="
if [[ -f "${RUNNER_DIR}/.runner" ]]; then
echo "Runner is registered:"
cat "${RUNNER_DIR}/.runner"
else
echo "WARNING: .runner file not found — runner may not be registered"
fi
echo ""
echo "=== Trigger a workflow re-run via GitHub CLI ==="
echo "# Re-run all failed jobs in the most recent run of a workflow:"
echo "# gh run list --workflow=build.yml --limit=1 --json databaseId -q '.[0].databaseId' | xargs gh run rerun --failed"
echo ""
echo "# Enable debug logging for the next run (set as repository secrets):"
echo "# gh secret set ACTIONS_RUNNER_DEBUG --body true"
echo "# gh secret set ACTIONS_STEP_DEBUG --body true"
echo ""
echo "Diagnostic complete."
Error Medic Editorial
Error Medic Editorial is a team of senior DevOps and SRE engineers with combined experience across GitHub Actions, GitLab CI, Jenkins, and Kubernetes-based CI/CD pipelines. We write actionable troubleshooting guides grounded in production war stories, not just documentation summaries.
Sources
- https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/using-jobs-in-a-workflow#setting-a-timeout-for-a-job
- https://docs.github.com/en/actions/administering-github-actions/usage-limits-billing-and-administration
- https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/monitoring-and-troubleshooting-self-hosted-runners
- https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication
- https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/about-self-hosted-runners#communication-between-self-hosted-runners-and-github