Error Medic

Fixing GitHub Actions Timeout and Runner Offline Errors

Comprehensive guide to troubleshooting GitHub Actions timeouts, out of memory errors, permission denied issues, and offline runners with practical fixes.

Key Takeaways
  • Timeouts are often caused by infinite loops in scripts, hung network requests, or waiting for interactive prompts.
  • Out of memory (OOM) errors frequently occur during heavy build processes (like Webpack or Maven) on standard GitHub-hosted runners.
  • Permission denied errors usually stem from incorrect GITHUB_TOKEN scopes or missing secrets.
  • Offline runners are typically self-hosted runner issues related to network connectivity, service crashes, or token expiration.
  • Use the 'timeout-minutes' keyword at the job or step level to prevent runaway workflows from consuming all your billing minutes.
Common Fixes for GitHub Actions Failures
| Issue | Primary Fix Approach | Time to Fix | Complexity |
|---|---|---|---|
| Job Timeout | Add timeout-minutes, fix hung scripts | 10-30 mins | Low |
| Out of Memory (137) | Optimize build, use larger runners, increase swap | 30-60 mins | Medium |
| Permission Denied (403) | Update permissions block in workflow YAML | 5-15 mins | Low |
| Runner Offline | Restart self-hosted runner service, check network | 15-45 mins | Medium |

Understanding GitHub Actions Failures

When a GitHub Actions workflow fails, it can halt your CI/CD pipeline, preventing deployments and blocking pull requests. The error messages can range from explicit exit codes to frustratingly vague timeouts. Understanding the root cause of these failures is critical for maintaining a reliable development velocity.

Diagnosing 'The job running on runner ... has exceeded the maximum execution time of 360 minutes.'

This is the classic GitHub Actions timeout error. By default, a job on a GitHub-hosted runner can run for up to 6 hours (360 minutes). If your job hits this limit, it's almost certainly stuck. Common culprits include:

  1. Hanging Network Requests: A script is waiting for a response from an external API or database that is down or unreachable, and there is no timeout configured on the request itself.
  2. Interactive Prompts: A CLI tool is asking for user input (e.g., 'Are you sure you want to proceed? [y/N]') and waiting indefinitely because there is no TTY.
  3. Infinite Loops: A bug in your test suite or build script has caused an infinite loop.

The Fix: First, always set a reasonable timeout-minutes on your jobs and even individual steps. If a typical build takes 10 minutes, set the timeout to 20. This fails fast and saves money. Second, review the logs right before the timeout. Look for commands that might prompt for input and add flags like --non-interactive, -y, or --quiet.
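As a sketch, both fixes can be combined in one workflow (the 20-minute budget, the URL, and the tool names are illustrative, not prescriptive):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 20      # job-level cap: fail fast instead of hanging for 6 hours
    steps:
      - name: Fetch dependencies
        timeout-minutes: 5   # step-level cap on the piece most likely to hang
        # --max-time bounds the request itself so curl cannot wait forever
        run: curl --fail --max-time 60 -O https://example.com/deps.tar.gz
      - name: Install
        # -y / --non-interactive style flags stop a TTY-less prompt from blocking
        run: npm ci --no-audit
```

Step-level timeouts are worth the extra lines: when a step fails by timeout, the log points directly at the culprit instead of at the whole job.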

Handling 'Process completed with exit code 137' (Out of Memory)

Exit code 137 typically means the process was killed by the OOM (Out Of Memory) killer. Standard GitHub-hosted Linux runners provide 7 GB of RAM (16 GB on the 4-vCPU runners used for public repositories). Memory-intensive tasks such as compiling large C++ projects, running heavy Java builds, or complex Node.js bundling can easily exceed this.
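The number 137 itself is informative: it is 128 plus signal 9 (SIGKILL), the signal the kernel's OOM killer delivers. A quick local demonstration:

```shell
#!/bin/sh
# Exit code 137 = 128 + 9: the process was killed with SIGKILL,
# which is exactly what the kernel OOM killer sends.
sh -c 'kill -9 $$'            # simulate an OOM kill of a child process
echo "child exit code: $?"    # prints "child exit code: 137"
```

Any 137 in your logs therefore means "killed from outside", and on a memory-constrained runner the OOM killer is the usual suspect.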

The Fix:

  1. Optimize the Build: Can you run tests in parallel on different jobs rather than sequentially in one? Can you reduce the number of workers your test runner uses?
  2. Increase Swap Space: You can artificially increase available memory by creating a swap file before your build step.
  3. Larger Runners: If optimization isn't enough, consider upgrading to GitHub's Larger Runners (which offer up to 64-core and 256GB RAM options) or using a beefy self-hosted runner.
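A low-effort sketch of options 1 and 3, assuming a Node.js build (the heap size and worker count are illustrative; larger-runner labels are defined per organization):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest   # or an org-defined larger-runner label
    env:
      # Cap the Node heap below the runner's RAM so failures are explicit
      NODE_OPTIONS: --max-old-space-size=4096
    steps:
      - uses: actions/checkout@v4
      - name: Test with fewer parallel workers
        run: npx jest --maxWorkers=2   # lower peak memory than the default
```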

Resolving 'Permission denied' or '403 Forbidden'

These errors usually occur when your workflow tries to perform an action it isn't authorized for, such as pushing a tag, creating a release, or authenticating with a cloud provider.

The Fix: Ensure the GITHUB_TOKEN has the correct scopes. By default, the token permissions might be restricted to read-only depending on your repository or organization settings. Explicitly define the required permissions in your workflow YAML:

```yaml
permissions:
  contents: write
  pull-requests: read
```

If you are accessing external services (like AWS, GCP, or Azure), ensure your secrets are correctly populated and that you are using OIDC (OpenID Connect) for authentication where possible, rather than long-lived credentials.
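For AWS, for example, OIDC looks roughly like this (the role ARN is a placeholder for a role in your account that trusts GitHub's OIDC provider; `aws-actions/configure-aws-credentials` is AWS's official action):

```yaml
permissions:
  id-token: write   # required for the runner to request an OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder ARN: replace with your own deploy role
          role-to-assume: arn:aws:iam::123456789012:role/gha-deploy
          aws-region: us-east-1
```

With this in place, no long-lived AWS access keys need to live in repository secrets at all.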

Troubleshooting 'Offline' Self-Hosted Runners

If your workflow is queued indefinitely and your self-hosted runner shows as 'Offline' in the GitHub UI, the runner process on your host machine has likely stopped communicating with GitHub.

The Fix:

  1. Check the Service: Ensure the runner service is running on the host machine (systemctl status actions.runner.*).
  2. Review Runner Logs: Inspect the diagnostic logs located in the _diag folder within the runner application directory. Look for network timeouts or SSL errors.
  3. Network Configuration: Verify that the host machine can reach github.com and api.github.com on HTTPS (port 443).
  4. Re-configuration: In rare cases, the runner token may have expired or become corrupted. You may need to remove the runner from the GitHub UI and reconfigure it on the host.
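The first and third checks above can be scripted; a minimal triage sketch, assuming a systemd-managed runner and `curl` on the host:

```shell
#!/bin/sh
# Quick triage for an offline self-hosted runner host.

# 1. Is the runner service alive? (the name pattern is what ./svc.sh installs)
systemctl status 'actions.runner.*' --no-pager 2>/dev/null \
  || echo "runner service not found or stopped"

# 2. Can we reach GitHub over HTTPS? Hard timeouts so the check itself can't hang.
check_endpoint() {
  curl -sSf --connect-timeout 5 --max-time 10 -o /dev/null "https://$1" \
    && echo "$1 reachable" || echo "$1 UNREACHABLE"
}
check_endpoint github.com
check_endpoint api.github.com
```

If both endpoints are reachable but the runner is still offline, move on to the `_diag` logs and, as a last resort, re-registration.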

Putting It All Together: Example Workflow

```yaml
# Example of fixing timeout and permission issues in workflow.yml

name: CI Build

on: [push]

# Fix 1: Explicitly grant needed permissions to avoid 403s
permissions:
  contents: read
  packages: write

jobs:
  build:
    runs-on: ubuntu-latest
    # Fix 2: Prevent 6-hour hangs by setting a realistic timeout
    timeout-minutes: 15

    steps:
      - uses: actions/checkout@v4

      # Fix 3: Handle OOM issues by adding swap space (if needed)
      - name: Create Swap Space
        run: |
          sudo fallocate -l 4G /swapfile
          sudo chmod 600 /swapfile
          sudo mkswap /swapfile
          sudo swapon /swapfile
          free -h

      - name: Run Build
        # Fix 4: Ensure non-interactive mode for tools that might prompt
        run: npm ci --prefer-offline --no-audit && npm run build
```

Error Medic Editorial

Error Medic Editorial is a team of veteran Site Reliability Engineers and DevOps practitioners dedicated to demystifying complex CI/CD failures and providing actionable solutions.
