Error Medic

Docker Permission Denied: Complete Fix Guide for Crashes, OOM, Disk Full, 502/504 & More

Fix docker permission denied, OOM kills, no space left on device, 502/504 errors, and high CPU with step-by-step Linux commands and a diagnostic script.

Key Takeaways
  • Permission denied on /var/run/docker.sock is caused by your Linux user not belonging to the docker group — fix with: sudo usermod -aG docker $USER && newgrp docker
  • Exit code 137 means the container was killed with SIGKILL, most often by the Linux kernel's OOM killer; confirm with docker inspect, then set a memory limit that fits the workload, for example docker run -m 2g or mem_limit: 2g in docker-compose.yml
  • 'no space left on device' errors require pruning stopped containers, dangling images, and unused volumes with docker system prune -a --volumes
  • 502 Bad Gateway and connection refused errors almost always mean the Docker daemon is down or the container crashed before binding its port — check systemctl status docker and docker logs
  • Quick fix summary: (1) verify daemon is running, (2) fix user group membership, (3) prune disk space, (4) set memory/CPU resource limits, (5) read docker logs for root cause
Fix Approaches Compared
| Method | When to Use | Time | Risk |
| --- | --- | --- | --- |
| sudo usermod -aG docker $USER | Permission denied on docker.sock | < 1 min + re-login | Low |
| docker system prune -a --volumes | No space left on device / disk full | 1–10 min | Medium (deletes unused data) |
| docker run -m 2g | Container OOM killed (exit code 137) | < 1 min | Low |
| sudo systemctl restart docker | Daemon unresponsive / 502 Bad Gateway | < 1 min | Medium (stops all containers) |
| Edit daemon.json data-root | Docker partition permanently full | 5–15 min + migration | High (requires data copy) |
| docker update --cpus='1.5' | Container consuming excessive CPU | < 1 min | Low |
| DOCKER_BUILDKIT=1 docker build | Slow Docker image builds | Varies | None |
| sudo journalctl -xeu docker.service | Daemon fails to start / core dumps | Diagnostic only | None |

Understanding Docker Errors on Linux

Docker errors range from a simple Unix socket permission mismatch to kernel-level OOM kills and filesystem exhaustion. Every problem has a clear diagnostic path. This guide covers each failure class with the exact error strings you will see and the commands to resolve them.


Exact Error Messages You Will Encounter

Permission denied connecting to the daemon:

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: dial unix /var/run/docker.sock: connect: permission denied

Container OOM killed:

Error response from daemon: Cannot start container <id>: [8] System error: cannot allocate memory
Killed

The exit code will be 137: 128 plus 9, the signal number of SIGKILL.
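
The same arithmetic decodes any exit status above 128. The sketch below is a minimal illustration; explain_exit_code is a hypothetical helper name, not a Docker command, and it relies only on the shell's built-in kill -l to look up signal names.

```shell
#!/usr/bin/env bash
# explain_exit_code is a hypothetical helper, not part of Docker.
# Exit statuses above 128 conventionally mean "terminated by signal (status - 128)".
explain_exit_code() {
  local code=$1
  if [ "$code" -gt 128 ]; then
    local sig=$((code - 128))
    # kill -l N is a shell builtin that prints the name of signal N
    echo "exit $code = 128 + signal $sig ($(kill -l "$sig"))"
  else
    echo "exit $code = returned by the application itself"
  fi
}

explain_exit_code 137
explain_exit_code 143
explain_exit_code 1
```

Status 137 decodes to signal 9 (SIGKILL) and 143 to signal 15 (SIGTERM), while a status such as 1 comes from the application rather than from a signal.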

Disk / filesystem full:

Error response from daemon: no space left on device
Write /var/lib/docker/tmp/GetImageBlob: no space left on device

Daemon not running:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

DNS or registry connection refused:

dial tcp: lookup registry-1.docker.io: connection refused

502 / 504 from reverse proxy (nginx, Traefik, Caddy):

502 Bad Gateway
504 Gateway Timeout
upstream connect error or disconnect/reset before headers. reset reason: connection failure

Step 1: Verify the Docker Daemon Is Running

Start every investigation with this check: if the daemon is down, every docker command fails.

sudo systemctl status docker

If output shows Active: failed or Active: inactive:

sudo systemctl start docker

If it refuses to start, inspect the systemd journal:

sudo journalctl -xeu docker.service --no-pager | tail -60

Common daemon startup failures include corrupted /var/lib/docker, invalid JSON in /etc/docker/daemon.json, or a port/socket conflict. Validate the config file before restarting:

sudo dockerd --validate --config-file /etc/docker/daemon.json

If the config is malformed, back it up and reset it to a safe default:

sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
echo '{}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
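
On hosts where dockerd does not accept the --validate flag, a syntax-only check can still catch the most common mistake (a stray comma or missing brace). This is a rough sketch that assumes python3 is installed; json_syntax_ok is a hypothetical helper name, and it does not verify that the keys are options dockerd understands.

```shell
#!/usr/bin/env bash
# json_syntax_ok is a hypothetical helper; it assumes python3 is on PATH.
# It checks JSON syntax only, not whether the keys are valid dockerd options.
json_syntax_ok() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Example with a throwaway file instead of the real /etc/docker/daemon.json
tmp=$(mktemp)
printf '{"log-driver": "json-file",}\n' > "$tmp"   # trailing comma: invalid JSON
if json_syntax_ok "$tmp"; then
  echo "syntax OK"
else
  echo "syntax error"
fi
rm -f "$tmp"
```

To check the real file, call json_syntax_ok /etc/docker/daemon.json before restarting the daemon.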

Step 2: Fix Docker Permission Denied Errors

The permission denied error on /var/run/docker.sock is one of the most common Docker issues on Linux. Docker's Unix socket is owned by root and the docker group, so only root or members of that group can access it.

Check your current group membership:

groups $USER

Add your user to the docker group:

sudo usermod -aG docker $USER

Apply in the current terminal without logging out (newgrp starts a new shell with the docker group active):

newgrp docker

Or fully log out and back in, then verify:

groups | grep docker
docker ps
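
A frequent source of confusion is that groups $USER reads the account database, so it lists docker immediately after usermod, while the shell you are typing in still runs with its old group list. The sketch below checks the groups of the current process instead; in_group is a hypothetical helper name, and the exact-match grep avoids false positives from similarly named groups.

```shell
#!/usr/bin/env bash
# in_group is a hypothetical helper: succeeds if the current process
# already has the named group in its effective group list.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx -- "$1"
}

if in_group docker; then
  echo "docker group active in this shell"
else
  echo "docker group not active here; log in again or run: newgrp docker"
fi
```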

Security note: Members of the docker group have effective root access to the host system. For production environments, consider rootless Docker instead:

dockerd-rootless-setuptool.sh install
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock

CI/CD environments where the runner user is not in the docker group:

docker run -v /var/run/docker.sock:/var/run/docker.sock \
  --group-add $(stat -c '%g' /var/run/docker.sock) \
  your-image

Step 3: Diagnose 502, 504, and Connection Refused Errors

These errors appear when a reverse proxy (nginx, Traefik, Caddy) cannot reach the upstream container. The proxy returns 502 Bad Gateway when the container is down and 504 Gateway Timeout when the container is alive but responding too slowly.

Check all container states:

docker ps -a

Containers with status Exited or Restarting are your culprits. Read their logs:

docker logs --tail=200 <container_name>
docker logs --since=30m <container_name>
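
On a host with many containers, the Exited or Restarting check can be scripted. The following is a small sketch, not a complete health check: failing_containers is a hypothetical filter name, and it assumes input lines of the form name|status, which docker ps -a --format '{{.Names}}|{{.Status}}' produces.

```shell
#!/usr/bin/env bash
# failing_containers is a hypothetical filter. It expects lines of the form
# "name|status", e.g. from: docker ps -a --format '{{.Names}}|{{.Status}}'
failing_containers() {
  awk -F'|' '$2 ~ /^(Exited|Restarting|Dead)/ { print $1 ": " $2 }'
}

# Sample input shaped like real docker ps status strings
printf '%s\n' \
  'web|Up 3 hours' \
  'api|Exited (137) 2 minutes ago' \
  'worker|Restarting (1) 5 seconds ago' | failing_containers
```

Against a live daemon, pipe the docker ps command from the comment into failing_containers and read the logs of each container it prints.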

Inspect the exit code and OOM status:

docker inspect <container_id> \
  --format='ExitCode: {{.State.ExitCode}} | OOMKilled: {{.State.OOMKilled}} | Error: {{.State.Error}}'

Verify port bindings are correct:

docker port <container_id>
ss -tlnp | grep <expected_port>

Test the application from inside the container:

docker exec -it <container_id> curl -v http://localhost:<internal_port>/health

For 504 timeouts, check whether the app is deadlocked or CPU-starved:

docker exec <container_id> ps aux
docker exec <container_id> top -b -n1 | head -20

Step 4: Fix OOM and Out of Memory Errors

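Exit code 137 means the process received SIGKILL. The usual cause is an OOM kill, but docker kill and a docker stop that exceeds its grace period produce the same code. Confirm definitively:

docker inspect <container_id> --format='{{.State.OOMKilled}}'
# Prints true if the container was OOM-killed
dmesg | grep -iE 'out of memory|oom|killed process'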

Set a container memory limit at run time:

docker run -m 2g --memory-swap 2g your-image

Setting --memory-swap equal to -m disables swap for that container. Set it larger to permit swap.

In docker-compose.yml:

services:
  app:
    image: your-image
    mem_limit: 2g
    memswap_limit: 2g

Update a running container without restarting:

docker update --memory 2g --memory-swap 2g <container_name>

Check host memory availability:

free -h
grep -E 'MemAvailable|SwapFree' /proc/meminfo
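
Before choosing a limit, it helps to compare it with what the host can actually spare. This is a minimal sketch under the assumption of a Linux host with /proc/meminfo; mem_available_mib and fits_on_host are hypothetical helper names, and the 2048 MiB figure simply mirrors the 2g examples in this guide.

```shell
#!/usr/bin/env bash
# mem_available_mib is a hypothetical helper. It reads a meminfo-style file
# (default /proc/meminfo) and prints MemAvailable converted from kB to MiB.
mem_available_mib() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# fits_on_host LIMIT_MIB [FILE]: succeed if the limit is below MemAvailable
fits_on_host() {
  [ "$1" -lt "$(mem_available_mib "${2:-/proc/meminfo}")" ]
}

if fits_on_host 2048; then
  echo "a 2 GiB limit fits in currently available host memory"
else
  echo "host has less than 2 GiB available; a 2g limit will not prevent host-level OOM"
fi
```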

If the host itself is under memory pressure, add a swap file:

sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

Step 5: Fix Docker Disk Full and No Space Left on Device

Docker accumulates data aggressively: image layers, stopped container filesystems, anonymous volumes, and build cache. First, understand the breakdown:

docker system df
docker system df -v
df -h /var/lib/docker
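
The df check can be turned into a threshold test for scripts or monitoring. The sketch below is illustrative only: disk_use_pct is a hypothetical helper name, and 85 is an arbitrary example threshold rather than a Docker recommendation.

```shell
#!/usr/bin/env bash
# disk_use_pct is a hypothetical helper: prints the use percentage (no % sign)
# of the filesystem that holds the given path, using POSIX df output.
disk_use_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Fall back to / when /var/lib/docker does not exist on this machine
target=/var/lib/docker
[ -d "$target" ] || target=/
pct=$(disk_use_pct "$target")
if [ "$pct" -ge 85 ]; then   # 85 is an arbitrary example threshold
  echo "$target filesystem is ${pct}% full: time to prune"
else
  echo "$target filesystem is ${pct}% full"
fi
```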

Incremental cleanup (each command asks for confirmation and touches only unused resources):

docker container prune    # remove stopped containers
docker image prune        # remove dangling images only
docker image prune -a     # remove ALL images not used by any container
docker volume prune       # remove unused volumes (anonymous only on Docker 23.0+; add -a for named)
docker builder prune      # remove build cache

Full cleanup (removes all unused resources):

docker system prune -a --volumes

Move Docker data to a larger disk (permanent fix):

  1. Stop Docker: sudo systemctl stop docker
  2. Copy data to new location: sudo rsync -aP /var/lib/docker/ /mnt/large-disk/docker/
  3. Update /etc/docker/daemon.json:
{
  "data-root": "/mnt/large-disk/docker"
}
  4. Start Docker: sudo systemctl start docker
  5. Verify: docker info | grep 'Docker Root Dir'

Automate cleanup with cron to prevent recurrence:

# Add to root's crontab (sudo crontab -e):
0 3 * * 0 docker system prune -f --filter 'until=168h' >> /var/log/docker-cleanup.log 2>&1

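Enable log rotation in /etc/docker/daemon.json to prevent logs from filling the disk. The setting takes effect after a daemon restart and applies only to containers created afterwards: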

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
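
With these values, each container keeps at most three log files of roughly 10 MB each, so the worst case scales with the number of containers. A quick back-of-the-envelope sketch follows; max_log_mb is a hypothetical helper name, and the result is an approximate upper bound because files are rotated only after they cross the size limit.

```shell
#!/usr/bin/env bash
# max_log_mb is a hypothetical helper for capacity planning.
# Arguments: max-size in MB, max-file count, number of containers.
max_log_mb() {
  echo $(( $1 * $2 * $3 ))
}

# With max-size=10m and max-file=3, each container keeps about 30 MB of logs
echo "1 container:   about $(max_log_mb 10 3 1) MB"
echo "20 containers: about $(max_log_mb 10 3 20) MB"
```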

Step 6: Fix Docker High CPU Usage

Identify the offending container:

docker stats --no-stream
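
When the stats table is long, a filter makes the outliers obvious. This is a small sketch: cpu_hogs is a hypothetical filter name, the default threshold of 80 is arbitrary, and it assumes name|cpu% input as produced by docker stats --no-stream --format '{{.Name}}|{{.CPUPerc}}'. Values above 100% are normal for containers using more than one core.

```shell
#!/usr/bin/env bash
# cpu_hogs is a hypothetical filter. It expects "name|cpu%" lines, e.g. from:
#   docker stats --no-stream --format '{{.Name}}|{{.CPUPerc}}'
# and prints containers at or above the threshold (default 80).
cpu_hogs() {
  awk -F'|' -v limit="${1:-80}" '{
    pct = $2
    sub(/%/, "", pct)
    if (pct + 0 >= limit + 0) print $1, $2
  }'
}

# Sample input shaped like docker stats output
printf '%s\n' 'web|3.12%' 'worker|187.40%' 'db|79.99%' | cpu_hogs 80
```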

Apply CPU limits:

# Limit to 1.5 cores at run time
docker run --cpus='1.5' your-image

# Update a running container without restart
docker update --cpus='1.5' <container_name>
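
Under the hood, --cpus is shorthand for a CFS quota over the default 100000 microsecond period, which is useful to know when reading cgroup settings or older configs that use --cpu-period and --cpu-quota. A minimal sketch of the conversion, with cpus_to_quota as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# cpus_to_quota is a hypothetical helper: converts a --cpus value into the
# CFS quota (microseconds) for the default 100000 microsecond period.
cpus_to_quota() {
  awk -v cpus="$1" 'BEGIN { printf "%.0f\n", cpus * 100000 }'
}

echo "--cpus=1.5 is equivalent to --cpu-period=100000 --cpu-quota=$(cpus_to_quota 1.5)"
```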

In docker-compose.yml (deploy syntax, originally for Swarm; Docker Compose v2 also applies these limits with plain docker compose up):

services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.50'

Profile the process inside the container:

docker exec -it <container_id> top -b -n3
docker exec -it <container_id> sh -c 'ps aux --sort=-%cpu | head -20'

Step 7: Analyze Container Crashes and Core Dumps

When a container crashes with a segmentation fault or generates a core dump:

# Check dmesg for segfaults or signal 11
dmesg | grep -E 'segfault|core dumped|signal 11'

# Get crash context from journald
sudo journalctl -u docker --since '1 hour ago' | grep -iE 'fatal|panic|segfault|core'

# Read the crash log from the container
docker logs --tail=200 <container_id>

# Enable core dumps and ptrace for deep debugging
docker run --ulimit core=-1 \
  --cap-add SYS_PTRACE \
  -v /tmp/cores:/cores \
  your-image

Identify the crash log location inside the container:

docker exec <container_id> ls /var/crash/ 2>/dev/null || echo 'No /var/crash directory'
docker exec <container_id> find /var/log -name '*.log' -newer /proc/1 2>/dev/null | head -10

Step 8: Fix Slow Docker Performance

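BuildKit speeds up image builds through parallel stage execution and better caching. It is the default builder since Docker Engine 23.0; on older versions, enable it per build: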

DOCKER_BUILDKIT=1 docker build -t myapp .

Or enable it permanently in /etc/docker/daemon.json:

{"features": {"buildkit": true}}

Reduce build context with .dockerignore:

node_modules
.git
*.log
dist
__pycache__
.pytest_cache

Fix DNS resolution slowness (often causes 2+ second delays on every network call):

# Test DNS inside a container
docker run --rm busybox nslookup google.com

# If slow, override DNS in /etc/docker/daemon.json:
# {"dns": ["8.8.8.8", "8.8.4.4"]}
# Then: sudo systemctl restart docker

Verify the storage driver is overlay2 (not the slow devicemapper loop mode):

docker info | grep 'Storage Driver'
# Should show: Storage Driver: overlay2

To switch to overlay2, add to /etc/docker/daemon.json:

{"storage-driver": "overlay2"}

Then restart Docker. Note: this does NOT migrate existing images.


Step 9: Emergency Docker Recovery

If Docker is completely non-functional:

# 1. Gracefully stop all containers
docker stop $(docker ps -q) 2>/dev/null || true

# 2. Stop the daemon
sudo systemctl stop docker

# 3. Validate config
# 4. If validation fails, back up the config and reset it
if ! sudo dockerd --validate --config-file /etc/docker/daemon.json; then
  sudo cp /etc/docker/daemon.json /etc/docker/daemon.json.bak
  echo '{}' | sudo tee /etc/docker/daemon.json
fi

# 5. Restart
sudo systemctl start docker

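# LAST RESORT: Full reset. Deletes all containers, images, and volumes.
sudo systemctl stop docker
# Run the glob as root: /var/lib/docker is not readable by normal users,
# so /var/lib/docker/* would not expand in your own shell.
sudo sh -c 'rm -rf /var/lib/docker/*'
sudo systemctl start docker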

Diagnostic Script

#!/usr/bin/env bash
# Docker Comprehensive Diagnostics Script
# Usage: bash docker-diag.sh 2>&1 | tee /tmp/docker-diag.log

set -uo pipefail

HR='================================================================='

echo "$HR"
echo "DOCKER DIAGNOSTICS — $(date)"
echo "$HR"

echo ""
echo "--- 1. Daemon Status ---"
systemctl is-active docker 2>/dev/null && echo "Daemon: RUNNING" || echo "Daemon: STOPPED"
systemctl is-enabled docker 2>/dev/null | xargs -I{} echo "Enabled: {}"

echo ""
echo "--- 2. Docker Version ---"
docker version 2>/dev/null | head -8 || echo "ERROR: Cannot connect to daemon"

echo ""
echo "--- 3. User Group Check ---"
id -nG | tr ' ' '\n' | grep -qx docker \
  && echo "OK: Current user is in the docker group" \
  || echo "WARNING: Not in docker group. Fix: sudo usermod -aG docker ${USER:-$(id -un)} && newgrp docker"

echo ""
echo "--- 4. Docker Socket Permissions ---"
ls -la /var/run/docker.sock 2>/dev/null || echo "WARNING: docker.sock not found"

echo ""
echo "--- 5. Disk Usage ---"
df -h /var/lib/docker 2>/dev/null || df -h / 2>/dev/null
echo ""
docker system df 2>/dev/null || echo "(cannot reach daemon)"

echo ""
echo "--- 6. All Containers (running + stopped) ---"
docker ps -a 2>/dev/null || echo "(cannot reach daemon)"

echo ""
echo "--- 7. Container Resource Usage ---"
docker stats --no-stream 2>/dev/null || echo "(cannot reach daemon)"

echo ""
echo "--- 8. OOM Events (dmesg) ---"
dmesg 2>/dev/null | grep -iE 'out of memory|oom_kill|killed process' | tail -20 \
  || echo "No OOM events found (or dmesg requires root)"

echo ""
echo "--- 9. Daemon Logs (last hour) ---"
sudo journalctl -u docker --no-pager --since '1 hour ago' 2>/dev/null | tail -40 \
  || echo "Cannot read journald (try: sudo journalctl -u docker)"

echo ""
echo "--- 10. Daemon Config ---"
if [ -f /etc/docker/daemon.json ]; then
  echo "Contents of /etc/docker/daemon.json:"
  cat /etc/docker/daemon.json
else
  echo "No /etc/docker/daemon.json found (using all defaults)"
fi

echo ""
echo "--- 11. Storage Driver and Root Dir ---"
docker info 2>/dev/null | grep -E 'Storage Driver|Docker Root Dir|Logging Driver|Cgroup Driver' \
  || echo "(cannot reach daemon)"

echo ""
echo "--- 12. Crash Indicators (segfault / core dump) ---"
dmesg 2>/dev/null | grep -iE 'segfault|signal 11|core dumped' | tail -10 \
  || echo "No segfaults found in dmesg"

echo ""
echo "$HR"
echo "Diagnostics complete. Review warnings above."
echo "Full output saved to: /tmp/docker-diag.log (if redirected)"
echo "$HR"

Error Medic Editorial

Error Medic Editorial is a team of senior DevOps engineers and SREs with combined decades of experience managing containerized workloads on Linux in production. We specialize in Docker, Kubernetes, and cloud-native infrastructure troubleshooting — translating real incident postmortems into actionable, command-first guides.
