How to Fix Traefik 502 Bad Gateway and 504 Gateway Timeout Errors
Comprehensive guide to diagnosing and fixing Traefik 502 Bad Gateway and 504 Gateway Timeout errors. Learn how to resolve backend connection issues, port mismat
- Docker Network Mismatch: The most common cause of a 502 Bad Gateway is Traefik and the target container operating on different, unlinked Docker networks.
- Incorrect Port Binding: Traefik might be forwarding traffic to the wrong internal port. Use `loadbalancer.server.port` labels to explicitly define the backend port.
- Container Health and Startup: A 504 Gateway Timeout often occurs if the backend service is overwhelmed, crashing, or takes too long to start before Traefik attempts to route traffic.
- Quick Fix: Verify both containers share a network, check `docker logs <container>` for application crashes, and explicitly define the routing port in your docker-compose labels.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Explicit Port Labeling | When the backend exposes multiple ports or the wrong default port is selected | 2 mins | Low |
| Network Reconfiguration | When containers cannot ping each other or are isolated | 5 mins | Low |
| Increasing Timeouts | When facing 504 errors on heavy, slow-to-process requests | 5 mins | Medium |
| Fixing App Crashes | When the backend container is continuously restarting or failing healthchecks | Variable | High |
Understanding the Traefik 502 Bad Gateway Error
When you deploy a service behind Traefik and are greeted with a plain text Bad Gateway or 502 Bad Gateway HTTP status code, it signifies a breakdown in communication between the reverse proxy (Traefik) and your backend service (the application container). Traefik successfully received the client's request but was unable to forward it to the intended destination, or it received an invalid response from the upstream server.
Closely related is the 504 Gateway Timeout error. While a 502 indicates a connection failure or immediate refusal (often presenting alongside connection refused logs), a 504 means the connection was established, but the backend service failed to respond within Traefik's configured timeout window.
Common Symptoms and Log Outputs
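If you inspect the Traefik logs when a 502 error occurs, you will typically find entries resembling the following. Note that these entries are emitted at debug level, so Traefik's log level must be set to DEBUG for them to appear: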
level=debug msg="'502 Bad Gateway' caused by: dial tcp 172.18.0.4:80: connect: connection refused"
Or:
level=debug msg="'502 Bad Gateway' caused by: dial tcp 172.18.0.4:8080: i/o timeout"
These log lines are your primary diagnostic tool. They tell you exactly which IP address and port Traefik is trying to reach, and why it's failing.
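When many such lines scroll past, it can help to reduce each one to its target and reason. The following is a minimal sketch, assuming the log format shown above and IPv4 targets; `parse_dial_error` is a hypothetical helper name, not a Traefik feature.

```shell
# parse_dial_error is a hypothetical helper (not part of Traefik).
# Given one Traefik log line, print "<ip>:<port> <reason>".
# Prints nothing if the line does not contain a "dial tcp" error.
# Assumes an IPv4 target, as in the sample log lines.
parse_dial_error() {
    printf '%s\n' "$1" |
        sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\): \([^"]*\).*/\1 \2/p'
}
```

One possible usage is to save the Traefik container's log output to a file and pass each matching line to the function. The first field is then the address to compare against the container's real IP and listening port.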
Root Causes and Detailed Diagnostics
Let's break down the most frequent culprits behind these gateway errors in a Traefik environment, specifically focusing on Docker and containerized deployments.
1. Docker Network Isolation
By default, Docker Compose creates a separate default network for each project, which in practice usually means one per docker-compose.yml file. If your Traefik instance is defined in traefik-compose.yml and your application is in app-compose.yml, they exist on isolated networks. Traefik will attempt to route traffic to the container's IP, but Docker does not forward packets between the two isolated networks, resulting in a connection timeout or refusal.
Diagnosis: Inspect the networks attached to both containers.
docker inspect traefik_container_name -f '{{json .NetworkSettings.Networks}}'
docker inspect app_container_name -f '{{json .NetworkSettings.Networks}}'
If the JSON output doesn't show at least one shared network name, they cannot communicate.
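Comparing two JSON blobs by eye is error-prone, so the check can be scripted. This is a rough sketch under the assumption that network names contain no whitespace; the function names are hypothetical, and only `docker inspect` with a Go template is a real CLI feature.

```shell
# Hypothetical helper names; only the docker inspect call is a real command.

# Print the networks a container is attached to, one name per line.
container_networks() {
    docker inspect "$1" -f '{{range $name, $net := .NetworkSettings.Networks}}{{println $name}}{{end}}'
}

# Print every name that appears in both whitespace-separated lists.
common_names() {
    for a in $1; do
        for b in $2; do
            if [ "$a" = "$b" ]; then
                printf '%s\n' "$a"
            fi
        done
    done
    return 0
}

# Print the networks two containers share; return 1 if there are none.
shared_networks() {
    shared=$(common_names "$(container_networks "$1")" "$(container_networks "$2")")
    if [ -z "$shared" ]; then
        echo "no shared network between $1 and $2" >&2
        return 1
    fi
    printf '%s\n' "$shared"
}
```

Calling `shared_networks traefik_container_name app_container_name` should print the common network names, or an error message if the containers are isolated from each other.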
2. Incorrect Port Forwarding
Traefik attempts to auto-detect the port your application is listening on. If the container exposes exactly one port (for example, through a single EXPOSE directive in your Dockerfile), Traefik uses that. However, if multiple ports are exposed, or if the application listens on a port that isn't explicitly exposed, Traefik might pick the wrong one.
For example, if your Node.js app listens on port 3000, but Traefik tries to forward traffic to port 80, you will receive a 502 Bad Gateway.
Diagnosis:
Check what port Traefik thinks it should use by looking at the Traefik Dashboard or the logs. Then, verify what port your application is actually bound to internally. You can execute a shell in the application container and run netstat -tulpn or ss -lntu to see listening ports.
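The listening-port check can be narrowed to a single port. The sketch below assumes the column layout of `ss -lnt` (TCP only, so the local address is the fourth column); `listeners_on_port` is a hypothetical helper name.

```shell
# listeners_on_port is a hypothetical helper (not part of ss or Docker).
# Reads `ss -lnt` output on stdin and prints the local address of every
# listening socket bound to the given port ($1).
# Assumes the TCP-only layout, where column 4 is "Local Address:Port".
listeners_on_port() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")   # the port is the last ":"-separated field
            if (parts[n] == port) print $4
        }'
}
```

For example, piping the output of `docker exec app_container_name ss -lnt` into `listeners_on_port 3000` prints nothing if the application is not listening on port 3000 at all, which would explain a 502 when Traefik forwards to that port.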
3. Application Crashes and Boot Loops
A 502 Bad Gateway is the expected result when the backend application is not accepting connections. If the application crashes immediately upon startup due to missing environment variables, database connection failures, or syntax errors, Traefik will have nowhere to send the traffic.
Diagnosis: Check the status of your application container.
docker ps -a | grep app_container_name
If the status is Restarting or Exited, you need to investigate the application logs.
docker logs app_container_name
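When the logs are empty or inconclusive, the container's exit code is another clue. The sketch below maps a few conventional exit codes (128 plus the signal number) to a likely cause; `explain_exit_code` is a hypothetical helper, and the interpretations are common conventions rather than guarantees.

```shell
# explain_exit_code is a hypothetical helper (not part of Docker).
# Maps a container exit code ($1) to a likely cause, using the usual
# convention that 128 + N means "terminated by signal N".
explain_exit_code() {
    case "$1" in
        0)   echo "exited cleanly" ;;
        137) echo "killed by SIGKILL (often the OOM killer)" ;;
        139) echo "segmentation fault (SIGSEGV)" ;;
        143) echo "terminated by SIGTERM" ;;
        *)   echo "application error (exit code $1)" ;;
    esac
}
```

The exit code itself can be read with `docker inspect -f '{{.State.ExitCode}}' app_container_name` and passed to the function as its only argument.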
4. 504 Gateway Timeout: Slow Responses
If you encounter a 504 Gateway Timeout, the network path is correct, and the port is correct, but the application is taking too long. This often happens on endpoints that perform heavy database queries, interact with slow third-party APIs, or during application cold starts.
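Traefik enforces timeouts when talking to backends, such as the dial timeout and the response header timeout on the servers transport. If your backend service exceeds a configured limit, Traefik gives up on the request and returns a 504 to the client.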
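Before changing any timeout, it is worth measuring how long the slow endpoint really takes. A minimal sketch, assuming curl is installed on the machine running the check; `probe_latency` is a hypothetical helper name.

```shell
# probe_latency is a hypothetical helper (not part of Traefik or curl).
# Prints "<http_status> <total_seconds>" for a single request.
#   $1: URL to request
#   $2: optional client-side time limit in seconds (default 120)
probe_latency() {
    curl -s -o /dev/null -w '%{http_code} %{time_total}\n' --max-time "${2:-120}" "$1"
}
```

Running it once against the public route through Traefik and once directly against the backend's own address shows whether the delay comes from the application itself. If the direct request is also slow, raising Traefik's timeouts only hides the problem.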
Step-by-Step Resolution Strategies
Now that we've identified the common causes, here are the concrete steps to resolve them.
Fix 1: Bridging the Network Gap
To resolve network isolation, you must ensure both Traefik and your backend service share a Docker network. The best practice is to create an external network specifically for web-facing traffic.
Create the network:
docker network create webproxy
Attach Traefik to the network (in its docker-compose.yml):
services:
  traefik:
    image: traefik:v2.10
    networks:
      - webproxy
networks:
  webproxy:
    external: true
Attach your application to the network:
services:
  myapp:
    image: myapp:latest
    networks:
      - webproxy
      - internal_db_network # App can also talk to its own DB
    labels:
      - "traefik.enable=true"
      - "traefik.docker.network=webproxy"
networks:
  webproxy:
    external: true
  internal_db_network:
Crucial Note: The label `traefik.docker.network=webproxy` tells Traefik explicitly which network to use for routing traffic to this container, which is vital when the container is attached to multiple networks.
Fix 2: Explicitly Defining the Load Balancer Port
Take the guesswork out of Traefik's routing by explicitly defining the backend port using labels. This is the most robust way to ensure traffic hits the right internal endpoint.
Add the following label to your application's docker-compose.yml:
services:
  myapp:
    # ... other config ...
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.myapp.rule=Host(`myapp.example.com`)"
      - "traefik.http.services.myapp.loadbalancer.server.port=8080" # Force Traefik to use port 8080
Replace 8080 with the actual port your application is listening on internally.
Fix 3: Adjusting Timeouts for 504 Errors
If your application legitimately requires more time to process requests (e.g., generating large reports), you can increase Traefik's timeout settings. In Traefik v2/v3, limits on the Traefik-to-backend connection are set through forwardingTimeouts on the servers transport, while limits on the client-to-Traefik connection are set through respondingTimeouts on the entrypoint.
You can configure the dialTimeout and responseHeaderTimeout on the default transport. In your static configuration (e.g., traefik.yml):
serversTransport:
  forwardingTimeouts:
    dialTimeout: 30s
    responseHeaderTimeout: 60s
Or via CLI arguments on the Traefik container:
- "--serversTransport.forwardingTimeouts.dialTimeout=30s"
- "--serversTransport.forwardingTimeouts.responseHeaderTimeout=60s"
Fix 4: Resolving Application Errors
If docker logs reveals that your application is crashing, you must fix the underlying application code or configuration. A reverse proxy cannot route to a dead service. Ensure that environment variables are correctly injected, database credentials are valid, and the container has sufficient CPU and memory resources to boot up successfully.
Advanced Troubleshooting: Connection Refused
A specific variant of the 502 error manifests in the Traefik logs as connect: connection refused. This explicitly means Traefik reached the container's IP address, but the container's operating system actively rejected the connection on the specified port.
This almost always means one of two things:
- The application is listening on `localhost` or `127.0.0.1` inside the container. In a containerized environment, `localhost` refers to the container itself. If the app binds to `127.0.0.1`, it will not accept connections from external sources, including Traefik. The application must bind to `0.0.0.0` to accept external traffic.
- The application hasn't finished starting. Traefik might be trying to send traffic before the application's HTTP server has actually bound to the port. Implementing Docker Healthchecks can mitigate this; Traefik won't route traffic until the container reports as healthy.
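The first case can be confirmed from the local address a listening socket reports, for instance in the Local Address column of `ss -lnt` run inside the container. The sketch below classifies such an address; `bind_scope` is a hypothetical helper name and the three labels are illustrative.

```shell
# bind_scope is a hypothetical helper (not part of ss or Docker).
# Classifies a "address:port" string ($1) as reported for a listening socket.
bind_scope() {
    case "$1" in
        127.*|"[::1]":*|localhost:*) echo "loopback-only" ;;       # unreachable from Traefik
        0.0.0.0:*|"[::]":*|"*":*)    echo "all-interfaces" ;;      # reachable from other containers
        *)                           echo "specific-interface" ;;  # reachable only via that address
    esac
}
```

A result of `loopback-only` for the application's port means the bind address needs to change to `0.0.0.0` before Traefik can connect, regardless of how the labels are configured.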
By systematically verifying network connectivity, explicitly defining routing ports, and ensuring application stability, you can effectively eliminate 502 Bad Gateway and 504 Gateway Timeout errors in your Traefik deployments.
# 1. Verify containers share a network
docker network inspect webproxy
# 2. Check Traefik logs for specific dial errors
docker logs traefik 2>&1 | grep "502 Bad Gateway"
# 3. Check application logs for crashes
docker logs <your_app_container>
# 4. Verify internal listening ports inside the app container
docker exec -it <your_app_container> netstat -tulpn
# Example docker-compose label fix:
# labels:
#   - "traefik.enable=true"
#   - "traefik.docker.network=webproxy"
#   - "traefik.http.services.my-service.loadbalancer.server.port=8080"
Error Medic Editorial