Why does my Azure API return 504 after exactly 30 seconds every time?

The 30-second cutoff is the default APIM backend timeout. When the backend has not responded within 30,000 ms, APIM closes the upstream connection and returns HTTP 504 with the message 'Backend service timeout'. The precision of the cutoff (always ~30s, never more) is the diagnostic tell. Fix by adding timeout="120" (or higher, max 240) to the <forward-request> policy element for that operation, or adopt an async pattern for operations that genuinely need longer than 4 minutes.

My Azure Function throws FunctionTimeoutException even after updating host.json to 00:10:00 — what am I missing?

On the Consumption plan, 10 minutes is the absolute ceiling; the runtime will not honor values above it in host.json. If your function is already set to 00:10:00 and still times out, confirm which plan it runs on (check the plan SKU with: az functionapp show --name <function-app-name> --resource-group <rg> --query 'sku'). If it is on Consumption and needs more than 10 minutes, move to a Premium or Dedicated plan, then raise functionTimeout above the 30-minute default for those plans, or set it to -1 for unbounded execution.

The Azure CLI hangs indefinitely waiting for a resource deployment — how do I bound that wait?

Add the --no-wait flag to return immediately and poll manually. For a bounded wait, use 'az resource wait --ids <resource-id> --created --timeout 300', which polls ARM and exits after 300 seconds if the resource has not provisioned. You can also poll the ARM operation URL directly with 'az rest --method GET --uri <operation-url>', taking the URL from the 202 response headers.

How do I tell if the timeout is client-side (SDK/HttpClient) or server-side (Azure returning 504)?

A server-side timeout returns a proper HTTP 504 or 408 response containing an x-ms-request-id header you can look up in Azure Monitor logs. A client-side timeout (e.g., HttpClient.Timeout in .NET, requests.get timeout= in Python) throws a TaskCanceledException, OperationCanceledException, or requests.exceptions.Timeout before any HTTP response is received — there will be no HTTP status code and no x-ms-request-id in your application logs. If your logs show an exception rather than a 504 status code, the problem is in your client timeout configuration.
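The distinction above can be written as a small triage helper. This is a minimal sketch, not part of any SDK: the function names `classify_timeout` and `correlation_id` are hypothetical, and the inputs are assumed to come from whatever your application already logs (status code if any, response headers, exception type name).

```python
from typing import Mapping, Optional

# Exception type names the answer above lists as client-side timeout signals.
CLIENT_TIMEOUT_EXCEPTIONS = {
    "TaskCanceledException",
    "OperationCanceledException",
    "Timeout",  # requests.exceptions.Timeout
}


def classify_timeout(status_code: Optional[int],
                     exception_name: Optional[str] = None) -> str:
    """Return 'client-side', 'server-side', or 'unknown' for one failed call."""
    if status_code is None:
        # No HTTP response arrived at all, so the client gave up first.
        if exception_name in CLIENT_TIMEOUT_EXCEPTIONS:
            return "client-side"
        return "unknown"
    if status_code in (408, 504):
        return "server-side"
    return "unknown"


def correlation_id(headers: Mapping[str, str]) -> Optional[str]:
    """Find x-ms-request-id case-insensitively so it can be looked up in logs."""
    for name, value in headers.items():
        if name.lower() == "x-ms-request-id":
            return value
    return None
```

A log entry with no status code and a `TaskCanceledException` classifies as client-side; a 504 with an `x-ms-request-id` classifies as server-side and gives you the ID to search for in Azure Monitor.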

Does increasing the APIM backend timeout cause performance or billing problems?

Increasing the timeout itself has no billing impact and does not change the APIM SLA for your tier. However, holding a backend connection open for longer ties up gateway worker threads. On high-traffic APIs with many concurrent long-running requests, this can exhaust the gateway connection pool, causing cascading timeouts for unrelated operations. Monitor the 'Capacity' metric in APIM (Azure Monitor). If it consistently exceeds 70%, the connection pool is under pressure and you should move long operations to an async pattern instead of simply raising the timeout.

Azure API Timeout: Fix 504 Gateway Timeout and RequestTimeout Errors in Azure API Management, Functions, and ARM

Diagnose and fix Azure API timeout errors (504, 408, RequestTimeout) across API Management, Functions, and ARM. Includes policy fixes, host.json config, and CLI

Last updated: February 23, 2026

Last verified: February 23, 2026

Key Takeaways

Azure API Management enforces a default 30-second backend timeout; any backend response that takes longer returns HTTP 504 with body '{"statusCode":504,"message":"Backend service timeout"}'. Override it with the timeout attribute on <forward-request> (max 240s) or switch to an async LRO polling pattern
Azure Functions on the Consumption plan caps at 10 minutes regardless of host.json settings; operations that cannot complete within that window must move to Premium/Dedicated plan or be redesigned with Durable Functions
Network-layer timeouts (Azure Load Balancer 4-minute idle TCP, Application Gateway 20-second default) are independent of APIM and require separate configuration changes via the Azure portal or CLI
Quick fix for APIM: add timeout="120" to <forward-request> in the backend policy section; quick fix for Functions: set functionTimeout in host.json; quick fix for App Gateway: az network application-gateway http-settings update --timeout 120

Azure API Timeout Fix Approaches Compared
Method	When to Use	Time to Implement	Risk
Increase APIM backend timeout policy	Backend genuinely needs >30s and you control APIM policy	5 minutes	Low — scoped to specific operation or API
Async LRO polling pattern in APIM	Operations >240s; must offload to background job	1–4 hours	Medium — requires client-side polling logic changes
Optimize backend response time	Backend slow due to cold starts, N+1 queries, or heavy compute	Hours to days	Low — improves reliability across all consumers
Application Gateway request timeout tuning	504/502 occurs at the WAF/AGW layer before traffic reaches APIM	10 minutes	Low — change requestTimeout in backend HTTP settings
Azure Function timeout + plan upgrade	Function hitting the 5-min Consumption default or the 10-min hard ceiling	15–30 minutes	Low for config; Medium for plan migration (cost increase)
APIM retry policy with exponential backoff	Transient timeouts from cold starts, brief backend spikes	30 minutes	Low — defensive pattern, does not mask persistent issues

Understanding Azure API Timeout Errors

Azure API timeouts surface at multiple layers of the request pipeline. Correct diagnosis requires identifying where the clock ran out before changing any configuration. The exact error messages differ by layer:

Azure API Management: HTTP 504 Gateway Timeout, body: {"statusCode": 504, "message": "Backend service timeout"}
Azure API Management (forward-request): Microsoft.Azure.ApiManagement.Gateway.BackendRequestException: The request to the backend service timed out.
Azure Functions (Consumption plan): Microsoft.Azure.WebJobs.Host.FunctionTimeoutException: Timeout value of 00:05:00 exceeded by function 'FunctionName'
ARM REST API: {"code": "OperationTimeout", "message": "The operation 'Microsoft.Compute/virtualMachines/write' timed out"} — or a 202 Accepted that never transitions to Succeeded
Azure Load Balancer: Connection dropped after 4 minutes of idle (the default idle timeout). No HTTP error code is returned, and a TCP RST is sent only when TCP reset is enabled on the rule
Application Gateway / WAF: HTTP 502 Bad Gateway or HTTP 504 Gateway Timeout with x-appgw-trace-id header present
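Because each layer enforces its own limit, the shortest limit on the request path decides where the clock runs out. The sketch below only illustrates that reasoning; the function name and the dictionary keys are hypothetical, and the numbers are the defaults quoted in this guide, which you should replace with your own configured values.

```python
from typing import Dict, Tuple


def first_layer_to_time_out(limits_s: Dict[str, float]) -> Tuple[str, float]:
    """Return the layer with the smallest request timeout, and that timeout."""
    if not limits_s:
        raise ValueError("at least one layer is required")
    layer = min(limits_s, key=limits_s.get)
    return layer, limits_s[layer]


# Defaults quoted in this guide (assumed unchanged in this example).
defaults = {
    "application_gateway": 20,   # backend request timeout
    "apim_backend": 30,          # APIM backend timeout
    "function_consumption": 300, # 5-minute Consumption default
}
print(first_layer_to_time_out(defaults))
```

With these values the Application Gateway cuts the request off first, so raising only the APIM timeout would change nothing for traffic that passes through the gateway.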

Step 1: Identify Which Layer Is Timing Out

Always capture full HTTP response headers before changing configuration.

1a. Inspect the Via and x-ms-request-id headers.

If the response contains Via: 1.1 apim-gateway-xxxxx (Azure API Management), the request reached APIM and the timeout occurred between APIM and your backend. If no Via header is present and you see a 504, the timeout happened upstream (Application Gateway or Load Balancer).
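The header check can be scripted against captured response headers. This is a rough sketch that encodes only the rules stated in this guide (a Via header means the request reached APIM; an x-appgw-trace-id header means Application Gateway answered); the function name and the returned labels are hypothetical.

```python
from typing import Mapping


def locate_504(headers: Mapping[str, str]) -> str:
    """Guess which hop produced a 504 from the response headers alone."""
    names = {name.lower() for name in headers}
    if "via" in names:
        # The request reached APIM; the timeout is between APIM and the backend.
        return "apim-to-backend"
    if "x-appgw-trace-id" in names:
        # Application Gateway generated the response itself.
        return "application-gateway"
    # No APIM marker at all: something in front of APIM gave up first.
    return "upstream-of-apim"
```

Treat the result as a starting point for the log queries in the next step rather than a verdict, since headers can be stripped or rewritten by intermediaries.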

1b. Query APIM gateway logs in Log Analytics.

ApiManagementGatewayLogs
| where TimeGenerated > ago(1h)
| where ResponseCode == 504 or DurationMs > 28000
| project TimeGenerated, OperationName, DurationMs, BackendResponseCode, LastErrorReason, Url
| order by TimeGenerated desc
| take 20

A DurationMs clustering near 30000 confirms the default 30-second APIM backend timeout is the culprit. A DurationMs near 20000 with no backend response code points to Application Gateway.
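If you export the DurationMs column for the failing requests, the clustering check can be automated. A minimal sketch follows; the function name, the 1.5-second tolerance, and the 50% share threshold are all assumptions chosen for illustration, not values from Azure documentation.

```python
from typing import Iterable


def suspect_layer(durations_ms: Iterable[float],
                  tolerance_ms: float = 1500,
                  min_share: float = 0.5) -> str:
    """Map a set of failed-request durations to the layer they point at."""
    samples = list(durations_ms)
    if not samples:
        return "inconclusive"

    def share_near(target_ms: float) -> float:
        # Fraction of samples that fall within the tolerance of the target.
        hits = sum(1 for d in samples if abs(d - target_ms) <= tolerance_ms)
        return hits / len(samples)

    if share_near(30000) >= min_share:
        return "apim-backend-timeout"
    if share_near(20000) >= min_share:
        return "application-gateway"
    return "inconclusive"
```

Feed it only the rows with ResponseCode 504 so that fast, successful requests do not dilute the cluster.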

1c. Check Application Insights for end-to-end traces.

requests
| where timestamp > ago(1h)
| where resultCode == "504" or duration > 25000
| project timestamp, name, url, duration, resultCode, cloud_RoleName
| order by timestamp desc

Step 2: Fix Azure API Management Timeout

The default backend timeout is 30 seconds and applies to all operations unless overridden. Fix at the operation, API, or global scope using the timeout attribute on <forward-request> in the backend section. <set-backend-service> only selects the backend and has no timeout attribute.

Option A — Increase timeout (synchronous operations up to 240s):

<policies>
  <inbound>
    <base />
    <set-backend-service base-url="https://your-backend.example.com" />
  </inbound>
  <backend>
    <!-- Replaces the inherited <base /> forward-request; timeout is in seconds -->
    <forward-request timeout="120" />
  </backend>
  <outbound><base /></outbound>
  <on-error><base /></on-error>
</policies>

The timeout attribute is in seconds (integer). Values above 240 seconds may not be honored, so treat 240 as the practical maximum. If you need longer, you must use an async pattern.
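The choice between keeping the default, raising the timeout, and going async follows directly from the two thresholds above. The helper below is a sketch of that decision only; the function name, the returned labels, and the 1.5x headroom factor are assumptions for illustration.

```python
import math
from typing import Optional, Tuple

DEFAULT_TIMEOUT_S = 30    # default backend timeout quoted in this guide
MAX_SYNC_TIMEOUT_S = 240  # upper limit for a synchronous call


def choose_fix(expected_backend_s: float,
               headroom: float = 1.5) -> Tuple[str, Optional[int]]:
    """Pick a strategy and, where it applies, a timeout value in seconds."""
    if expected_backend_s > MAX_SYNC_TIMEOUT_S:
        # No synchronous setting can cover this; offload to a background job.
        return "async-pattern", None
    wanted = math.ceil(expected_backend_s * headroom)
    if wanted <= DEFAULT_TIMEOUT_S:
        return "keep-default", None
    return "raise-timeout", min(wanted, MAX_SYNC_TIMEOUT_S)
```

For a backend that normally answers in 80 seconds this suggests a 120-second timeout; for one that needs 300 seconds it points to the async pattern in Option B.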

Option B — Async LRO pattern for operations over 4 minutes:

Kick off work asynchronously and return a 202 with a polling URL:

<inbound>
  <base />
  <send-request mode="new" response-variable-name="jobResponse" timeout="10">
    <set-url>https://your-backend.example.com/api/jobs</set-url>
    <set-method>POST</set-method>
    <set-body>@(context.Request.Body.As<string>(preserveContent: true))</set-body>
  </send-request>
  <return-response>
    <set-status code="202" reason="Accepted" />
    <set-header name="Location" exists-action="override">
      <value>@(((IResponse)context.Variables["jobResponse"]).Headers.GetValueOrDefault("Location", ""))</value>
    </set-header>
    <set-header name="Retry-After" exists-action="override">
      <value>5</value>
    </set-header>
  </return-response>
</inbound>

Clients poll the Location URL every Retry-After seconds until the status transitions to Succeeded or Failed.
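On the client side, the polling loop needs an overall deadline so it cannot spin forever. The sketch below shows one way to structure it. `fetch_status` is a hypothetical callable you supply: it should GET the Location URL and return the job status together with the Retry-After value in seconds. The status names are the ones used in this guide, and the shape of your backend's status response is an assumption.

```python
import time
from typing import Callable, Tuple

TERMINAL_STATES = {"Succeeded", "Failed"}


def poll_until_done(fetch_status: Callable[[], Tuple[str, float]],
                    max_wait_s: float = 600,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the job reaches a terminal state or the deadline passes."""
    deadline = clock() + max_wait_s
    while True:
        status, retry_after_s = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() + retry_after_s > deadline:
            raise TimeoutError(f"job still '{status}' after {max_wait_s}s")
        # Wait exactly as long as the server asked before the next poll.
        sleep(retry_after_s)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to unit test without real waiting; in production the defaults are used.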

Step 3: Fix Azure Function Timeout

Timeout limits are hard-enforced by plan tier:

Plan	Default Timeout	Maximum Timeout
Consumption	5 minutes	10 minutes
Flex Consumption	30 minutes	Unlimited
Premium	30 minutes	Unlimited
Dedicated (App Service)	30 minutes	Unlimited

Update host.json to extend timeout up to the plan ceiling:

{
  "version": "2.0",
  "functionTimeout": "00:10:00"
}
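Before deploying, you can check a functionTimeout value against the plan table above. This is a minimal sketch with hypothetical names; it only understands the hh:mm:ss form shown in the host.json example and encodes the maximums from the table, where None stands for "Unlimited".

```python
from datetime import timedelta

# Maximum timeouts from the plan table; None means no upper limit.
PLAN_MAX = {
    "Consumption": timedelta(minutes=10),
    "Flex Consumption": None,
    "Premium": None,
    "Dedicated": None,
}


def parse_timespan(value: str) -> timedelta:
    """Parse an hh:mm:ss string such as '00:10:00'."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def fits_plan(function_timeout: str, plan: str) -> bool:
    """True if the timeout does not exceed the plan's maximum."""
    ceiling = PLAN_MAX[plan]
    return ceiling is None or parse_timespan(function_timeout) <= ceiling
```

`fits_plan("00:15:00", "Consumption")` is False, which is the case where editing host.json alone will not help and a plan change is required.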

If your function requires more than 10 minutes, upgrade the plan:

# Move the app to an existing Premium plan (for example EP1)
az functionapp update \
  --name <function-app-name> \
  --resource-group <rg> \
  --plan <premium-plan-name>

For complex multi-step workflows, refactor to Durable Functions using the fan-out/fan-in pattern. The orchestrator checkpoints state between activities, so the workflow as a whole is no longer bound by a single function's timeout. Each individual activity must still finish within the plan limit.

Step 4: Fix Application Gateway Timeout

Application Gateway WAF v2 defaults to 20 seconds for the backend request timeout. Standard v2 defaults to 30 seconds. Update backend HTTP settings:

az network application-gateway http-settings update \
  --gateway-name <agw-name> \
  --resource-group <rg> \
  --name <backend-http-settings-name> \
  --timeout 120

# Verify the change
az network application-gateway http-settings show \
  --gateway-name <agw-name> \
  --resource-group <rg> \
  --name <backend-http-settings-name> \
  --query requestTimeout

Step 5: Handle ARM API Long-Running Operations

ARM returns HTTP 202 Accepted for operations that run longer than a few seconds. The response includes a polling URL:

HTTP/1.1 202 Accepted
Azure-AsyncOperation: https://management.azure.com/subscriptions/.../operations/abc123?api-version=2024-01-01
Retry-After: 30

Your client must poll Azure-AsyncOperation until status is Succeeded or Failed. Azure SDKs handle this automatically. For raw HTTP clients:

import time

import requests


def poll_arm_lro(token, operation_url, retry_after=30, max_polls=40):
    """Poll an ARM Azure-AsyncOperation URL until it reaches a terminal state."""
    headers = {"Authorization": f"Bearer {token}"}
    for _ in range(max_polls):
        resp = requests.get(operation_url, headers=headers, timeout=10)
        resp.raise_for_status()
        body = resp.json()
        status = body.get("status", "")
        if status == "Succeeded":
            return body
        if status in ("Failed", "Canceled"):
            err = body.get("error", {})
            raise RuntimeError(f"ARM LRO failed [{err.get('code')}]: {err.get('message')}")
        # Prefer the server's Retry-After hint; fall back to the caller's default
        delay = resp.headers.get("Retry-After", "")
        time.sleep(int(delay) if delay.isdigit() else retry_after)
    raise TimeoutError(f"ARM operation did not complete after {max_polls} polls")

Step 6: Add a Retry Policy as a Safety Net

Transient 504 errors from backend cold starts or brief spikes should be retried before surfacing to callers. Add this APIM retry policy around the <forward-request> element:

<backend>
  <retry condition="@(context.Response.StatusCode == 504 || context.Response.StatusCode == 502)"
         count="3"
         interval="2"
         max-interval="10"
         delta="2"
         first-fast-retry="false">
    <forward-request timeout="60" />
  </retry>
</backend>

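This retries up to 3 times with exponential backoff between attempts, capped at 10 seconds by max-interval. Each attempt can itself run for up to 60 seconds, so a persistently slow backend still surfaces as a 504 rather than being masked.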
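Retries multiply the time a caller may wait, so it is worth computing the upper bound for the policy above. The sketch assumes `count` means retries in addition to the first attempt and that no wait exceeds `max-interval`; the function name is hypothetical.

```python
def worst_case_seconds(count: int,
                       attempt_timeout_s: int,
                       max_interval_s: int) -> int:
    """Upper bound on total time spent in the retry block."""
    attempts = count + 1  # the first try plus `count` retries
    return attempts * attempt_timeout_s + count * max_interval_s


# Values from the policy above: count=3, timeout=60, max-interval=10.
print(worst_case_seconds(3, 60, 10))  # 270
```

A caller whose own timeout is shorter than this bound will give up before the final retry finishes, so size client timeouts and retry settings together.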

Verification

After applying fixes, run a targeted latency test:

curl -v -w "\nDNS: %{time_namelookup}s | Connect: %{time_connect}s | TTFB: %{time_starttransfer}s | Total: %{time_total}s\n" \
  --max-time 150 \
  -H "Ocp-Apim-Subscription-Key: <your-key>" \
  https://<your-apim>.azure-api.net/<api-path>

Then confirm in Application Insights that p95 latency is below your new timeout threshold:

requests
| where timestamp > ago(30min)
| where url contains "<api-path>"
| summarize p50=percentile(duration,50), p95=percentile(duration,95), p99=percentile(duration,99)
  by bin(timestamp, 5m)
| render timechart
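The same check can be run offline on exported durations. This is a sketch using a plain nearest-rank percentile, which can differ slightly from the value the query's percentile() reports; the function names and the 0.8 safety margin are assumptions.

```python
from typing import Sequence


def nearest_rank_percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    # ceil(pct * n / 100) computed with integer math to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(durations_ms: Sequence[float],
                  timeout_s: float,
                  margin: float = 0.8) -> bool:
    """True if p95 latency stays under `margin` of the configured timeout."""
    p95 = nearest_rank_percentile(durations_ms, 95)
    return p95 <= timeout_s * 1000 * margin
```

If `within_budget` returns False for your new timeout, the tail is still too close to the limit and occasional 504s should be expected.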

Frequently Asked Questions

#!/usr/bin/env bash
# Azure API Timeout Diagnostic Script
# Usage: ./diagnose-api-timeout.sh <resource-group> [<apim-name>] [<function-app-name>] [<agw-name>]

RG="${1:?Usage: $0 <resource-group> [apim-name] [function-app-name] [agw-name]}"
APIM_NAME="${2:-}"
FUNC_NAME="${3:-}"
AGW_NAME="${4:-}"

echo "================================================================"
echo " Azure API Timeout Diagnostics — RG: $RG"
echo " $(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo "================================================================"

# --- 1. APIM Gateway Logs (requires Log Analytics workspace) ---
if [ -n "$APIM_NAME" ]; then
  echo ""
  echo "[1] APIM Gateway Logs — last 1h, DurationMs > 25000 or HTTP 504"
  WORKSPACE_ID=$(az monitor log-analytics workspace list \
    -g "$RG" --query '[0].customerId' -o tsv 2>/dev/null)
  if [ -n "$WORKSPACE_ID" ]; then
    az monitor log-analytics query \
      --workspace "$WORKSPACE_ID" \
      --analytics-query "
ApiManagementGatewayLogs
| where TimeGenerated > ago(1h)
| where ResponseCode == 504 or DurationMs > 25000
| project TimeGenerated, OperationName, DurationMs, BackendResponseCode, LastErrorReason
| order by TimeGenerated desc
| take 20" \
      --timespan "PT1H" -o table 2>/dev/null || echo "[WARN] Query failed — check workspace permissions"
  else
    echo "[WARN] No Log Analytics workspace found in RG $RG"
  fi

  echo ""
  echo "[2] APIM APIs in service $APIM_NAME"
  az apim api list --resource-group "$RG" --service-name "$APIM_NAME" \
    --query '[].{API:name, Path:path, Protocols:protocols}' -o table 2>/dev/null \
    || echo "[WARN] APIM $APIM_NAME not found or no access"
fi

# --- 2. Azure Function timeout config ---
if [ -n "$FUNC_NAME" ]; then
  echo ""
  echo "[3] Function App Plan and Timeout — $FUNC_NAME"
  az functionapp show \
    --name "$FUNC_NAME" --resource-group "$RG" \
    --query '{Name:name, Kind:kind, State:state, Plan:appServicePlanId}' \
    -o json 2>/dev/null || echo "[WARN] Function app $FUNC_NAME not found"

  echo ""
  echo "[4] Function App Settings (FUNCTIONS_WORKER_RUNTIME, timeout override)"
  # AzureWebJobsStorage is left out on purpose: its value is a connection string containing a secret
  az functionapp config appsettings list \
    --name "$FUNC_NAME" --resource-group "$RG" \
    --query "[?name=='FUNCTIONS_WORKER_RUNTIME' || name=='AzureFunctionsJobHost__functionTimeout' || name=='WEBSITE_RUN_FROM_PACKAGE'].{Key:name,Value:value}" \
    -o table 2>/dev/null
fi

# --- 3. Application Gateway timeout ---
if [ -n "$AGW_NAME" ]; then
  echo ""
  echo "[5] Application Gateway Backend HTTP Settings — $AGW_NAME"
  az network application-gateway http-settings list \
    --gateway-name "$AGW_NAME" --resource-group "$RG" \
    --query '[].{Name:name, TimeoutSec:requestTimeout, Port:port, Protocol:protocol}' \
    -o table 2>/dev/null || echo "[WARN] AGW $AGW_NAME not found"
else
  echo ""
  echo "[5] Discovering Application Gateways in RG $RG"
  az network application-gateway list -g "$RG" \
    --query '[].{Name:name, State:operationalState, SKU:sku.name}' \
    -o table 2>/dev/null
fi

# --- 4. Recent ARM operation failures ---
echo ""
echo "[6] Recent ARM Failed Operations — last 1h"
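# GNU date first; fall back to BSD/macOS date syntax
START=$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ 2>/dev/null || date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)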
az monitor activity-log list \
  --resource-group "$RG" \
  --start-time "$START" \
  --status Failed \
  --query '[].{Time:eventTimestamp, Operation:operationName.value, Status:status.value}' \
  -o table 2>/dev/null | head -25

# --- 5. Live latency probe ---
if [ -n "$APIM_NAME" ]; then
  echo ""
  echo "[7] Live Latency Probe — APIM gateway health endpoint"
  GW_URL=$(az apim show -g "$RG" -n "$APIM_NAME" --query 'gatewayUrl' -o tsv 2>/dev/null)
  if [ -n "$GW_URL" ]; then
    echo "Endpoint: $GW_URL"
    curl -s -o /dev/null \
      -w "HTTP %{http_code} | DNS %{time_namelookup}s | TCP %{time_connect}s | TTFB %{time_starttransfer}s | Total %{time_total}s\n" \
      --max-time 35 "${GW_URL}/status-0123456789abcdef" 2>/dev/null \
      || echo "[INFO] Probe endpoint not reachable — test a real API operation path"
  fi
fi

echo ""
echo "Diagnostics complete."
echo "Next steps: review DurationMs values and plan/tier limits above."
echo "Docs: https://aka.ms/apim-timeout-policy | https://aka.ms/functions-timeout"

Error Medic Editorial

The Error Medic Editorial team consists of senior DevOps, SRE, and cloud engineers with hands-on experience running large-scale Azure, AWS, and GCP production workloads. Every guide is based on real incident postmortems, official vendor documentation, and community-verified solutions — no filler, no fluff.
