Error Medic

Azure API Timeout: 'The operation timed out' — Root Causes and Fixes

Fix Azure API timeouts caused by misconfigured APIM policies, backend latency, or connection limits. Step-by-step diagnostics and policy fixes included.

Key Takeaways
  • Azure API Management (APIM) enforces a default 300-second forward-request timeout on backend calls; when a backend exceeds the configured timeout, APIM returns a 504 Gateway Timeout to the client.
  • Azure Function and App Service backends can hit host-level idle timeouts (230 seconds for App Service, configurable for Functions) that differ from APIM's policy timeout, causing cascading 502/504 chains.
  • Client SDK defaults vary: the Azure SDK for .NET inherits HttpClient's 100-second default timeout, and the Python and Java SDKs have their own transport defaults; mismatches between client, gateway, and backend timeouts produce misleading error attribution.
  • Quick fix: raise the APIM `<forward-request timeout='...' />` policy value, increase the backend host timeout in App Service settings, and align client transport timeouts — then add retry and circuit-breaker policies to prevent retry storms.
Fix Approaches Compared
Method | When to Use | Time to Apply | Risk
Raise APIM forward-request timeout in policy XML | APIM gateway times out before the backend finishes; 504 returned to client | < 5 min | Low — policy scoped to operation or product
Increase App Service request timeout (WEBSITE_FCNL_IDLE_TIMEOUT / applicationInitialization) | Backend App Service or Azure Function hits its 230 s host timeout before the APIM limit | 5–10 min, requires app restart | Medium — increases worker saturation risk under load
Implement retry + circuit breaker in APIM policy | Transient network errors cause intermittent timeouts at scale | 15–30 min | Low — retry policy must include an idempotency guard
Switch long-running operations to async (202 Accepted + polling) | Backend operation legitimately exceeds the gateway ceiling (e.g., ML inference, bulk import) | Hours — requires API redesign | Low long-term; high short-term migration effort
Enable Azure API Management backend health probes | Load-balanced backend pool has unhealthy instances causing latency spikes | 10 min | Low — read-only probe traffic
Profile and optimize backend query / function cold start | Root cause is a slow database query or Function cold start | Hours to days | Low — performance improvement with no contract change

Understanding Azure API Timeouts

Azure API timeouts manifest across multiple layers — client SDK, Azure API Management (APIM), App Service / Azure Functions host, and downstream dependencies. Each layer enforces its own deadline, and the shortest deadline wins. The resulting error message varies by layer:

  • APIM Gateway → Client: HTTP 504 Gateway Timeout with body { "statusCode": 504, "message": "The operation timed out." }
  • App Service → APIM: HTTP 502 Bad Gateway with body Upstream connect error or disconnect/reset before headers. Reset reason: connection timeout
  • Azure SDK → Client Code: TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 100 seconds elapsing. (.NET) or azure.core.exceptions.ServiceRequestError: Connection timeout (Python)
  • Azure Functions Durable / long poll: System.TimeoutException: The activity function 'ProcessBatch' timed out after 00:10:00.

Understanding which layer threw the error is step zero.
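Because the shortest deadline wins, it helps to compute the binding deadline explicitly before tuning any single layer. A minimal sketch (the layer names and default values below are only the ones discussed in this guide, not read from any API):

```python
def binding_layer(deadlines):
    """Return (layer, seconds) for the shortest deadline — the one that fires first."""
    layer = min(deadlines, key=deadlines.get)
    return layer, deadlines[layer]

# Typical defaults discussed in this guide, in seconds.
deadlines = {
    "client HttpClient (.NET default)": 100,
    "APIM forward-request (policy default)": 300,
    "App Service front end (fixed)": 230,
}

layer, seconds = binding_layer(deadlines)
print(f"First timeout to fire: {layer} at {seconds}s")
```

With these defaults the client gives up first, so raising the gateway timeout alone changes nothing for that caller.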


Step 1: Identify the Failing Layer

Open Azure Monitor → Logs for your APIM instance and query the GatewayLogs category, comparing backendTime_d against totalTime_d:

AzureDiagnostics
| where ResourceType == "APIMANAGEMENT/SERVICE"
| where Category == "GatewayLogs"
| where responseCode_d == 504 or responseCode_d == 502
| project TimeGenerated, operationId_s, backendTime_d, totalTime_d, lastError_message_s
| order by TimeGenerated desc

If backendTime_d is close to the configured forward-request timeout (300 seconds by default; capped at 240 seconds on the Consumption tier), the timeout is occurring at the APIM-to-backend boundary. If backendTime_d is low but totalTime_d is high, the timeout is APIM-internal or client-side.
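This triage rule can be captured in a small helper for bulk log analysis (a sketch; the function name and thresholds are my own, matching the figures in this guide):

```python
def classify_timeout(backend_time, total_time, forward_timeout=300, slack=5):
    """Attribute a 502/504 from GatewayLogs timings (all values in seconds).

    backendTime_d near the forward-request timeout -> APIM-to-backend boundary;
    backendTime_d low while totalTime_d is high    -> APIM-internal or client side.
    """
    if backend_time >= forward_timeout - slack:
        return "apim-to-backend"
    if total_time >= 2 * max(backend_time, 1):
        return "apim-internal-or-client"
    return "inconclusive"

print(classify_timeout(backend_time=298, total_time=301))  # apim-to-backend
print(classify_timeout(backend_time=4, total_time=120))    # apim-internal-or-client
```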

For App Service backends, check the Diagnose and Solve Problems blade → HTTP 4xx/5xx errors → drill into the Failed Requests log. A TimeTaken of roughly 230 seconds (230,000 ms) on an IIS entry indicates the App Service host killed the request at its own 230-second hard limit.

For Azure Functions, check Application Insights:

requests
| where success == false
| where duration > 30000
| project timestamp, name, duration, resultCode, operation_Id
| order by timestamp desc

Step 2: Fix APIM Forward-Request Timeout

Navigate to Azure Portal → API Management → APIs → [Your API] → [Operation] → Inbound processing → Policy editor. Locate or add the <forward-request> element:

<policies>
  <inbound>
    <base />
  </inbound>
  <backend>
    <!-- Explicitly set a 120-second timeout for this operation (default: 300 s) -->
    <forward-request timeout="120" follow-redirects="true" />
  </backend>
  <outbound>
    <base />
  </outbound>
  <on-error>
    <base />
  </on-error>
</policies>

The timeout attribute is in seconds. The maximum value on the Consumption tier is 240 seconds, and 240 seconds is the practical ceiling on the Developer, Basic, Standard, and Premium tiers as well. For operations that will always exceed this, you must redesign to an async pattern (see Step 6).

To apply the timeout at the API level rather than per operation, set the policy in the API's All operations scope. To apply it to every API in a product, set it under API Management → Products → [Product] → Policies; the global scope is All APIs under the APIs blade.


Step 3: Fix App Service / Azure Functions Host Timeout

For App Service (Windows/Linux), add the following Application Setting:

Setting | Value | Effect
WEBSITE_FCNL_IDLE_TIMEOUT | 600 | Extends the idle socket timeout to 600 s
SCM_DO_BUILD_DURING_DEPLOYMENT | true | Reduces cold-start latency

For Azure Functions, edit host.json:

{
  "version": "2.0",
  "functionTimeout": "00:10:00",
  "extensions": {
    "http": {
      "routePrefix": "api",
      "maxOutstandingRequests": 200,
      "maxConcurrentRequests": 100
    }
  }
}

Note: On the Consumption plan, the functionTimeout maximum is 10 minutes. On Premium and Dedicated plans it can be made unbounded (set to -1), but HTTP-triggered functions remain subject to the App Service front end's 230-second request timeout. For long-running workloads on Consumption, migrate to Durable Functions with the async HTTP polling pattern.


Step 4: Add Retry and Circuit Breaker Policies in APIM

To handle transient timeouts gracefully without propagating 504s to clients:

<backend>
  <retry condition="@(context.Response == null || context.Response.StatusCode == 504 || context.Response.StatusCode == 502)" count="2" interval="2" first-fast-retry="false">
    <forward-request timeout="60" />
  </retry>
</backend>

Pair this with a circuit breaker (available in APIM v2 tiers) to stop hammering an unhealthy backend. Circuit breaker rules are defined on the backend entity (via ARM, Bicep, or the REST API) rather than in policy XML; a representative rule, with illustrative names and thresholds, looks like:

{
  "properties": {
    "url": "https://your-backend.azurewebsites.net",
    "protocol": "http",
    "circuitBreaker": {
      "rules": [
        {
          "name": "backendCircuitBreaker",
          "failureCondition": {
            "count": 5,
            "interval": "PT1M",
            "statusCodeRanges": [ { "min": 500, "max": 599 } ]
          },
          "tripDuration": "PT30S",
          "acceptRetryAfter": true
        }
      ]
    }
  }
}

Route the operation to this backend with <set-backend-service backend-id="your-backend" /> in the inbound section; while the breaker is tripped, APIM fails fast with 503 instead of forwarding requests.

Warning: Only retry on idempotent operations (GET, HEAD, OPTIONS, PUT with a full replacement body). Never auto-retry POST or PATCH without verifying the backend is idempotent.
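The same guard belongs in client-side retry logic. A sketch of a wrapper that refuses to replay non-idempotent verbs (the wrapper and its names are invented for illustration; the idempotent set mirrors the warning above):

```python
import time

IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT"}

def call_with_retry(send, method, retries=2, backoff=2.0):
    """Call send() and retry 502/504 responses, but only for idempotent methods."""
    attempts = 1 + (retries if method.upper() in IDEMPOTENT_METHODS else 0)
    status = None
    for attempt in range(attempts):
        status = send()  # send() returns an HTTP status code
        if status not in (502, 504):
            return status
        if attempt < attempts - 1:
            time.sleep(backoff * (attempt + 1))  # linear backoff between tries
    return status

# A GET that times out once, then succeeds, is retried...
responses = iter([504, 200])
print(call_with_retry(lambda: next(responses), "GET", backoff=0))   # 200
# ...while a POST with the same behavior is surfaced immediately.
responses = iter([504, 200])
print(call_with_retry(lambda: next(responses), "POST", backoff=0))  # 504
```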


Step 5: Align Client SDK Timeouts

If your client uses the Azure SDK, set HttpClientTransport timeout options to exceed the gateway timeout so APIM — not the client — controls the deadline:

C# / .NET:

var options = new ApiManagementClientOptions
{
    Retry = { MaxRetries = 3, Delay = TimeSpan.FromSeconds(2), Mode = RetryMode.Exponential },
    Transport = new HttpClientTransport(new HttpClient { Timeout = TimeSpan.FromSeconds(180) })
};

Python:

from azure.core.pipeline.transport import RequestsTransport
client = ApiManagementClient(
    credential=credential,
    subscription_id=subscription_id,
    transport=RequestsTransport(connection_timeout=10, read_timeout=180)
)

Step 6: Long-Running Operations — Async Pattern

For operations that legitimately exceed 240 seconds, implement the 202 Accepted + Location polling pattern:

  1. Client calls POST /api/jobs
  2. Backend immediately returns 202 Accepted with Location: /api/jobs/{jobId}/status
  3. Client polls GET /api/jobs/{jobId}/status until { "status": "Completed" }

In APIM, use the send-request policy to proxy status checks. Because each request now completes quickly, no gateway or host timeout applies to the long-running work itself.
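On the client side, the polling half of the pattern is a short loop. A sketch assuming the hypothetical endpoints from the steps above (the helper name and status vocabulary are illustrative):

```python
import time

def poll_until_done(get_status, interval=2.0, max_polls=150):
    """Poll a status callable until the job reports a terminal state.

    get_status() stands in for GET /api/jobs/{jobId}/status and returns a
    dict such as {"status": "Running"}. Each poll is a short request that
    finishes well inside every gateway and host timeout.
    """
    for _ in range(max_polls):
        body = get_status()
        if body.get("status") in ("Completed", "Failed"):
            return body
        time.sleep(interval)
    raise TimeoutError("job did not reach a terminal state within the polling budget")

# Simulated backend: two in-progress responses, then completion.
states = iter([{"status": "Running"}, {"status": "Running"}, {"status": "Completed"}])
print(poll_until_done(lambda: next(states), interval=0))
```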

One-Shot Diagnostics Script

The following bash script automates the checks from the steps above:
#!/usr/bin/env bash
# Azure API Timeout Diagnostics Script
# Prerequisites: az CLI logged in, jq installed
# Usage: APIM_NAME=my-apim RG=my-rg ./diagnose-apim-timeout.sh

set -euo pipefail

APIM_NAME="${APIM_NAME:-my-apim-instance}"
RG="${RG:-my-resource-group}"
API_ID="${API_ID:-my-api}"

echo "=== 1. Check APIM SKU and tier-level timeout limits ==="
az apim show --name "$APIM_NAME" --resource-group "$RG" \
  --query '{sku: sku.name, tier: sku.tier, gatewayRegionalUrl: gatewayRegionalUrl}' \
  --output table

echo ""
echo "=== 2. Retrieve all-operations policy XML for the target API ==="
az apim api policy show \
  --resource-group "$RG" \
  --service-name "$APIM_NAME" \
  --api-id "$API_ID" \
  --output json | jq -r '.value' \
  | grep -E 'forward-request|timeout' || echo '[INFO] No explicit forward-request timeout found — default applies'

echo ""
echo "=== 3. Fetch recent 504/502 gateway errors from diagnostics logs ==="
# Requires Log Analytics workspace linked to APIM
WORKSPACE_ID=$(az apim show \
  --name "$APIM_NAME" \
  --resource-group "$RG" \
  --query 'id' -o tsv | xargs -I{} az monitor diagnostic-settings list \
  --resource {} --query '[0].workspaceId' -o tsv 2>/dev/null || echo "")

if [ -n "$WORKSPACE_ID" ]; then
  az monitor log-analytics query \
    --workspace "$WORKSPACE_ID" \
    --analytics-query "AzureDiagnostics | where Category == 'GatewayLogs' | where responseCode_d in (502, 504) | project TimeGenerated, operationId_s, backendTime_d, totalTime_d, lastError_message_s | order by TimeGenerated desc | take 20" \
    --output table
else
  echo '[WARN] No Log Analytics workspace found. Enable diagnostics: az apim update --name $APIM_NAME --resource-group $RG --set diagnosticSettings.logs[0].enabled=true'
fi

echo ""
echo "=== 4. Test backend latency directly (bypass APIM) ==="
BACKEND_URL="${BACKEND_URL:-https://your-backend.azurewebsites.net/health}"
echo "Testing $BACKEND_URL ..."
time curl -o /dev/null -s -w \
  "HTTP %{http_code} | DNS %{time_namelookup}s | Connect %{time_connect}s | TTFB %{time_starttransfer}s | Total %{time_total}s\n" \
  "$BACKEND_URL"

echo ""
echo "=== 5. Check App Service timeout settings ==="
APP_SERVICE_NAME="${APP_SERVICE_NAME:-my-backend-app}"
az webapp config appsettings list \
  --name "$APP_SERVICE_NAME" \
  --resource-group "$RG" \
  --output table 2>/dev/null | grep -E 'TIMEOUT|IDLE|FCNL' || echo '[INFO] No explicit timeout app settings found — App Service default 230s applies'

echo ""
echo "=== 6. Show current APIM backend entity config ==="
az apim api show \
  --name "$APIM_NAME" \
  --resource-group "$RG" \
  --api-id "$API_ID" \
  --query '{serviceUrl: serviceUrl, protocols: protocols, path: path}' \
  --output table

echo ""
echo "=== Diagnostics complete. Review forward-request timeout values and backendTime_d metrics above. ==="

Error Medic Editorial

The Error Medic Editorial team consists of senior DevOps and SRE engineers with hands-on experience operating Azure API Management, Azure Functions, and App Service workloads at scale. Our guides are validated against live Azure environments and updated with each major platform release.
