Error Medic

Elasticsearch API Timeout: How to Diagnose and Fix Connection, Request, and Bulk Timeout Errors

Fix Elasticsearch API timeouts by tuning socket/request timeout settings, adjusting thread pools, and scaling cluster resources. Step-by-step guide with commands.

Key Takeaways
  • Elasticsearch API timeouts fall into three categories: connection timeouts (client cannot reach the node), request timeouts (node accepted the request but did not respond in time), and bulk/index timeouts (write operations exceed the configured deadline).
  • The most common root causes are undersized thread pools, GC pressure causing JVM pauses, hot shards from uneven data distribution, and client-side timeout values that are too low for the operation being performed.
  • Quick fixes: increase client request_timeout for long-running queries, raise search.default_search_timeout at the cluster level, add replicas to distribute read load, and monitor _cat/thread_pool and _nodes/hot_threads to identify the bottleneck before tuning.
Fix Approaches Compared
Method | When to Use | Time to Apply | Risk
Increase client request_timeout | Queries consistently finish but client gives up too early | < 5 min | Low — client-side only
Raise search.default_search_timeout | All searches time out cluster-wide, server is slow | 5 min | Low — reversible setting
Tune thread_pool.search.size | Search thread pool queue filling up (_cat/thread_pool shows rejections) | 10 min | Medium — can starve other pools
Add replica shards | Hot primary shard handling all read traffic | 15–30 min | Low — online operation
Increase JVM heap (restart required) | GC pauses > 5 s visible in logs, heap usage > 85% | 30 min | High — requires rolling restart
Force-merge / reduce shard count | Too many small shards causing overhead on every request | Hours | Medium — CPU-intensive, do off-peak
Circuit breaker tuning | indices.breaker.total.limit too low, breaker trips before timeout | < 5 min | Medium — can allow OOM if set too high

Understanding Elasticsearch API Timeouts

When your application or curl command hits Elasticsearch and receives no response within the deadline, you will see one of several error messages depending on the client library and where in the stack the failure occurs:

# Python elasticsearch-py
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
RequestError: RequestError(408, 'request_timeout', 'Request timed out')

# Java High-Level REST Client
org.elasticsearch.client.ResponseException: method [POST], host [...], status line [HTTP/1.1 408 Request Timeout]
java.net.SocketTimeoutException: Read timed out

# curl
curl: (28) Operation timed out after 30000 milliseconds

# Kibana / Elasticsearch logs
[o.e.s.SearchService] [node-1] Search request timed out: [...] took [30001ms], timeout [30000ms]

These errors map to distinct failure modes that require different remediation paths.


Step 1: Identify the Timeout Category

Connection timeout — The TCP handshake never completed. The cluster is unreachable, a load balancer dropped the connection, or the node is down.

Socket / read timeout — The connection was established but the server did not send back a complete response before the deadline. This is the most common production scenario.

Bulk reject / queue full — The thread pool queue is full; the node returned a 429 Too Many Requests or an implicit timeout because threads were unavailable.
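If you are handling these failures in application code, the triage above can be sketched as a small classifier. A minimal illustration (the function name and category labels are our own, not an Elasticsearch API):

```python
def classify_timeout(status=None, message=""):
    """Map an HTTP status and/or exception message to a timeout category."""
    msg = message.lower()
    if status == 429 or "rejected" in msg or "queue is full" in msg:
        return "bulk_reject"         # thread pool queue full
    if "connection refused" in msg or ("connect" in msg and "timed out" in msg):
        return "connection_timeout"  # TCP handshake never completed
    if status == 408 or "timed out" in msg:
        return "read_timeout"        # request accepted but no response in time
    return "unknown"
```

Routing errors through a classifier like this keeps the remediation paths below from being applied to the wrong failure mode.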

Run these diagnostic commands first to triage:

# Check cluster health
curl -s 'http://localhost:9200/_cluster/health?pretty'

# Inspect thread pool rejections (look for 'rejected' > 0)
curl -s 'http://localhost:9200/_cat/thread_pool?v&h=name,active,queue,rejected,completed'

# Show hot threads on all nodes (identifies CPU-bound operations)
curl -s 'http://localhost:9200/_nodes/hot_threads'

# Check for GC pressure in the main log (JvmGcMonitorService "overhead" lines)
grep -i 'gc.*overhead' /var/log/elasticsearch/elasticsearch.log | tail -20

# Show cumulative query/fetch time per index to spot hot indices
curl -s 'http://localhost:9200/_cat/indices?v&h=index,search.fetch_time,search.query_time&time=ms'

Step 2: Fix Connection Timeouts

If the cluster health endpoint itself times out, the node is unreachable. Verify network connectivity and check whether the node process is alive:

# Verify the process is running
systemctl status elasticsearch

# Check bound address and port
curl -s 'http://localhost:9200'

# Test from application host (replace ES_HOST)
telnet ES_HOST 9200
nc -zv ES_HOST 9200

# Check firewall / security group rules
iptables -L -n | grep 9200

If nodes are reachable but the client reports connection timeouts, raise the connect and read timeouts in your client configuration:

# Python elasticsearch-py v8
from elasticsearch import Elasticsearch
es = Elasticsearch(
    "http://localhost:9200",
    request_timeout=60,       # socket read timeout
    connections_per_node=10,
)

// Java REST Client
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200))
    .setRequestConfigCallback(requestConfigBuilder ->
        requestConfigBuilder
            .setConnectTimeout(5000)   // 5 s connect
            .setSocketTimeout(60000)); // 60 s read
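Connection-level timeouts are often transient (a load balancer recycling a node, a brief network blip), so client code typically wraps calls in bounded retries with exponential backoff. A hedged sketch, with `fn` standing in for any client call:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, retry_on=(TimeoutError,)):
    """Call fn(), retrying on the given exceptions with exponential backoff;
    re-raise after the final attempt so the caller still sees the failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1 s, 2 s, 4 s, ...
```

Usage would look like `with_retries(lambda: es.search(index="my-index", query=q))`, adjusting `retry_on` to the exception types your client library actually raises.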

Step 3: Fix Search / Query Timeouts

Per-request timeout — Pass timeout in the request body. This is the safest option because it only affects the current query; when the deadline is reached, Elasticsearch returns whatever partial results it has collected, with "timed_out": true in the response:

POST /my-index/_search
{
  "timeout": "30s",
  "query": { "match_all": {} }
}
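Because a timed-out search still returns HTTP 200 with partial results, client code should check the `timed_out` flag explicitly. A minimal guard (helper name is illustrative; the response fields follow the standard search API shape):

```python
def assert_complete(resp):
    """Raise if a search hit its timeout and returned partial results."""
    if resp.get("timed_out"):
        shards = resp.get("_shards", {})
        raise RuntimeError(
            f"search timed out: {shards.get('successful', '?')}/"
            f"{shards.get('total', '?')} shards responded"
        )
    return resp
```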

Cluster-wide default — Set a global deadline for all searches. Queries that exceed it return partial results rather than an error:

curl -X PUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "transient": {
      "search.default_search_timeout": "30s"
    }
  }'

Slow query analysis — Enable the slow log to find which queries are responsible:

curl -X PUT 'http://localhost:9200/my-index/_settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "index.search.slowlog.threshold.query.warn": "5s",
    "index.search.slowlog.threshold.fetch.warn": "1s",
    "index.search.slowlog.level": "warn"
  }'

Then tail /var/log/elasticsearch/*_index_search_slowlog.log.
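Once the slow log is populated, the `took_millis[...]` field can be extracted to rank offenders. A rough parser sketch; the exact log layout varies across Elasticsearch versions, so treat the regex as a starting point to adapt:

```python
import re

# Matches the "[index][shard]" pair and the took_millis field in a
# search slow-log line; adjust for your ES version's log format.
SLOWLOG_RE = re.compile(r"\[([^\]]+)\]\[(\d+)\].*?took_millis\[(\d+)\]")

def slowest_queries(lines, top=5):
    """Return the top-N slow-log entries as (took_ms, index, shard) tuples."""
    hits = []
    for line in lines:
        m = SLOWLOG_RE.search(line)
        if m:
            hits.append((int(m.group(3)), m.group(1), int(m.group(2))))
    return sorted(hits, reverse=True)[:top]
```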


Step 4: Fix Thread Pool Exhaustion

If _cat/thread_pool shows rejected counts rising, the search or write thread pool is saturated. The safe fix is to reduce query complexity or add nodes. As a short-term measure you can increase the queue size (not the thread count, which is CPU-bound):

# elasticsearch.yml — increase search queue depth
thread_pool.search.queue_size: 2000
thread_pool.write.queue_size: 1000

Thread pool sizes and queue depths are static node settings, so a restart is required for the change to take effect; in production, apply it node by node as a rolling restart.
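The `_cat/thread_pool` output is plain columnar text, so detecting rejections can be automated for alerting. A small parser sketch over the header-plus-rows format produced by the `?v&h=...` query shown earlier:

```python
def pools_with_rejections(cat_output):
    """Parse `_cat/thread_pool?v&h=name,active,queue,rejected` text output
    and return (pool_name, rejected_count) for pools with rejections."""
    lines = cat_output.strip().splitlines()
    idx = {col: i for i, col in enumerate(lines[0].split())}  # header row
    bad = []
    for line in lines[1:]:
        cols = line.split()
        rejected = int(cols[idx["rejected"]])
        if rejected > 0:
            bad.append((cols[idx["name"]], rejected))
    return bad
```

Feeding this the curl output from Step 1 gives a machine-checkable signal for the "rejected > 0" alert suggested in Step 7.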


Step 5: Fix JVM / GC-Induced Timeouts

Long GC pauses cause the JVM to stop all threads, making the node appear unresponsive. Signs: [gc][12345] overhead, spent [8.5s] collecting in the last [10s] in logs.

  1. Heap must be ≤ 50% of RAM and never exceed 32 GB (compressed OOP limit).
  2. Use G1GC, the default with the bundled JDK since ES 7; set JVM flags in jvm.options (or the ES_JAVA_OPTS environment variable).
  3. Reduce field data cache if fielddata circuit breaker trips frequently:
curl -X PUT 'http://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "persistent": {
      "indices.breaker.fielddata.limit": "40%",
      "indices.breaker.request.limit": "40%",
      "indices.breaker.total.limit": "70%"
    }
  }'
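The heap rule of thumb above (half of RAM, below the compressed-oops threshold) is easy to encode when provisioning nodes. A trivial helper, using 31 GB as a conservative cap:

```python
def recommended_heap_gb(ram_gb):
    """Half of system RAM, capped at 31 GB to stay under the ~32 GB
    compressed-oops limit mentioned above."""
    return min(ram_gb // 2, 31)
```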

Step 6: Fix Bulk Indexing Timeouts

Bulk write timeouts (a bulk request exceeding its configured deadline) are usually caused by refresh pressure or oversized batches.

# Temporarily disable refresh during heavy indexing
curl -X PUT 'http://localhost:9200/my-index/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index.refresh_interval": "-1"}'

# Restore after indexing
curl -X PUT 'http://localhost:9200/my-index/_settings' \
  -H 'Content-Type: application/json' \
  -d '{"index.refresh_interval": "1s"}'

Reduce bulk batch size to 5–15 MB per request and target 1,000–5,000 documents per batch as a starting point, then tune based on _nodes/stats/indices/indexing metrics.
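The batching guidance above can be enforced in the indexing client by chunking on both document count and serialized size. A sketch assuming documents arrive as pre-serialized JSON strings (adapt the bounds to your measured throughput):

```python
def chunk_bulk(docs, max_docs=5000, max_bytes=10 * 1024 * 1024):
    """Yield batches bounded by document count and total byte size,
    mirroring the 5-15 MB / 1,000-5,000 docs starting point above."""
    batch, size = [], 0
    for doc in docs:
        doc_bytes = len(doc.encode("utf-8"))
        # Start a new batch if adding this doc would breach either bound.
        if batch and (len(batch) >= max_docs or size + doc_bytes > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_bytes
    if batch:
        yield batch
```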


Step 7: Long-Term Prevention

  • Enable slow logs on all indices at warn threshold.
  • Alert on _cat/thread_pool rejections — any rejected > 0 per minute is a leading indicator.
  • Monitor GC pause time via _nodes/stats/jvm.
  • Use index lifecycle management (ILM) to force-merge and roll over hot indices, keeping shard counts healthy.
  • Distribute load with aliases pointing to multiple indices instead of querying one large index.

Diagnostic Script

The script below bundles the checks from this guide into a single triage report:
#!/usr/bin/env bash
# elasticsearch-timeout-diag.sh
# Run this script against a node to produce a triage report.

ES_HOST="${ES_HOST:-localhost}"
ES_PORT="${ES_PORT:-9200}"
BASE="http://${ES_HOST}:${ES_PORT}"

echo "=== Cluster Health ==="
curl -sf "${BASE}/_cluster/health?pretty" || echo "UNREACHABLE"

echo ""
echo "=== Thread Pool (look for rejected > 0) ==="
curl -sf "${BASE}/_cat/thread_pool?v&h=name,active,queue,rejected,completed,queue_size"

echo ""
echo "=== Node JVM GC Stats ==="
curl -sf "${BASE}/_nodes/stats/jvm?pretty" | \
  python3 -c "
import sys, json
d = json.load(sys.stdin)
for node, info in d['nodes'].items():
    gc = info['jvm']['gc']['collectors']
    name = info['name']
    for cname, cdata in gc.items():
        print(f'{name} | {cname} | count={cdata[\"collection_count\"]} | time_ms={cdata[\"collection_time_in_millis\"]}ms')
"

echo ""
echo "=== Hot Threads ==="
curl -sf "${BASE}/_nodes/hot_threads?threads=3"

echo ""
echo "=== Pending Tasks ==="
curl -sf "${BASE}/_cluster/pending_tasks?pretty"

echo ""
echo "=== Slow Indices (search query time > 60s cumulative) ==="
curl -sf "${BASE}/_cat/indices?v&h=index,search.query_time,search.query_total,search.fetch_time" | \
  awk 'NR==1 || $2 > 60000'

echo ""
echo "=== Circuit Breaker Status ==="
curl -sf "${BASE}/_nodes/stats/breaker?pretty" | \
  python3 -c "
import sys, json
d = json.load(sys.stdin)
for node, info in d['nodes'].items():
    for bname, bdata in info['breakers'].items():
        print(f\"{info['name']} | {bname} | used={bdata['estimated_size']} / {bdata['limit_size']} | tripped={bdata['tripped']}\")
"

echo ""
echo "=== Suggested Fixes ==="
REJECTED=$(curl -sf "${BASE}/_cat/thread_pool?h=rejected" | awk '{s+=$1} END{print s}')
if [ "${REJECTED}" -gt 0 ] 2>/dev/null; then
  echo "[!] Thread pool rejections detected (${REJECTED} total). Consider increasing queue_size or adding nodes."
fi
TRIPPED=$(curl -sf "${BASE}/_nodes/stats/breaker?pretty" | grep -c '"tripped" : [^0]')
if [ "${TRIPPED}" -gt 0 ] 2>/dev/null; then
  echo "[!] Circuit breaker has tripped. Review heap usage and breaker limits."
fi
echo "Done."

Error Medic Editorial

The Error Medic Editorial team is composed of senior DevOps engineers and SREs with experience operating large-scale Elasticsearch clusters in production. Our guides are derived from real incident postmortems and focus on actionable, command-level troubleshooting over theoretical explanations.
