How to Fix AWS API Rate Limit (ThrottlingException: Rate exceeded) and Timeout Errors
Resolve AWS API rate limit (ThrottlingException) and timeout errors by implementing exponential backoff with jitter, requesting quota increases, and optimizing API call patterns to reduce request volume.
- Root cause: Exceeding the maximum allowed API request rate for an AWS service, resulting in a ThrottlingException or HTTP 429 Too Many Requests.
- Root cause: Network congestion, slow endpoints, or aggressive client-side SDK configurations causing TimeoutError or HTTP 500/503/504.
- Quick fix: Implement exponential backoff with jitter in your retry logic, tune your AWS SDK timeouts, and request an AWS Service Quota increase if hitting a hard cap.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Implement Exponential Backoff | Immediate fix for bursty traffic causing intermittent throttling. | Medium | Low |
| Request Service Quota Increase | When consistently hitting baseline limits despite optimized code. | High (AWS Support SLA) | Low |
| Optimize API Calls (Batching/Caching) | To permanently reduce the overall volume of API requests. | Medium to High | Medium |
| Tune SDK Timeout Settings | When facing client-side or transient network timeouts on long-running tasks. | Low | Medium |
Understanding the Error
When building scalable cloud applications, interacting with the AWS API is a fundamental requirement. Whether you are provisioning resources, querying databases, or invoking serverless functions, your application relies on the AWS Control Plane and Data Plane APIs. However, as your application's throughput increases, you will inevitably encounter API rate limits (throttling) or API timeouts.
These errors manifest in various forms depending on the AWS SDK or CLI tool you are using. The most common error messages include:
- ThrottlingException: An error occurred (ThrottlingException) when calling the [Operation] operation: Rate exceeded
- TooManyRequestsException: HTTP 429 Too Many Requests.
- ProvisionedThroughputExceededException: Specific to services like Amazon DynamoDB.
- TimeoutError: Connection timed out after 120000ms, or HTTP 504 Gateway Timeout.
Why Does AWS Throttle API Requests?
AWS implements rate limiting to protect the underlying infrastructure from being overwhelmed by too many requests (either intentionally via DDoS attacks or unintentionally via runaway code). This ensures fair usage and high availability for all tenants in the shared cloud environment.
There are two primary types of API limits in AWS:
- Hard Limits (Service Quotas): These are absolute maximums on the number of resources you can create or the sustained rate of API calls you can make. Some of these can be increased by contacting AWS Support.
- Token Bucket (Burst) Limits: AWS uses a token bucket algorithm for many APIs. You accumulate tokens at a steady rate. Each API call consumes a token. If you burst and empty the bucket, subsequent calls are throttled until new tokens accumulate.
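The token-bucket behavior above can be illustrated with a small simulation. This is plain Python, not an AWS API, and the capacity and refill rate are made-up numbers for illustration:

```python
class TokenBucket:
    """Minimal token-bucket model: at most `capacity` tokens, refilled at `rate` tokens per tick."""

    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity  # the bucket starts full

    def tick(self):
        # Tokens accumulate at a steady rate, up to the bucket's capacity.
        self.tokens = min(self.capacity, self.tokens + self.rate)

    def try_call(self):
        # Each API call consumes one token; with no tokens left, the call is throttled.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, rate=1)

# A burst of 8 calls: the first 5 drain the bucket, the last 3 are throttled.
burst = [bucket.try_call() for _ in range(8)]
print(burst)  # [True, True, True, True, True, False, False, False]

# After one tick, one token has accumulated, so one more call succeeds.
bucket.tick()
print(bucket.try_call())  # True
```

This is why a client that bursts aggressively sees intermittent throttling even though its average request rate is within limits.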
Step 1: Diagnose the Bottleneck
Before applying a fix, you must determine which API is throttling you and why. Blindly increasing retries can exacerbate the problem.
Analyzing CloudTrail Logs
AWS CloudTrail records API calls made within your account. You can query CloudTrail to identify throttling events. This is especially useful for Control Plane APIs (like ec2:DescribeInstances).
You can use Amazon Athena to query CloudTrail logs efficiently to find the worst offenders.
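As a sketch, a query along these lines surfaces the most-throttled operations over the last day. It assumes you have already created a CloudTrail table in Athena; the table name `cloudtrail_logs`, database name, and S3 output location are placeholders to replace with your own:

```python
# Athena SQL that counts throttling errors per API operation.
# "cloudtrail_logs" is a placeholder table name -- substitute the
# CloudTrail table you created in Athena.
THROTTLE_QUERY = """
SELECT eventsource, eventname, count(*) AS throttle_count
FROM cloudtrail_logs
WHERE errorcode IN ('ThrottlingException', 'TooManyRequestsException')
  AND eventtime > to_iso8601(current_timestamp - interval '1' day)
GROUP BY eventsource, eventname
ORDER BY throttle_count DESC
LIMIT 10
"""

def run_query(database, output_s3):
    """Submit the query to Athena; results land in the given S3 location."""
    import boto3  # imported here so the query string is readable without AWS credentials
    athena = boto3.client("athena")
    return athena.start_query_execution(
        QueryString=THROTTLE_QUERY,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3},
    )
```

The operations at the top of this result are the ones to batch, cache, or back off on first.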
Monitoring AWS SDK Metrics
If you are encountering timeouts (TimeoutError), the issue might be client-side. The default HTTP timeout in many AWS SDKs is aggressive. If the AWS service takes longer to respond than the SDK's configured timeout, the SDK drops the connection and throws an error, even if the AWS service eventually completes the request.
Check CloudWatch metrics for the specific service (e.g., DynamoDB ThrottledRequests, API Gateway 4XXError and 5XXError, Lambda Throttles).
Step 2: Implement the Fix
Fixing AWS API rate limits and timeouts requires a multi-layered approach.
1. Implement Exponential Backoff with Jitter
The most critical defense against throttling is implementing robust retry logic. Standard retries (e.g., waiting exactly 1 second between each attempt) can cause the "thundering herd" problem, where multiple failing clients retry simultaneously, further overwhelming the API.
Exponential backoff increases the wait time between retries exponentially (e.g., 1s, 2s, 4s, 8s). Adding "jitter" introduces randomness to the wait time, spreading out the retries. Most modern AWS SDKs implement this automatically, but you may need to tune the maximum number of retries depending on your workload's tolerance for latency.
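A minimal hand-rolled sketch of the pattern looks like this (in practice the boto3 `standard` and `adaptive` retry modes already implement it, so you would only write this yourself for a client without built-in retries; the `base` and `cap` values are illustrative):

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=20.0):
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def call_with_retries(operation, max_attempts=5, base=1.0):
    """Retry `operation` on throttling-style errors, sleeping a jittered,
    exponentially growing delay between attempts."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            # Only retry throttling errors; re-raise anything else, and give up
            # once the retry budget is exhausted.
            if attempt == max_attempts - 1 or "Throttling" not in str(exc):
                raise
            time.sleep(backoff_delay(attempt, base=base))

attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("ThrottlingException: Rate exceeded")
    return "ok"

print(call_with_retries(flaky, base=0.01))  # prints "ok" after two throttled attempts
```

The jitter is what spreads simultaneous retries apart; without it, every throttled client would retry at the same instant and recreate the spike.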
2. Tune Client-Side Timeouts
If you are seeing aws api timeout errors (e.g., TimeoutError or SocketTimeoutException), you may need to increase the HTTP socket timeout in your AWS SDK client configuration. This is particularly relevant for long-running operations like large S3 uploads, Athena queries, or invoking slow Lambda functions.
3. Optimize and Batch API Calls
The best way to avoid API limits is to make fewer API calls.
- Batching: Instead of sending 100 individual PutItem requests to DynamoDB, use BatchWriteItem to send them in a single network request.
- Caching: If you are repeatedly polling an API that returns relatively static data (like sts:GetCallerIdentity or ssm:GetParameter), cache the response in memory for a few minutes.
- Pagination Awareness: When listing resources (e.g., s3:ListObjectsV2), ensure you are properly handling pagination tokens rather than repeatedly requesting the first page.
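As a sketch of the batching point: BatchWriteItem accepts at most 25 put/delete requests per call, so 100 individual writes collapse into 4 requests. The table name is a placeholder, and items are assumed to already be in DynamoDB's marshalled format:

```python
def chunk(items, size=25):
    """Split items into lists of at most `size` elements; DynamoDB's
    BatchWriteItem accepts at most 25 put/delete requests per call."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def batch_put(table_name, items):
    """Write pre-marshalled DynamoDB items in batches of 25 instead of one PutItem call each."""
    import boto3  # imported here so chunk() stays usable without AWS credentials
    dynamodb = boto3.client("dynamodb")
    for batch in chunk(items):
        response = dynamodb.batch_write_item(
            RequestItems={table_name: [{"PutRequest": {"Item": item}} for item in batch]}
        )
        # Production code should also retry response.get("UnprocessedItems"),
        # which DynamoDB returns when it throttles part of a batch.

print(len(chunk(list(range(100)))))  # 4 requests instead of 100
```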
4. Decouple Architecture with Amazon SQS
If your architecture is synchronous (e.g., API Gateway -> Lambda -> DynamoDB) and a downstream service throttles, the error bubbles all the way back to the user.
By introducing Amazon Simple Queue Service (SQS) (e.g., API Gateway -> SQS -> Lambda -> DynamoDB), you can decouple the components. The SQS queue acts as a shock absorber. If DynamoDB throttles the Lambda function, the message remains in the queue and Lambda will retry it automatically based on the visibility timeout, smoothing out traffic spikes without losing data or returning immediate 500 errors to the client.
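On the consuming side, the retry behavior can be sketched with a Lambda handler that uses SQS partial batch responses. This assumes ReportBatchItemFailures is enabled on the event source mapping, and `process` is a hypothetical stand-in for your business logic:

```python
import json

def process(body):
    """Hypothetical business logic -- e.g. a DynamoDB write that may throttle."""
    item = json.loads(body)
    if item.get("poison"):
        raise RuntimeError("ThrottlingException: Rate exceeded")

def handler(event, context):
    """Lambda handler for an SQS event source with ReportBatchItemFailures enabled.
    Failed records are reported back to SQS, return to the queue after the
    visibility timeout, and are retried; successful records are deleted."""
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

event = {"Records": [
    {"messageId": "m1", "body": json.dumps({"poison": False})},
    {"messageId": "m2", "body": json.dumps({"poison": True})},
]}
print(handler(event, None))  # {'batchItemFailures': [{'itemIdentifier': 'm2'}]}
```

Only the throttled message returns to the queue; the rest of the batch is not reprocessed.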
5. Request a Service Quota Increase
If you have optimized your code, implemented backoff, and are still consistently hitting the ceiling, you are likely hitting a hard Service Quota limit.
- Navigate to the Service Quotas console in AWS.
- Search for the specific service and API limit.
- Select the quota and click Request quota increase.
- Provide a strong business justification and architectural details to AWS Support to ensure prompt approval.
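Quota increases can also be requested programmatically through the Service Quotas API. A minimal sketch, assuming you have looked up the quota code for your limit (the code below is illustrative only):

```python
def quota_request(service_code, quota_code, desired_value):
    """Build parameters for request_service_quota_increase; find quota codes
    in the Service Quotas console or via list_service_quotas."""
    return {
        "ServiceCode": service_code,
        "QuotaCode": quota_code,
        "DesiredValue": float(desired_value),
    }

def submit(params):
    import boto3  # imported here so quota_request() stays usable without AWS credentials
    client = boto3.client("service-quotas")
    return client.request_service_quota_increase(**params)

# Example only: look up the real quota code for the limit you are hitting.
print(quota_request("ec2", "L-1216C47A", 64))
```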
```shell
# Diagnostic command to find the top throttled AWS API calls using jq
# (requires CloudTrail logs exported in JSON format)
cat cloudtrail_logs.json | jq -r '.Records[] | select(.errorCode != null) | select(.errorCode | contains("ThrottlingException") or contains("TooManyRequests")) | .eventName' | sort | uniq -c | sort -nr
```
```python
# Example Python Boto3 configuration with custom retries and timeouts
import boto3
from botocore.config import Config

custom_config = Config(
    region_name='us-east-1',
    signature_version='v4',
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'  # 'adaptive' mode automatically handles backoff and throttling
    },
    connect_timeout=10,  # seconds to wait while establishing the connection
    read_timeout=120     # seconds to wait for a response on an open connection
)

client = boto3.client('s3', config=custom_config)
```

Error Medic Editorial
Error Medic Editorial is a team of certified cloud architects and SREs dedicated to providing actionable, code-first solutions for complex infrastructure and deployment challenges.