Error Medic

How to Fix AWS API Rate Limit (ThrottlingException) and Timeout Errors

Resolve AWS API rate limit (ThrottlingException) and timeout errors by implementing exponential backoff, jitter, and requesting AWS Service Quota increases.

Key Takeaways
  • AWS APIs enforce account-level and region-level rate limits to protect service stability, often resulting in ThrottlingException or HTTP 429 errors.
  • Bursty traffic, inefficient polling, and lack of client-side retry logic are the primary root causes of API throttling.
  • Implementing exponential backoff with jitter in your AWS SDK clients is the most effective immediate fix.
  • Persistent limits require requesting a Service Quota increase via the AWS Management Console.
Fix Approaches Compared
Method                         | When to Use                                                   | Time          | Risk
Exponential Backoff & Jitter   | Immediate fix for bursty traffic causing ThrottlingExceptions | 10-30 mins    | Low
AWS Service Quota Increase     | Sustained high traffic exceeding default account limits       | 1-2 days      | Low
Caching (e.g., ElastiCache)    | Read-heavy workloads polling AWS APIs frequently              | Hours to Days | Medium
Architecture Changes (SQS/SNS) | Decoupling components to smooth out request spikes            | Days to Weeks | High

Understanding the Error

When building scalable applications on AWS, interacting with the AWS API is inevitable. Whether you are provisioning infrastructure via CloudFormation, reading from DynamoDB, or triggering Lambda functions, you are making API calls. However, AWS implements strict rate limits (throttling) to ensure service stability and fair usage across all tenants.

When your application exceeds these limits, you will encounter rate limit or timeout errors. The most common error is a ThrottlingException, but it can manifest in various ways depending on the service:

  • EC2: RequestLimitExceeded
  • DynamoDB: ProvisionedThroughputExceededException
  • API Gateway/General: HTTP 429 Too Many Requests
  • General AWS SDK: ThrottlingException: Rate exceeded

Additionally, if a service is slow to respond under load, or your client's timeout is shorter than the time a call and its retries need, you might see an AWS API timeout (e.g., the client-side ReadTimeoutError or ConnectTimeoutError, or an HTTP 504 Gateway Timeout).

Why Does This Happen?

  1. Bursty Workloads: A sudden spike in traffic or a cron job that spins up hundreds of parallel threads making API calls simultaneously.
  2. Default Service Quotas: Every AWS account starts with default quotas (formerly known as limits). For example, EC2 throttles API actions such as DescribeInstances with a token bucket: each request spends a token, and the bucket refills at a fixed rate.
  3. Inefficient Polling: Constantly querying an AWS API to check the status of a resource instead of using EventBridge or SNS notifications.
  4. Lack of Retry Logic: Failing to implement retries with exponential backoff when an API call is temporarily throttled.
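
To make the token bucket idea from item 2 concrete, here is a minimal simulation. It is a toy model only: `TokenBucket` is a hypothetical class, and the capacity and refill rate are made-up numbers, not real EC2 quotas.

```python
class TokenBucket:
    """Toy model of a token bucket rate limiter (illustrative only)."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # the bucket starts full
        self.last_time = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a real service would answer with a throttling error here


# Assumed numbers: burst capacity of 5 requests, refilled at 1 request per second.
bucket = TokenBucket(capacity=5, refill_per_second=1)

# A burst of 8 calls at t=0: the first 5 pass, the remaining 3 are throttled.
burst = [bucket.allow(now=0.0) for _ in range(8)]

# Two seconds later, two tokens have been refilled, so two of three calls pass.
later = [bucket.allow(now=2.0) for _ in range(3)]
```

The model shows why bursts hurt more than steady traffic: the same number of calls spread over several seconds would all have been accepted.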

Step 1: Diagnose the Issue

Before applying a fix, you need to identify which API is being throttled and by how much.

CloudTrail and CloudWatch

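The first step is to check AWS CloudTrail and CloudWatch. CloudTrail records API calls, including failed ones, so throttled calls typically appear with a throttling error code in the event's errorCode field. CloudWatch exposes throttling counts as metrics: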

  1. Go to the CloudWatch Console.
  2. Navigate to Metrics > All Metrics.
  3. Look for the Usage namespace or specific service namespaces (e.g., AWS/DynamoDB, AWS/EC2).
  4. Search for metrics like ClientThrottling, ThrottledRequests, or ReadThrottleEvents.
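
The same metrics can be read programmatically. The sketch below assumes the throttled resource is a DynamoDB table and sums its ReadThrottleEvents metric. The function name `count_read_throttles` and the table name `my-table` are placeholders, and the function expects a Boto3 CloudWatch client to be passed in.

```python
from datetime import datetime, timedelta, timezone


def count_read_throttles(cloudwatch, table_name, hours=1):
    """Sum DynamoDB ReadThrottleEvents for one table over the last `hours`."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/DynamoDB",
        MetricName="ReadThrottleEvents",
        Dimensions=[{"Name": "TableName", "Value": table_name}],
        StartTime=start,
        EndTime=end,
        Period=300,  # 5-minute buckets
        Statistics=["Sum"],
    )
    return sum(point["Sum"] for point in response["Datapoints"])


# Usage (requires AWS credentials; "my-table" is a placeholder):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
#   print(count_read_throttles(cloudwatch, "my-table"))
```

A non-zero result confirms that reads on that table were throttled in the chosen window. Other services need a different namespace, metric name, and dimensions.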

Step 2: Implement Exponential Backoff and Jitter

The most robust way to handle ThrottlingException and temporary AWS API timeouts is to use exponential backoff with jitter. AWS SDKs typically have built-in retry mechanisms, but they might need tuning for your specific workload.

Standard Exponential Backoff: Wait 1s, retry. Wait 2s, retry. Wait 4s, retry...

Jitter: Adding randomness to the backoff time to prevent the "thundering herd" problem where multiple threads retry at the exact same millisecond.
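
A minimal sketch of the two ideas combined, using the "full jitter" variant (a random delay between zero and the exponential ceiling). `ThrottledError`, `backoff_delay`, and `call_with_backoff` are hypothetical names for illustration, and the base delay, cap, and retry count are assumptions to tune for your workload.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by an API client."""


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_backoff(func, max_retries=5, sleep=time.sleep):
    """Call func(), retrying on ThrottledError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ThrottledError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(backoff_delay(attempt))
```

The `cap` keeps late retries from waiting minutes, and the `sleep` parameter exists only so the delay can be replaced in tests.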

If you use Boto3 (Python), pass a botocore Config with a retries setting to raise the maximum number of attempts and to select a retry mode that applies jitter. If you write your own HTTP client, you have to implement the backoff loop yourself.

Step 3: Request a Service Quota Increase

If you have optimized your API calls, implemented caching, and are using exponential backoff, but you still consistently hit the limit, you need to request a quota increase.

  1. Open the AWS Management Console and navigate to Service Quotas.
  2. Select the AWS service (e.g., Amazon EC2).
  3. Search for the specific API or resource limit (e.g., DescribeInstances API rate).
  4. Select the quota and click Request quota increase.
  5. Enter the new desired value and submit. AWS Support usually processes these within 24-48 hours.
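
The same request can be scripted with the Boto3 Service Quotas client. This is a sketch under assumptions: `find_quotas` and `request_increase` are hypothetical helper names, the search text and desired value are placeholders, and whether a given API rate limit is listed as an adjustable quota depends on the service.

```python
def find_quotas(client, service_code, name_fragment):
    """Return adjustable quotas whose name contains `name_fragment`."""
    matches = []
    kwargs = {"ServiceCode": service_code}
    while True:
        page = client.list_service_quotas(**kwargs)
        for quota in page["Quotas"]:
            name = quota["QuotaName"].lower()
            if quota.get("Adjustable") and name_fragment.lower() in name:
                matches.append(quota)
        token = page.get("NextToken")
        if not token:
            return matches
        kwargs["NextToken"] = token  # fetch the next page of results


def request_increase(client, service_code, quota_code, desired_value):
    """Submit a quota increase request and return its status string."""
    response = client.request_service_quota_increase(
        ServiceCode=service_code,
        QuotaCode=quota_code,
        DesiredValue=float(desired_value),
    )
    return response["RequestedQuota"]["Status"]


# Usage (requires AWS credentials; search text and value are placeholders):
#   import boto3
#   client = boto3.client("service-quotas", region_name="us-east-1")
#   for q in find_quotas(client, "ec2", "instances"):
#       print(q["QuotaCode"], q["QuotaName"], q["Value"])
#   print(request_increase(client, "ec2", "<quota code from the list>", 100))
```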

Step 4: Architectural Improvements

If quota increases are rejected or insufficient, you must rethink your architecture:

  • Use Event-Driven Patterns: Instead of polling Describe* APIs, use Amazon EventBridge to react to state changes.
  • Implement Caching: If you frequently read the same data from an AWS API (e.g., fetching Secrets Manager secrets), cache it locally in memory or use an external cache like Redis.
  • Queueing: Place incoming requests in an SQS queue and use a worker to process them at a controlled rate that respects the AWS API limits.
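
As an illustration of the caching bullet, here is a minimal in-memory cache with a time-to-live. `TTLCache` is a hypothetical helper, and the 300-second default is an assumption; pick a TTL that matches how stale the data may safely be.

```python
import time


class TTLCache:
    """Tiny in-memory cache that re-fetches a value after `ttl_seconds`."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # function that performs the real API call
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no API call is made
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


# Usage with Secrets Manager (requires AWS credentials; the secret name is a placeholder):
#   import boto3
#   sm = boto3.client("secretsmanager", region_name="us-east-1")
#   secrets = TTLCache(lambda name: sm.get_secret_value(SecretId=name)["SecretString"])
#   password = secrets.get("my-app/db-password")
```

With this in place, repeated reads of the same key within the TTL cost one API call instead of one per read. The sketch is not thread-safe and never evicts entries, which is acceptable for a small, fixed set of keys.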

Example: Custom Boto3 Retry Configuration

```python
import logging

import boto3
from botocore.config import Config
from botocore.exceptions import ClientError, ConnectTimeoutError, ReadTimeoutError

# Configure custom retry logic with exponential backoff
# Max retries set to 10 for heavily throttled environments
custom_retry_config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'standard',  # standard mode includes exponential backoff and jitter
    }
)

# Initialize the client with the custom configuration
ec2 = boto3.client('ec2', region_name='us-east-1', config=custom_retry_config)


def get_instances():
    try:
        # The SDK will now automatically retry throttling errors up to 10 times
        response = ec2.describe_instances()
        return response['Reservations']
    except ClientError as e:
        code = e.response['Error']['Code']
        if code in ('RequestLimitExceeded', 'ThrottlingException'):
            logging.error("CRITICAL: Rate limit exceeded even after max retries.")
        else:
            logging.error(f"Unexpected error: {e}")
        raise
    except (ConnectTimeoutError, ReadTimeoutError):
        # Timeouts surface as botocore exceptions, not as ClientError codes
        logging.error("AWS API timeout. Service might be degraded.")
        raise


if __name__ == "__main__":
    get_instances()
```

Error Medic Editorial

The Error Medic Editorial team consists of senior DevOps engineers, SREs, and cloud architects dedicated to providing actionable, code-first solutions to complex infrastructure problems.
