Error Medic

How to Fix Twilio Rate Limit Exceeded (Error 20429 & Twilio 503)

Fix Twilio rate limit errors (Error 20429) and 503 Service Unavailable by implementing exponential backoff, message queuing, and concurrent request tuning.

Key Takeaways
  • Root Cause 1: Exceeding API concurrency limits (default 100 concurrent connections) triggers Error 20429 (Too Many Requests).
  • Root Cause 2: Exceeding Carrier Message Per Second (MPS) limits overloads Twilio's internal queue, dropping requests.
  • Root Cause 3: Returning slow webhook responses (>15s) to Twilio results in a Twilio 503 Service Unavailable error.
  • Quick Fix: Implement exponential backoff for HTTP 429/503 responses, and use a queue (Celery/Redis) to throttle outbound bulk messaging.
Fix Approaches Compared
| Method | When to Use | Time to Implement | Risk Profile |
|---|---|---|---|
| Exponential Backoff | Handling intermittent 429s/503s on standard API requests | Low (1 hour) | Low |
| Message Queuing (Redis/Celery) | Sustained high-volume outbound SMS/Voice loops | High (Days) | Low |
| Twilio Messaging Services | Scaling SMS throughput horizontally across multiple numbers | Medium (Hours) | Medium |
| Upgrading Sender Type | Hitting hard 1 MPS limits on long codes (Need Short Code) | High (Weeks for Verification) | Low |
| Asynchronous Webhooks | Fixing Twilio 503 errors on inbound status callbacks | Medium (Hours) | Low |

Understanding the Twilio Rate Limit Error

When scaling applications that rely on Twilio for SMS, Voice, or WhatsApp messaging, encountering rate limits is a rite of passage. If your application logs are suddenly flooded with Twilio 503 Service Unavailable or Error 20429: Too Many Requests, you have hit Twilio's infrastructure protection mechanisms.

Twilio enforces several types of limits to protect its APIs and ensure deliverability across strict carrier networks. Understanding exactly which limit you hit is the crucial first step to resolving it permanently.

The Two Main Types of Twilio Rate Limits

  1. API Concurrency Limits (Error 20429): Twilio limits the number of concurrent REST API requests. By default, most accounts are limited to 100 concurrent connections. If your application spawns 150 parallel threads trying to send SMS simultaneously, the 101st request will likely be rejected with an HTTP 429 (Too Many Requests) and Twilio Error Code 20429.
  2. Message Per Second (MPS) Limits: Carriers strictly regulate how fast messages can be sent from specific phone number types. For US/Canada destinations:
    • Local Long Codes (10DLC): Typically limited to 1 Message Per Second (MPS).
    • Toll-Free Numbers: Default is 3 MPS.
    • Short Codes: Default is 100+ MPS.
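These per-sender MPS caps translate directly into how long a campaign takes to drain. A minimal sketch (the helper name and dictionary are illustrative, with defaults mirroring the list above):

```python
# Illustrative helper: estimate how long a campaign takes at a given
# sender type's MPS cap. Default MPS values mirror the list above.
DEFAULT_MPS = {
    "long_code": 1,    # 10DLC local number
    "toll_free": 3,
    "short_code": 100,
}

def estimated_send_seconds(message_count: int, sender_type: str) -> float:
    """Seconds needed to drain `message_count` messages at the sender's MPS cap."""
    return message_count / DEFAULT_MPS[sender_type]
```

A 10,000-message campaign from a single long code takes roughly 2.8 hours; the same campaign from a short code finishes in under two minutes.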

If you exceed the MPS limit, Twilio still accepts the API request initially (returning HTTP 201 Created with a message status of queued) and places the message in an internal queue. However, that queue holds a maximum of 4 hours' worth of traffic at your sender's MPS rate. Once the queue fills, further requests are hard rate limited with a 429 error.
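The 4-hour queue window makes the overflow threshold easy to compute per sender. A sketch (helper names are illustrative):

```python
# Sketch of the internal queue cap described above: Twilio queues at most
# 4 hours of traffic at your sender's MPS rate before rejecting with 20429.
QUEUE_WINDOW_SECONDS = 4 * 60 * 60  # 4 hours

def max_queued_messages(mps: int) -> int:
    """Maximum messages Twilio will queue for a sender before rejecting."""
    return mps * QUEUE_WINDOW_SECONDS

def will_overflow(pending_messages: int, mps: int) -> bool:
    """True if a burst of `pending_messages` would exceed the 4-hour queue."""
    return pending_messages > max_queued_messages(mps)
```

A single 1 MPS long code can therefore queue at most 14,400 messages; bursting 20,000 into it will start returning 429s.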

Why am I getting a Twilio 503 Error?

A 503 Service Unavailable error usually points to a different bottleneck:

  1. Outbound to Twilio (Rare): Twilio's API itself might be experiencing temporary degradation or you have severely overloaded a specific regional edge location.
  2. Inbound Webhooks (Most Common): If Twilio is trying to send a webhook to your server (e.g., an incoming SMS notification or a status callback) and your server is overwhelmed, your server might return a 503. Alternatively, if your server takes more than 15 seconds to process the request, Twilio will time out and log a 503 Error in your console.

Step 1: Diagnose the Exact Bottleneck

Before refactoring your code, confirm exactly which limit is triggering the failure.

Action 1: Check the Twilio Debugger. Navigate to Twilio Console > Monitor > Logs > Error Logs. Look for Error 20429. The error details payload will specify whether it was a concurrency API limit or a message queue overflow.

Action 2: Analyze Your Application Logs. Determine the direction of the failure:

  • Failing on API requests (Sending): You are sending too fast. You need to implement backoff and outbound throttling.
  • Failing on Webhooks (Receiving): You are processing too slow. You need to decouple webhook reception from background processing.

Step 2: Fix - Implement Exponential Backoff

If you are hitting the API Concurrency limit (Error 20429) or temporary 503s, the immediate code-level fix is to handle the HTTP 429/503 responses gracefully.

Instead of failing the background job or dropping the message entirely, catch the TwilioRestException and retry after an exponentially increasing delay (e.g., wait 1s, then 2s, 4s, 8s). Twilio occasionally includes a Retry-After header in 429 responses, but an exponential backoff with random jitter is the industry standard approach to avoid the thundering herd problem.
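The delay schedule described above (1s, 2s, 4s, 8s, plus jitter) can be sketched in a few lines; this is a generic illustration, not Twilio-specific code:

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Yield exponentially growing delays (1s, 2s, 4s, ...) with full jitter,
    capped so a long retry chain never sleeps more than `cap` seconds."""
    for attempt in range(max_retries):
        exp_ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: sleep a random amount up to the exponential ceiling,
        # which spreads retries from many clients apart (no thundering herd).
        yield random.uniform(0, exp_ceiling)
```

Each retry sleeps a random duration bounded by the doubling ceiling, so simultaneous failures do not retry in lockstep.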

Note: See the code block below for a robust Python implementation using the tenacity library.


Step 3: Fix - Architect for High Throughput (Queuing)

If you are triggering rate limits because your app iterates through a database and fires off thousands of SMS messages synchronously in a for loop, you must redesign the architecture.

Implement a Message Queue (Celery / Redis / AWS SQS)

Never send bulk messages synchronously in your main application thread.

  1. Push messaging tasks into a queue system (like RabbitMQ or Redis).
  2. Configure your workers to process the queue at a controlled rate that matches your Twilio MPS limits.

For example, if you have a single standard US long code (1 MPS limit), configure your Celery worker to process a maximum of 1 task per second using task-level rate limiting (e.g., rate_limit='1/s').
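The worker-side throttle can also be expressed as a token bucket, which is the mechanism behind limits like rate_limit='1/s'. A minimal sketch (class and parameter names are illustrative; the injectable clock exists only to make it testable):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter in the spirit of Celery's rate_limit='1/s':
    a worker calls allow() before each send and waits when denied."""

    def __init__(self, rate_per_sec: float, burst: int = 1, clock=None):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock or time.monotonic  # injectable for testing
        self.last = self.clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill tokens based on elapsed time, up to the burst capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With rate_per_sec=1, a worker sends at most one message per second, matching a single long code's MPS cap.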

Utilize Twilio Messaging Services (Number Pooling)

If 1 MPS is too slow for your business needs, you don't necessarily have to spend thousands of dollars on a Short Code. You can create a Twilio Messaging Service and add multiple long codes to its Sender Pool.

If you add 10 local numbers to a Messaging Service pool, Twilio's Copilot feature will automatically distribute your outbound messages across all 10 numbers, effectively increasing your aggregate throughput to 10 MPS. You simply change your API call to use the MessagingServiceSid instead of a hardcoded From phone number.
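The API-call change is small: pass the Messaging Service SID instead of a hardcoded sender. A sketch of the kwargs you would hand to client.messages.create (the helper function is illustrative; the parameter names follow the twilio-python client):

```python
# Illustrative helper: build kwargs for client.messages.create(**params).
# With a Messaging Service, Twilio's sender pool picks the outbound number,
# so you pass messaging_service_sid instead of a hardcoded from_ number.
def build_message_params(to_number: str, body: str,
                         messaging_service_sid: str = None,
                         from_number: str = None) -> dict:
    params = {"to": to_number, "body": body}
    if messaging_service_sid:
        params["messaging_service_sid"] = messaging_service_sid
    elif from_number:
        params["from_"] = from_number
    else:
        raise ValueError("Provide a Messaging Service SID or a from number")
    return params
```

Everything else about the send path stays the same; only the sender selection moves from your code into the Messaging Service.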


Step 4: Fix Webhook 503s (Decouple Processing)

If Twilio is logging 503 errors when trying to reach your server via webhooks, your server is taking too long to respond to status callbacks or incoming messages.

The Golden Rule of Webhooks: Acknowledge first, process later.

When Twilio hits your /webhook endpoint, follow this pattern:

  1. Immediately parse and validate the incoming HTTP payload.
  2. Push the payload into an internal asynchronous queue (e.g., Kafka, SQS, Redis) or drop it into a fast database table.
  3. Instantly return an HTTP 200 OK with an empty <Response></Response> TwiML snippet (or a standard JSON 200 OK if it's a delivery status callback).
  4. Have a separate background worker pick up the event to process the heavy logic (updating user records, triggering follow-up emails, etc.).

By ensuring your webhook endpoint always responds in less than 200ms, you will virtually eliminate "Twilio 503" and timeout errors on inbound webhook requests.

Code Example: Robust Sending with Exponential Backoff (Python)

```python
import os
import logging
from twilio.rest import Client
from twilio.base.exceptions import TwilioRestException
from tenacity import retry, wait_random_exponential, stop_after_attempt, retry_if_exception

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Initialize Twilio Client
client = Client(os.environ.get('TWILIO_ACCOUNT_SID'), os.environ.get('TWILIO_AUTH_TOKEN'))

def _is_retriable(exc):
    """Retry only on rate limits (429) and temporary unavailability (503)."""
    return isinstance(exc, TwilioRestException) and exc.status in (429, 503)

@retry(
    retry=retry_if_exception(_is_retriable),
    wait=wait_random_exponential(multiplier=1, max=30),  # exponential backoff with jitter, capped at 30s
    stop=stop_after_attempt(5),
    before_sleep=lambda retry_state: logger.warning(
        f"Rate limited/503 encountered. Retrying... (Attempt {retry_state.attempt_number})"
    )
)
def send_sms_with_backoff(to_number, from_number, body):
    """
    Sends an SMS using Twilio with automatic exponential backoff
    specifically for rate limits (429) and temporary service unavailability (503).
    """
    try:
        message = client.messages.create(
            body=body,
            from_=from_number,
            to=to_number
        )
        logger.info(f"Message sent successfully! SID: {message.sid}")
        return message.sid

    except TwilioRestException as e:
        if e.status in (429, 503):
            logger.warning(f"Twilio API throttled/unavailable (Status: {e.status}). Retrying via tenacity.")
        else:
            # Fail fast for 400 Bad Request, 401 Unauthorized, 404 Not Found, etc.
            logger.error(f"Non-retriable Twilio error: {e}")
        raise  # Only 429/503 trigger the retry predicate above

# Example Usage:
# try:
#     send_sms_with_backoff("+15558675309", "+15551234567", "Hello from the robust sender queue!")
# except Exception as final_error:
#     logger.error(f"Failed to send message after all retries: {final_error}")
```

Error Medic Editorial

The Error Medic Editorial team consists of Senior DevOps and Site Reliability Engineers specializing in API integrations, distributed systems, and cloud-native troubleshooting.
