Error Medic

SendGrid Rate Limit, 401/403/502 Errors & Webhook Failures: Complete Troubleshooting Guide

Fix SendGrid rate limit exceeded, 401 authentication failed, 403 forbidden, 502 bad gateway, timeouts, and webhook errors with step-by-step diagnostic commands.

Key Takeaways
  • SendGrid 429 rate limit errors occur when you exceed the per-second request limit (about 100 requests/second per API key on paid plans) or hit the daily sending limit tied to your plan; back off with exponential retry logic
  • 401 and 403 errors almost always trace to an invalid, revoked, or insufficiently-scoped API key — regenerate the key and verify the Authorization header uses Bearer format exactly
  • 502 bad gateway and connection refused errors indicate upstream infrastructure issues or IP block-listing; check SendGrid status page and your outbound firewall rules before escalating
  • Webhook delivery failures stem from TLS certificate mismatches, firewall rules blocking SendGrid's inbound IP ranges, or your endpoint returning non-2xx responses for so long that SendGrid's retry window expires and the events are dropped
  • Always instrument your integration with retry logic (3 attempts, exponential backoff starting at 1s) and dead-letter queue (DLQ) handling, because transient SendGrid errors are expected at scale
SendGrid Error Fix Approaches Compared
Error | Root Cause | Fix Method | Time to Resolve | Production Risk
429 Rate Limit | Exceeding 100 req/s or plan daily cap | Implement exponential backoff + request queuing | 30–60 min (code change) | Low (non-destructive retry)
401 Unauthorized | Invalid or missing API key | Regenerate key, fix Authorization header | 5–10 min | Low (no data loss)
403 Forbidden | Key lacks required permission scope | Create new key with Mail Send scope | 5–10 min | Low (no data loss)
502 Bad Gateway | SendGrid infra degradation or IP block | Wait for status resolution or rotate sending IP | 0–4 hours | Medium (may drop sends)
Connection Refused | Firewall blocking outbound :443 to SendGrid | Update egress rules to allow api.sendgrid.com:443 | 15–30 min | Medium (requires infra change)
Timeout | Slow TLS handshake or DNS resolution failure | Set explicit 10s timeout, check DNS TTL | 20–40 min | Low (retry covers it)
Webhook Not Firing | Endpoint returning 4xx/5xx, or IP not allowlisted | Return 200 immediately, allowlist SendGrid CIDRs | 20–30 min | Low (historical events replayable)

Understanding SendGrid API Errors

SendGrid's v3 REST API returns standard HTTP status codes, but the meaning is sometimes subtler than the code implies. Before diving into individual errors, always check https://status.sendgrid.com — roughly 20% of 502 and timeout reports are actually platform incidents unrelated to your code.


Step 1: Identify the Exact Error

Capture the full response body, not just the status code. SendGrid wraps errors in a JSON envelope:

{
  "errors": [
    {
      "message": "The provided authorization grant is invalid, expired, or revoked",
      "field": null,
      "help": "https://sendgrid.com/docs/API_Reference/Web_API_v3/How_To_Use_The_Web_API_v3/authentication.html"
    }
  ]
}

The message field is your primary diagnostic signal. Log it verbatim.
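
As a rough illustration of logging the envelope verbatim, the sketch below pulls every message out of a response body shaped like the one above. The function name `extract_error_messages` is hypothetical, and the fallback for non-JSON bodies (for example an HTML error page from a proxy) is an assumption about what you may encounter, not documented SendGrid behavior.

```python
import json


def extract_error_messages(body: str) -> list[str]:
    """Return one 'field: message' line per entry in an error envelope."""
    try:
        payload = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        # Not JSON (e.g. an HTML error page): keep the raw text for the log.
        return [body.strip()] if isinstance(body, str) and body.strip() else []
    if not isinstance(payload, dict):
        return []
    lines = []
    for err in payload.get("errors", []):
        field = err.get("field") or "-"  # null field becomes "-"
        lines.append(f"{field}: {err.get('message', '')}")
    return lines


sample = (
    '{"errors": [{"message": "The provided authorization grant is invalid, '
    'expired, or revoked", "field": null, "help": null}]}'
)
for line in extract_error_messages(sample):
    print(line)
```

Logging these lines next to the HTTP status code gives you both signals in one place when you search incidents later.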


Step 2: Fix 429 — Rate Limit Exceeded

SendGrid enforces two distinct rate limits that developers frequently conflate:

  • Per-second limit: 100 API requests per second per API key on paid plans; lower on the free tier.
  • Daily sending limit: Varies by plan (e.g., 100 emails/day on the free plan, 40,000 emails/month on Essentials).

When you hit the per-second limit, the response includes an X-RateLimit-Reset header (Unix timestamp) and X-RateLimit-Remaining: 0. Client-side token bucket or leaky bucket rate limiting keeps you under the limit; when a 429 still slips through, wait until the reset time and retry with backoff:

import os
import time

import sendgrid
from python_http_client.exceptions import HTTPError
from sendgrid.helpers.mail import Mail

def send_with_backoff(message: Mail, max_retries: int = 3):
    sg = sendgrid.SendGridAPIClient(api_key=os.environ['SENDGRID_API_KEY'])
    for attempt in range(max_retries):
        try:
            return sg.send(message)
        except HTTPError as err:
            # The client raises on 4xx/5xx responses; only a 429 is retried here.
            if err.status_code != 429:
                raise
            # X-RateLimit-Reset is a Unix timestamp; fall back to 2s from now if absent.
            reset_at = int(err.headers.get('X-RateLimit-Reset', time.time() + 2))
            # Wait for the window to reset, plus exponential backoff (1s, 2s, 4s).
            time.sleep(max(reset_at - time.time(), 0) + (2 ** attempt))
    raise RuntimeError('Max retries exceeded on rate limit')
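
The helper above reacts to a 429 after it happens. A client-side token bucket, as mentioned earlier, tries to avoid the 429 in the first place. The following is a minimal sketch, not a production limiter: the class name `TokenBucket` and the injectable `clock` parameter are illustrative choices, and the rate you configure should come from your own plan's limits rather than from this example.

```python
import time


class TokenBucket:
    """Allow up to `rate` operations per second with bursts of `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False without blocking otherwise."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available (0.0 if one is ready)."""
        self._refill()
        return max(0.0, (1 - self.tokens) / self.rate)
```

A caller would check `try_acquire()` before each API request and sleep for `wait_time()` when it returns False. The bucket is not thread-safe as written; wrap it in a lock if several threads share it.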

For bulk sending, use SendGrid's /v3/mail/send endpoint with the personalizations array (up to 1,000 recipients per API call) to reduce request volume by orders of magnitude.
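
To make the batching idea concrete, here is a sketch that splits a recipient list into request payloads of at most 1,000 personalizations each. The function name `build_batches` and the example addresses are hypothetical; the payload uses the basic v3 mail send fields (`personalizations`, `from`, `subject`, `content`) and omits everything else a real message might need.

```python
from typing import Iterator

MAX_PERSONALIZATIONS = 1000  # per-request ceiling quoted in the text above


def build_batches(recipients: list[str], sender: str, subject: str, text: str,
                  batch_size: int = MAX_PERSONALIZATIONS) -> Iterator[dict]:
    """Yield one mail send payload per chunk of recipients."""
    for start in range(0, len(recipients), batch_size):
        chunk = recipients[start:start + batch_size]
        yield {
            # One personalization per recipient, so nobody sees other addresses.
            "personalizations": [{"to": [{"email": addr}]} for addr in chunk],
            "from": {"email": sender},
            "subject": subject,
            "content": [{"type": "text/plain", "value": text}],
        }


recipients = [f"user{i}@example.com" for i in range(2500)]
batches = list(build_batches(recipients, "noreply@example.com", "Hello", "Hi there"))
print([len(b["personalizations"]) for b in batches])  # [1000, 1000, 500]
```

In this example 2,500 recipients become 3 API requests instead of 2,500, which is where the reduction in request volume comes from.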


Step 3: Fix 401 — Authentication Failed

The error message "The provided authorization grant is invalid, expired, or revoked" means one of the following:

  1. Wrong header format: Must be Authorization: Bearer SG.xxxxx, not Authorization: SG.xxxxx or Authorization: Basic ...
  2. Key was revoked: Log in to SendGrid dashboard → Settings → API Keys. If the key shows "Revoked", generate a new one.
  3. Whitespace/encoding issue: API keys copy-pasted from terminals sometimes include trailing newlines. Trim your key.
  4. Wrong environment: Your staging service is using a production key associated with a different account, or vice versa.

Quick validation:

curl -X GET "https://api.sendgrid.com/v3/user/profile" \
  -H "Authorization: Bearer $SENDGRID_API_KEY" \
  -H "Content-Type: application/json"

A successful response returns your account profile. A 401 confirms the key itself is the problem.
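
Several of the causes listed above (wrong header format, stray whitespace) can be caught before a request is ever sent. The sketch below is one possible guard; `build_auth_header` is a hypothetical helper, and the check for an `SG.` prefix is an assumption based on the key format shown in this guide.

```python
def build_auth_header(raw_key: str) -> dict[str, str]:
    """Return a Bearer Authorization header, rejecting obviously malformed keys."""
    key = raw_key.strip()  # drops trailing newlines and spaces from copy-paste
    if not key:
        raise ValueError("API key is empty")
    if key.lower().startswith("bearer "):
        raise ValueError("pass the bare key; the Bearer prefix is added here")
    if not key.startswith("SG."):
        raise ValueError("key does not start with 'SG.'; check which secret was loaded")
    if any(ch.isspace() for ch in key):
        raise ValueError("key contains embedded whitespace")
    return {"Authorization": f"Bearer {key}"}


print(build_auth_header("SG.example-key\n"))
```

Failing fast at startup with a clear message is usually easier to debug than a 401 from the first production send.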


Step 4: Fix 403 — Forbidden / Insufficient Permissions

403 differs from 401: your key is valid, but it lacks the required scope. This is common when:

  • Key was created with "Restricted Access" and mail.send was not checked
  • You are calling an endpoint (e.g., /v3/suppression/unsubscribes) that requires suppression.read or suppression.write

Fix: Go to SendGrid → Settings → API Keys → Edit the key → expand permissions. For sending mail, ensure Mail Send → Full Access is enabled. For webhook management, enable Mail Settings.

Never use a Full Access key in production application code. Use a Restricted key with only the scopes your app needs.
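
One way to confirm a scope problem before editing the key is to compare what the key has against what your app needs. The sketch below assumes the `GET /v3/scopes` endpoint returns a JSON object with a `scopes` array for the calling key; the function names and the required-scope set are illustrative, and `fetch_scopes` is defined but not called here because it needs a real key and network access.

```python
import json
import urllib.request


def fetch_scopes(api_key: str) -> set[str]:
    """Fetch the scopes granted to this key (performs a network call)."""
    req = urllib.request.Request(
        "https://api.sendgrid.com/v3/scopes",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return set(json.load(resp).get("scopes", []))


def missing_scopes(granted: set[str], required: set[str]) -> set[str]:
    """Return the required scopes that the key does not have."""
    return required - granted


# Example with static data: a key that can send mail but cannot read suppressions.
granted = {"mail.send"}
required = {"mail.send", "suppression.read"}
print(sorted(missing_scopes(granted, required)))
```

In a deployment check you would pass `fetch_scopes(os.environ["SENDGRID_API_KEY"])` as `granted` and fail the rollout if the result is non-empty.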


Step 5: Fix 502 — Bad Gateway

A 502 from SendGrid means their edge proxy received no valid response from the backend. Your options:

  1. Check status page first: https://status.sendgrid.com. If there's an active incident, implement a circuit breaker and queue sends locally.
  2. Check if your IP is block-listed: SendGrid may be blocking requests from IP ranges associated with abuse. Test from a different network or CI runner.
  3. Check DNS resolution: dig api.sendgrid.com should resolve to Twilio-owned IPs. If you see an internal IP or NXDOMAIN, your DNS resolver has a stale or poisoned record.
  4. TLS inspection proxies: Corporate networks or Kubernetes service meshes that perform TLS inspection can break HTTPS to external APIs. Confirm your pod's egress path.
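
The circuit breaker and local queue suggested in option 1 could look roughly like the sketch below. All names (`CircuitBreaker`, `send_or_queue`, `pending`) and the threshold values are illustrative assumptions; a real deployment would persist the queue rather than hold it in memory.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for `reset_after` seconds."""

    def __init__(self, failure_threshold: int = 5, reset_after: float = 60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # After the cool-down, allow traffic again; the next failure
        # reopens the breaker immediately because the count is still high.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


def send_or_queue(breaker: CircuitBreaker, send, message, pending: list) -> str:
    """Try to send; on failure or an open breaker, park the message locally."""
    if not breaker.allow():
        pending.append(message)
        return "queued"
    try:
        send(message)
    except Exception:
        breaker.record_failure()
        pending.append(message)
        return "queued"
    breaker.record_success()
    return "sent"
```

Here `send` is any callable that raises on failure, such as a wrapper around your SendGrid client. Once the status page shows recovery, drain `pending` through the same function so the breaker still protects you if the incident recurs.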

Step 6: Fix Webhook Not Working

SendGrid Event Webhooks POST to your endpoint for every email event (delivered, opened, bounced, etc.). The most common failure modes:

Your endpoint returns non-2xx: SendGrid retries at increasing intervals for up to 24 hours after the event occurs, then stops and the event is lost. Check your application logs, because a 500 during webhook processing is the #1 silent failure.

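IP allowlisting: If your endpoint sits behind a WAF or security group, it must admit SendGrid's webhook traffic. Be aware that GET /v3/ips lists the sending IPs assigned to your account, which is useful for auditing but is not a list of webhook source addresses, so do not build a webhook allowlist from it alone: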

curl -s https://api.sendgrid.com/v3/ips \
  -H "Authorization: Bearer $SENDGRID_API_KEY" | jq '.[].ip'

As of 2025, SendGrid webhook traffic also originates from Twilio's IP ranges published at https://www.twilio.com/en-us/help/account/general/twilio-ip-addresses.

Signed webhooks: If you enabled webhook signature verification, your endpoint must validate the X-Twilio-Email-Event-Webhook-Signature and X-Twilio-Email-Event-Webhook-Timestamp headers. A mismatch causes your endpoint to reject valid payloads:

import os

from sendgrid.helpers.eventwebhook import EventWebhook, EventWebhookHeader

def verify_webhook(request):
    # Assumes a Flask-style request object: request.data holds the raw body bytes.
    ew = EventWebhook()
    ec = ew.convert_public_key(os.environ['SENDGRID_WEBHOOK_PUBLIC_KEY'])
    valid = ew.verify_signature(
        request.data.decode('utf-8'),  # verify the raw, unparsed body
        request.headers[EventWebhookHeader.SIGNATURE],
        request.headers[EventWebhookHeader.TIMESTAMP],
        public_key=ec,
    )
    if not valid:
        return 'Invalid signature', 403
    # process events
    return '', 200

Always return 200 immediately and process asynchronously. Push events to a queue (Redis, SQS, RabbitMQ) and return 200 OK within 10 seconds, or SendGrid will consider the delivery failed.
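
The acknowledge-then-process pattern can be sketched with the standard library alone. In the example below an in-process `queue.Queue` stands in for Redis, SQS, or RabbitMQ, and `handle_webhook`, `worker`, and `start_worker` are hypothetical names; wiring `handle_webhook` into your web framework is left out.

```python
import json
import queue
import threading

events = queue.Queue(maxsize=10_000)  # in-process stand-in for a real broker


def handle_webhook(raw_body: bytes) -> tuple[str, int]:
    """Validate the shape, enqueue the batch, and acknowledge right away."""
    try:
        batch = json.loads(raw_body)
    except ValueError:
        return "invalid JSON", 400
    if not isinstance(batch, list):
        return "expected a JSON array", 400
    try:
        events.put_nowait(batch)
    except queue.Full:
        return "busy", 503  # non-2xx, so the delivery is retried later
    return "", 200


def worker(process) -> None:
    """Drain the queue forever, calling process(event) for each event."""
    while True:
        batch = events.get()
        try:
            for event in batch:
                process(event)
        except Exception as exc:
            # A real service would log this and dead-letter the batch.
            print(f"event processing failed: {exc!r}")
        finally:
            events.task_done()


def start_worker(process) -> threading.Thread:
    thread = threading.Thread(target=worker, args=(process,), daemon=True)
    thread.start()
    return thread
```

The handler does no database writes or outbound calls, so its response time stays flat regardless of how slow event processing becomes.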


Step 7: Fix Connection Refused / Timeout

These errors originate in your infrastructure, not SendGrid's:

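  • Connection refused: Your outbound egress on port 443 is blocked. Test with nc -zv api.sendgrid.com 443. If it fails, update your security group / NSG / iptables to allow TCP egress to 0.0.0.0/0:443 or specifically to 167.89.0.0/17 (SendGrid's primary CIDR). If you restrict egress to that range, re-test after the change, because the addresses behind api.sendgrid.com can differ from SendGrid's mail-sending ranges.
  • Timeout: Usually a DNS caching misconfiguration (for example, a resolver or client caching external API hosts for longer than 300s and serving stale records), or an HTTP client with no explicit timeout. Always set connect_timeout=5s, read_timeout=30s in your HTTP client. Never rely on the library or OS default, which is often no timeout at all.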
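
To tell these failure modes apart from application code, a small probe with an explicit connect timeout can help. This is a sketch using only the standard library: `classify_failure` and `probe` are hypothetical names, the 5-second default mirrors the connect timeout suggested above, and the probe checks only TCP reachability, not TLS or the API itself.

```python
import socket


def classify_failure(exc: BaseException) -> str:
    """Map a low-level connection error to the categories used in this guide."""
    if isinstance(exc, socket.gaierror):
        return "dns"      # hostname did not resolve
    if isinstance(exc, ConnectionRefusedError):
        return "refused"  # something actively rejected the connection
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"  # packets silently dropped or path too slow
    return "other"


def probe(host: str = "api.sendgrid.com", port: int = 443,
          connect_timeout: float = 5.0) -> str:
    """Attempt a TCP connection and report 'ok' or a failure category."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return "ok"
    except OSError as exc:
        return classify_failure(exc)
```

Calling `probe()` from the affected host gives a quick first split: "dns" points at the resolver, "refused" or "timeout" at egress rules, and "ok" means the problem is further up the stack.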

#!/usr/bin/env bash
# SendGrid Diagnostic Script
# Run from the host experiencing issues
# Usage: SENDGRID_API_KEY=SG.xxx bash sendgrid_diag.sh

set -euo pipefail
SG_API="https://api.sendgrid.com/v3"
KEY="${SENDGRID_API_KEY:-}"

if [[ -z "$KEY" ]]; then
  echo "ERROR: Set SENDGRID_API_KEY environment variable"
  exit 1
fi

echo "=== 1. DNS Resolution ==="
dig +short api.sendgrid.com || echo "FAIL: DNS resolution failed"

printf '\n=== 2. TCP Connectivity (port 443) ===\n'
nc -zv -w5 api.sendgrid.com 443 2>&1 && echo "PASS" || echo "FAIL: Check egress firewall rules"

printf '\n=== 3. TLS Handshake ===\n'
curl -sSo /dev/null -w "TLS: %{ssl_verify_result} | HTTP: %{http_code} | Time: %{time_total}s\n" \
  --max-time 10 https://api.sendgrid.com/v3/user/profile \
  -H "Authorization: Bearer $KEY" || echo "FAIL: TLS or timeout"

printf '\n=== 4. API Key Validation ===\n'
# "|| true" keeps set -e from aborting on a network failure; the status is then 000.
HTTP_STATUS=$(curl -s -o /tmp/sg_response.json -w "%{http_code}" \
  --max-time 10 \
  -H "Authorization: Bearer $KEY" \
  "$SG_API/user/profile") || true
echo "HTTP Status: $HTTP_STATUS"
if [[ "$HTTP_STATUS" == "200" ]]; then
  echo "PASS: API key is valid"
  jq -r '.email // "(no email in response)"' /tmp/sg_response.json
elif [[ "$HTTP_STATUS" == "401" ]]; then
  echo "FAIL: 401 - Invalid or revoked API key"
  jq '.errors[].message' /tmp/sg_response.json
elif [[ "$HTTP_STATUS" == "403" ]]; then
  echo "FAIL: 403 - Key lacks required scope"
else
  echo "UNEXPECTED STATUS: $HTTP_STATUS"
  cat /tmp/sg_response.json 2>/dev/null || true
fi

printf '\n=== 5. Rate Limit Headers ===\n'
curl -s -I --max-time 10 \
  -H "Authorization: Bearer $KEY" \
  "$SG_API/user/profile" | grep -i 'x-ratelimit' || echo "No rate limit headers found"

printf '\n=== 6. Check Suppression List (bounces, spam reports) ===\n'
curl -s --max-time 10 \
  -H "Authorization: Bearer $KEY" \
  "$SG_API/suppression/bounces?limit=5" | jq 'length' 2>/dev/null || echo "Could not fetch bounces"

printf '\n=== 7. Webhook Endpoint Test (if WEBHOOK_URL is set) ===\n'
if [[ -n "${WEBHOOK_URL:-}" ]]; then
  # "|| true" keeps set -e from aborting when the endpoint is unreachable.
  WH_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    --max-time 10 -X POST "$WEBHOOK_URL" \
    -H "Content-Type: application/json" \
    -d '[{"event":"test","email":"test@example.com"}]') || true
  echo "Webhook POST status: $WH_STATUS"
  [[ "$WH_STATUS" =~ ^2 ]] && echo "PASS" || echo "FAIL: Endpoint must return 2xx"
else
  echo "SKIP: Set WEBHOOK_URL to test your webhook endpoint"
fi

printf '\n=== Diagnostics Complete ===\n'

Error Medic Editorial

Error Medic Editorial is a team of senior SREs and backend engineers with 10+ years of production experience operating high-volume email and notification pipelines on SendGrid, AWS SES, and Postmark. Our guides are validated against live API behavior and updated as vendor APIs evolve.
