
Troubleshooting Grafana Alertmanager: Fixing Sync Failures, Connection Drops, and Mimir/Loki Integration Errors

Comprehensive guide to fixing Grafana Alertmanager failures, including external Alertmanager sync issues, Grafana Cloud auth errors, and Mimir/Loki tenant misconfigurations.

Key Takeaways
  • Root Cause 1: Network isolation or DNS resolution failures preventing the Grafana server from reaching the external Alertmanager API (e.g., `no such host`).
  • Root Cause 2: Misconfigured or missing `X-Scope-OrgID` headers causing 401/403 errors when routing alerts to Grafana Mimir or Loki Alertmanagers.
  • Root Cause 3: Legacy alerting configurations conflicting with Grafana Unified Alerting (GUA), leading to duplicate or silently dropped alert payloads.
  • Quick Fix: Validate API reachability from within the Grafana container using `curl`, check the Contact Points provisioning YAML for syntax errors, and ensure the Alertmanager URL does not include a trailing slash.
Alertmanager Configuration & Fix Approaches Compared
| Method | When to Use | Time to Implement | Risk Level |
| --- | --- | --- | --- |
| Grafana UI (Contact Points) | Initial setup, quick debugging, or isolated testing of payload formats. | 5 mins | High (configuration drift, manual errors) |
| File Provisioning (YAML) | Production environments, GitOps workflows, Kubernetes deployments. | 15 mins | Low (version controlled, reproducible) |
| mimirtool (CLI) | Configuring the distributed Alertmanager in Grafana Mimir or Grafana Cloud. | 10 mins | Medium (requires API key management) |
| Direct API POST | Verifying Alertmanager receiver functionality independently of Grafana. | 5 mins | Low (read/test only) |

Understanding Grafana and Alertmanager Architecture

With the move to Grafana 8+ and Grafana Unified Alerting (GUA), the way alerts are processed shifted significantly. Grafana now includes a built-in Alertmanager, but many enterprise architectures rely on a Grafana external Alertmanager, or on distributed Alertmanagers backed by Grafana Mimir or Grafana Loki.

Troubleshooting issues in this ecosystem requires understanding the data flow: Grafana evaluates alert rules (or queries backend data sources like Prometheus/Loki) -> The state changes to Firing -> Grafana constructs a payload -> Grafana pushes this payload via HTTP POST to the Alertmanager API (/api/v2/alerts) -> Alertmanager routes, groups, and deduplicates the alert -> Alertmanager sends the notification to the receiver (Slack, PagerDuty, Webhook).
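To make the "Grafana constructs a payload" step concrete: the body Grafana POSTs to /api/v2/alerts is a JSON array of alert objects carrying labels, annotations, and timestamps. Here is a minimal sketch of such a payload (the label values, timestamp, and URL are illustrative, not captured from a real instance) that you can validate locally before involving any network hop:

```shell
#!/bin/sh
# Build and locally validate a minimal /api/v2/alerts payload.
# All field values below are illustrative placeholders.
cat > /tmp/am-payload.json <<'EOF'
[
  {
    "labels": {
      "alertname": "HighErrorRate",
      "severity": "critical"
    },
    "annotations": {
      "summary": "5xx rate above threshold"
    },
    "startsAt": "2024-01-01T00:00:00Z",
    "generatorURL": "http://grafana:3000/alerting"
  }
]
EOF

# Syntax-check the JSON offline (python3 assumed available).
if command -v python3 >/dev/null 2>&1; then
  python3 -m json.tool /tmp/am-payload.json > /dev/null && echo "payload: valid JSON"
fi
```

A malformed version of exactly this structure is what produces the `400 Bad Request` sync errors discussed below.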

Failures can occur at any boundary in this pipeline. This guide focuses on the critical boundary between Grafana and the Alertmanager.

Common Error Messages

You are likely reading this guide because you have encountered one of the following exact error messages in your Grafana server logs (/var/log/grafana/grafana.log or kubectl logs deployment/grafana):

  • Failed to send alert to Alertmanager: Post "http://alertmanager:9093/api/v2/alerts": dial tcp: lookup alertmanager on 10.96.0.10:53: no such host
  • level=error msg="unable to sync alertmanager configuration" err="bad response status 400 Bad Request"
  • level=error msg="Failed to send alert notifications" err="context deadline exceeded"
  • level=error msg="failed to send alerts to all alertmanagers" err="1 errors: Post \"https://alertmanager-us-central1.grafana.net/api/prom/push\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
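When hunting for these lines, it helps to filter delivery-related errors out of the log stream first. A small sketch (the grep pattern is our assumption; adjust it to your log format) with sample log lines inlined so the filter can be demonstrated offline:

```shell
#!/bin/sh
# Filter Alertmanager-delivery errors out of a Grafana log stream.
filter_am_errors() {
  grep -iE "alertmanager|failed to send alert"
}

# Sample log lines stand in for a real log file here.
cat <<'EOF' | filter_am_errors
level=info msg="HTTP Server Listen" address=0.0.0.0:3000
level=error msg="Failed to send alert to Alertmanager" url=http://alertmanager:9093
level=error msg="unable to sync alertmanager configuration" err="bad response status 400"
level=info msg="Request Completed" method=GET path=/api/health
EOF
```

In a live environment you would pipe `kubectl logs deployment/grafana` or `tail -f /var/log/grafana/grafana.log` into the same filter.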

Step 1: Diagnosing Network and DNS Issues

The most frequent cause of an external Alertmanager failing to receive alerts from Grafana is network reachability. Grafana must be able to resolve the hostname and establish a TCP connection to the Alertmanager port (usually 9093).

Validation from the Grafana Container

Do not test connectivity from your local workstation. You must test from the environment where the Grafana process is running.

  1. Exec into the Grafana container/server:

    kubectl exec -it deploy/grafana -n monitoring -- /bin/sh
    # or
    docker exec -it grafana /bin/sh
    
  2. Test DNS Resolution:

    nslookup alertmanager.monitoring.svc.cluster.local
    

    If this fails, your CoreDNS or equivalent DNS service is failing to resolve the service name. Check your Kubernetes service definitions.

  3. Test API Reachability with Curl:

    curl -v http://alertmanager.monitoring.svc.cluster.local:9093/-/ready
    

    You should receive a 200 OK response. If you get Connection refused, the Alertmanager process is not binding to 0.0.0.0 or the port is mismatched. If you get context deadline exceeded or it hangs, check your NetworkPolicies, security groups, or firewalls blocking traffic between the Grafana and Alertmanager nodes.
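The failure modes above map to distinct curl exit codes, which can save a round of guessing. A minimal sketch (the mapping follows curl's documented exit codes; the helper function is our own):

```shell
#!/bin/sh
# Translate common curl exit codes into Alertmanager-specific next steps.
explain_curl_exit() {
  case "$1" in
    0)  echo "OK: endpoint reachable" ;;
    6)  echo "DNS failure: check CoreDNS / the Kubernetes Service name" ;;
    7)  echo "Connection refused: wrong port, or Alertmanager not bound to 0.0.0.0" ;;
    28) echo "Timeout: suspect NetworkPolicies, security groups, or firewalls" ;;
    *)  echo "curl exit $1: rerun with curl -v for details" ;;
  esac
}

# Demonstrate the mapping with the codes you are most likely to see:
for code in 0 6 7 28; do
  printf '%2s -> ' "$code"
  explain_curl_exit "$code"
done
```

In practice, run `curl -sf -o /dev/null --max-time 5 "$AM_URL/-/ready"; explain_curl_exit $?` from inside the Grafana container.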


Step 2: Fixing Grafana External Alertmanager Configuration

If the network is healthy, the issue usually lies in how Grafana is configured to talk to the external Alertmanager.

Legacy grafana.ini vs. Provisioning UI

In older versions of Grafana, you might have configured the external Alertmanager in grafana.ini under the [alerting] or [unified_alerting] blocks.

Warning: Relying solely on grafana.ini for contact points can lead to silent failures if the API schema changes. The modern approach is to use the UI or Provisioning YAML.

If you are using Provisioning YAML (/etc/grafana/provisioning/alerting/alertmanager.yaml), verify the syntax carefully:

apiVersion: 1
contactPoints:
  - orgId: 1
    name: 'External Alertmanager'
    receivers:
      - uid: ext-am-1
        type: prometheus-alertmanager
        settings:
          url: http://alertmanager.monitoring.svc.cluster.local:9093

Crucial Fix: Ensure the url does NOT contain a trailing slash or the /api/v2/alerts path; Grafana appends the correct API path itself. A trailing slash (http://alertmanager:9093/) produces a malformed double-slash URL, and including the path (http://alertmanager:9093/api/v2/alerts) makes Grafana call http://alertmanager:9093/api/v2/alerts/api/v2/alerts, yielding a 404 Not Found.
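If the URL comes from a template (Helm values, CI variables), it is worth normalizing it before it reaches the provisioning file. A defensive sketch using POSIX parameter expansion (the function name is ours):

```shell
#!/bin/sh
# Strip a trailing slash and an accidentally appended API path from an
# Alertmanager base URL, since Grafana appends /api/v2/alerts itself.
normalize_am_url() {
  url="${1%/}"                 # drop one trailing slash, if present
  url="${url%/api/v2/alerts}"  # drop the API path, if someone included it
  printf '%s\n' "$url"
}

normalize_am_url "http://alertmanager:9093/"
normalize_am_url "http://alertmanager:9093/api/v2/alerts"
```

Both calls print the clean base URL `http://alertmanager:9093`, which is the only form Grafana should be given.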


Step 3: Grafana Cloud Alertmanager & Mimir Integrations

Integrating Grafana with Grafana Cloud Alertmanager or a self-hosted Grafana Mimir Alertmanager introduces multi-tenancy. Multi-tenancy requires strict authentication, usually handled via HTTP headers.

The Missing Tenant ID Error (401 Unauthorized or 400 Bad Request)

If you see authorization errors when syncing configurations or firing alerts to Mimir/Loki, you are likely missing the X-Scope-OrgID header. Mimir requires this header to know which tenant's Alertmanager configuration to apply the alert to.
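Tenant IDs themselves have constraints: Mimir's documentation limits them to 150 bytes, alphanumerics plus a small set of special characters, and forbids the literal values `.` and `..`. A rough pre-flight check (our own helper, not a Mimir tool; treat it as a sanity check only):

```shell
#!/bin/sh
# Rough validation of an X-Scope-OrgID value, based on Mimir's documented
# tenant-ID restrictions (<=150 bytes; alphanumerics plus ! - _ . * ' ( );
# never "." or "..").
valid_tenant_id() {
  id="$1"
  [ -n "$id" ] || return 1
  [ "${#id}" -le 150 ] || return 1
  case "$id" in
    .|..) return 1 ;;
  esac
  printf '%s' "$id" | grep -Eq "^[A-Za-z0-9!_.*'()-]+$"
}

valid_tenant_id "tenant-a" && echo "tenant-a: ok"
valid_tenant_id "bad tenant" || echo "bad tenant: rejected (contains a space)"
```

A tenant ID that fails this check will typically be rejected by Mimir with a 4xx response before routing is even attempted.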

Fixing Mimir Alertmanager Integration

If configuring via the Grafana UI (Alerting -> Alertmanagers -> Add Alertmanager):

  1. Set the URL to your Mimir Alertmanager endpoint (e.g., http://mimir-gateway/alertmanager).
  2. Under Custom HTTP headers, add:
    • Header: X-Scope-OrgID
    • Value: <your-tenant-id> (e.g., tenant-a or anonymous if auth is disabled but multitenancy is enabled).

If configuring via Provisioning:

apiVersion: 1
contactPoints:
  - orgId: 1
    name: 'Mimir Alertmanager'
    receivers:
      - uid: mimir-am
        type: prometheus-alertmanager
        settings:
          url: http://mimir-gateway.mimir.svc.cluster.local/alertmanager
          httpHeaderName1: 'X-Scope-OrgID'
          httpHeaderValue1: 'tenant-a'

Grafana Cloud Specifics

For the Grafana Cloud Alertmanager, the endpoint URL is region-specific and authentication is handled via Basic Auth. Ensure you are using a Cloud Access Policy token with alerts:write permissions.

  • URL: https://alertmanager-<region>.grafana.net
  • Basic Auth User: <your-cloud-username/instance-id>
  • Basic Auth Password: <your-access-policy-token>
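Basic Auth is just a base64-encoded `user:password` pair sent in the Authorization header, so you can construct it by hand when testing with curl. A sketch with placeholder credentials (the instance ID and token below are invented; never commit real tokens):

```shell
#!/bin/sh
# Build the Authorization header Grafana sends for Basic Auth.
# INSTANCE_ID and TOKEN are placeholders, not real credentials.
INSTANCE_ID="123456"
TOKEN="glc_placeholder_token"
AUTH=$(printf '%s:%s' "$INSTANCE_ID" "$TOKEN" | base64 | tr -d '\n')
echo "Authorization: Basic $AUTH"

# Equivalent curl shorthand for a manual test:
#   curl -u "$INSTANCE_ID:$TOKEN" https://alertmanager-<region>.grafana.net/...
```

If a hand-built request with this header succeeds while Grafana's does not, the problem is in the contact point configuration rather than the token.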

Step 4: Troubleshooting High Availability (HA) Alertmanager Gossip

If you are running multiple instances of Alertmanager (HA mode) and users complain about receiving duplicate alert notifications (e.g., two Slack messages for the same incident), your external Alertmanager cluster is suffering from a "split-brain" scenario.

Alertmanager instances use a gossip protocol over TCP/UDP port 9094 to synchronize silence states and notification logs. If Grafana pushes an alert to an external AM cluster, and the cluster members cannot communicate, both members will independently evaluate the routing tree and send the notification.
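Because every replica must list its peers, the `--cluster.peer` flag set grows with the replica count, and hand-editing it invites typos that cause exactly this split-brain. A small generator sketch (the pod and headless-service names assume the conventional `<statefulset>-<ordinal>.<headless-svc>` form):

```shell
#!/bin/sh
# Generate --cluster.peer flags for an N-replica Alertmanager StatefulSet.
# Naming assumes pods alertmanager-0..N-1 behind a headless service.
REPLICAS=3
SVC="alertmanager-headless.monitoring.svc.cluster.local"

i=0
while [ "$i" -lt "$REPLICAS" ]; do
  echo "--cluster.peer=alertmanager-${i}.${SVC}:9094"
  i=$((i + 1))
done
```

Paste the emitted flags into the container args; regenerating them whenever the replica count changes keeps the gossip ring consistent.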

Diagnosing Gossip Failures

Check the Alertmanager logs for gossip errors:

kubectl logs -l app=alertmanager -c alertmanager | grep gossip

Look for: msg="Failed to join cluster" err="1 error occurred: Failed to resolve alertmanager-0.alertmanager-headless:9094: no such host"

Fixing Gossip Synchronization

  1. Ensure you have a headless service in Kubernetes specifically for the gossip ring.
  2. Pass the --cluster.peer flag correctly to the Alertmanager binary arguments.
# Kubernetes deployment args snippet for Alertmanager
args:
  - "--config.file=/etc/alertmanager/config.yml"
  - "--storage.path=/alertmanager"
  - "--cluster.peer=alertmanager-0.alertmanager-headless.monitoring.svc.cluster.local:9094"
  - "--cluster.peer=alertmanager-1.alertmanager-headless.monitoring.svc.cluster.local:9094"
  3. Ensure your Kubernetes NetworkPolicy allows TCP and UDP traffic on port 9094 between the Alertmanager pods.
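The NetworkPolicy requirement can be expressed as a manifest like the following sketch (the namespace and pod labels are assumptions; match them to your deployment):

```yaml
# Illustrative policy allowing gossip between Alertmanager pods on port 9094.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: alertmanager-gossip
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app: alertmanager
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: alertmanager
      ports:
        - protocol: TCP
          port: 9094
        - protocol: UDP
          port: 9094
```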

Step 5: Advanced Debugging with amtool and Payload Inspection

Sometimes Grafana successfully sends the alert, but Alertmanager drops it or routes it to a "null" receiver. To isolate if the problem is Grafana generating a bad payload or Alertmanager misrouting it, bypass Grafana entirely.

Manually Pushing an Alert Payload

Use the Alertmanager v2 API to push a dummy alert (the legacy /api/v1/alerts endpoint is deprecated and has been removed in recent Alertmanager releases). If this succeeds and routes correctly, the issue is Grafana's configuration. If it fails to route, the issue is in your alertmanager.yml routing tree.

curl -XPOST -H "Content-Type: application/json" http://alertmanager.monitoring.svc.cluster.local:9093/api/v2/alerts -d '[
  {
    "labels": {
      "alertname": "TestManualAlert",
      "severity": "critical",
      "service": "web"
    },
    "annotations": {
      "summary": "This is a test alert bypassing Grafana."
    }
  }
]'

Check the Alertmanager UI (http://<alertmanager-ip>:9093/#/alerts). If "TestManualAlert" appears, your Alertmanager is healthy and accepting external payloads. Go back to the Grafana contact point logs and confirm that the labels your Alertmanager route matches on are actually being generated by your Grafana alert rule. You can also test routing offline with amtool, e.g. amtool config routes test --config.file=alertmanager.yml severity=critical, which prints the receiver a given label set would match.

Conclusion

Troubleshooting the Grafana to Alertmanager pipeline requires systematically verifying network boundaries, authentication headers (especially for Mimir/Loki/Cloud), and payload structures. By utilizing local curl tests, verifying provisioning syntax, and inspecting HA gossip logs, you can rapidly isolate and resolve alert delivery failures.

Appendix: End-to-End Diagnostic Script

#!/bin/bash
# Diagnostic script to test Grafana to Alertmanager connectivity and API health

AM_URL="http://alertmanager.monitoring.svc.cluster.local:9093"
TENANT_HEADER="X-Scope-OrgID: my-tenant" # Optional: Only for Mimir/Loki

echo "1. Testing basic network connectivity..."
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "${AM_URL}/-/ready")
if [ $? -ne 0 ] || [ "${HTTP_CODE}" != "200" ]; then
  echo "ERROR: Cannot reach ${AM_URL} (HTTP status: ${HTTP_CODE}). Check DNS and firewalls."
  exit 1
fi
echo "OK: readiness endpoint returned HTTP 200."

echo -e "\n2. Sending a test alert payload to Alertmanager API..."
curl -X POST -H "Content-Type: application/json" -H "${TENANT_HEADER}" ${AM_URL}/api/v2/alerts -d '[
  {
    "labels": {
      "alertname": "DiagnosticTestAlert",
      "severity": "info",
      "source": "troubleshooting-script"
    },
    "annotations": {
      "summary": "Validating Alertmanager API ingestion pipeline."
    }
  }
]'

echo -e "\n\n3. Checking Alertmanager logs for recent errors (Kubernetes)..."
kubectl logs -l app=alertmanager -n monitoring --tail=20 | grep -i -E "error|warn|failed|gossip"

Error Medic Editorial

Error Medic Editorial comprises senior Site Reliability Engineers and DevOps practitioners dedicated to solving complex infrastructure, observability, and cloud-native integration challenges.
