Fixing 'Connection Refused' on Azure VMs: A Comprehensive Troubleshooting Guide
Diagnose and resolve 'Connection Refused' (ERR_CONNECTION_REFUSED) and SSH/RDP timeouts on Azure Virtual Machines. Learn to fix NSGs, firewalls, and throttling.
- Network Security Group (NSG) misconfigurations are the most common cause of blocked connections to Azure VMs, although an NSG denial usually surfaces as a timeout rather than an explicit refusal.
- Guest OS firewall rules (iptables, Windows Firewall) or stopped services can actively reject incoming requests.
- Azure resource throttling or exhaustion of SNAT ports can cause intermittent connection failures and timeouts.
- Quick Fix: Verify NSG rules on both the subnet and network interface (NIC) levels, and check the Azure VM boot diagnostics.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| Azure Network Watcher (IP Flow Verify) | Testing NSG rules without logging into the VM | 5 mins | Low |
| Azure Serial Console | When SSH/RDP are completely down but the VM is running | 10 mins | Low |
| Reset SSH / RDP Configuration | Suspected misconfiguration in sshd_config or Remote Desktop Services | 5 mins | Medium |
| Redeploy / Reapply VM | Underlying Azure host issues or stuck provisioning states | 15 mins | Medium |
Understanding the Error
The Connection Refused error (often seen as ERR_CONNECTION_REFUSED in browsers, or ssh: connect to host port 22: Connection refused in terminals) signifies that your client successfully reached a server, but the server explicitly rejected the connection attempt on the specified port. This is distinct from a timeout, where packets are simply dropped and ignored.
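To tell the two failure modes apart from a client machine, a small probe can help. The sketch below is an illustration, not part of any Azure tooling: the function names probe_port and classify_probe are hypothetical, and it assumes bash and the coreutils timeout command are available on the client.

```shell
# classify_probe STATUS: map the probe's exit status to a label.
#   0   = TCP handshake completed
#   124 = the `timeout` command gave up (packets were likely dropped)
#   any other value = connect failed quickly (most often an active refusal)
classify_probe() {
  case "$1" in
    0)   echo "open" ;;
    124) echo "timeout" ;;
    *)   echo "refused-or-unreachable" ;;
  esac
}

# probe_port HOST PORT [SECONDS]: attempt one TCP connection via bash's /dev/tcp.
probe_port() {
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
  classify_probe "$?"
}

# Example (203.0.113.50 is a documentation address, not a real host):
#   probe_port 203.0.113.50 22
```

A result of timeout points toward something dropping packets on the path, while refused-or-unreachable suggests the request reached a host that rejected it, or that the host could not be resolved or routed to.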
In the context of Azure Virtual Machines, this typically points to one of three layers:
- The Azure Network Fabric: Network Security Groups (NSGs) or Azure Firewall rules blocking the port. An NSG denial silently drops packets, so it usually appears as a timeout rather than a refusal.
- The Guest Operating System: The internal firewall (e.g., iptables, ufw, firewalld, or Windows Defender Firewall) is rejecting the traffic.
- The Application/Service: The service (like Nginx, Apache, or SSHd) is not running, or it's listening on the wrong interface (e.g., 127.0.0.1 instead of 0.0.0.0).
Additionally, if you are experiencing intermittent 'Connection Refused' or timeouts, you may be hitting Azure VM throttling limits (Network Bandwidth, Disk IOPS, or SNAT port exhaustion on Standard Load Balancers).
Step 1: Diagnose with Azure Network Watcher
Before digging into the OS, rule out Azure-level blocking. Use the IP Flow Verify tool in Azure Network Watcher.
- Navigate to Network Watcher in the Azure Portal.
- Select IP flow verify.
- Choose your VM, the network interface, and the protocol (TCP/UDP).
- Set the direction to Inbound, enter the VM's private IP and the target port (e.g., 22 or 443) as the local address, and enter your client's public IP as the remote address.
If the result is Access denied, the tool will tell you exactly which NSG rule is blocking the traffic. If it says Access allowed, the issue is most likely inside the guest OS.
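The same check can be scripted with the Azure CLI. The sketch below wraps az network watcher test-ip-flow in a helper. It assumes the Azure CLI is installed and logged in and that Network Watcher is enabled in the VM's region. The function name check_inbound_flow, the AZ override variable, and the remote port 60000 (an arbitrary client-side port) are illustrative choices, not Azure conventions.

```shell
# check_inbound_flow RG VM VM_PRIVATE_IP PORT CLIENT_IP
# Asks Network Watcher whether NSG rules allow inbound TCP to the VM.
# Set AZ=echo to print the arguments instead of calling Azure (dry run).
check_inbound_flow() {
  ${AZ:-az} network watcher test-ip-flow \
    --resource-group "$1" \
    --vm "$2" \
    --direction Inbound \
    --protocol TCP \
    --local "$3:$4" \
    --remote "$5:60000"
}

# Example with placeholder names and addresses:
#   check_inbound_flow MyResourceGroup MyVm 10.0.0.4 22 203.0.113.50
```

The response reports whether access is allowed or denied, along with the name of the rule that matched.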
Step 2: Check Guest OS Services and Firewalls
If Azure allows the traffic, but the connection is refused, the service might be down. If you cannot SSH/RDP into the machine, use the Azure Serial Console or the Run Command feature.
Using Run Command to check services (Linux): Execute a shell script via the portal to check if your service is listening:
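# List listening TCP sockets with the owning process (ss replaces netstat on most current distributions)
sudo ss -tlnp
# Check the SSH daemon; the unit is named ssh rather than sshd on Debian and Ubuntu
sudo systemctl status sshd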
If the service is bound to 127.0.0.1, external connections will be refused. It must be bound to 0.0.0.0 (all IPv4) or :: (all IPv6).
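Spotting loopback-only listeners by eye is easy to get wrong on a busy host. The helper below is a minimal sketch that filters ss -tln output for ports bound only to loopback. The function name loopback_only_ports is hypothetical, and the parsing assumes the usual iproute2 column layout, where the fourth column is the local address and port.

```shell
# loopback_only_ports: read `ss -tln` output on stdin and print TCP ports
# that are listening on loopback only (127.x.x.x or ::1).
loopback_only_ports() {
  awk '
    $1 != "LISTEN" { next }                 # skip the header line
    {
      n = split($4, parts, ":")             # $4 is "address:port"
      port = parts[n]
      addr = substr($4, 1, length($4) - length(port) - 1)
      if (addr ~ /^127\./ || addr == "[::1]" || addr == "::1") lo[port] = 1
      else pub[port] = 1
    }
    END { for (p in lo) if (!(p in pub)) print p }
  ' | sort -n
}

# Usage on the VM (for example through Serial Console or Run Command):
#   ss -tln | loopback_only_ports
```

Any port it prints is reachable only from the VM itself, so remote clients will be refused until the service is reconfigured to listen on an external interface.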
Step 3: Mitigating Azure VM Throttling
Sometimes, heavy load causes a service to crash or the OS to drop connections, manifesting as refused connections or severe lag.
- SNAT Port Exhaustion: If your VM makes many outbound connections, it might run out of Source Network Address Translation (SNAT) ports. This causes outbound connections to fail, which can cascade into inbound services failing if they depend on external APIs or databases. Fix this by using a NAT Gateway or an Azure Standard Load Balancer with configured outbound rules.
- Network Limits: Check the VM metrics in the Azure portal for Network In Total and Network Out Total. Compare these against the documented limits for your VM size. If you are consistently hitting the cap, you need to resize the VM to a larger tier that supports higher network bandwidth.
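Because each outbound flow to the same destination IP and port consumes its own SNAT port, a large number of connections to one destination is a useful hint when SNAT exhaustion is suspected. The sketch below counts connections per peer from ss -tn output. The function name top_destinations is hypothetical, and the count is only a rough indicator: it includes inbound connections and says nothing about ports Azure is still holding after a connection closes.

```shell
# top_destinations: read `ss -tn` output on stdin and print
# "<count> <peer address:port>" lines, busiest peer first.
top_destinations() {
  awk '
    $1 == "State" || NF < 5 { next }   # skip the header and malformed lines
    { count[$5]++ }                    # $5 is the peer "address:port"
    END { for (peer in count) printf "%d %s\n", count[peer], peer }
  ' | sort -rn
}

# Usage on the VM:
#   ss -tn | top_destinations | head
```

If one dependency dominates the list, connection pooling or keep-alive on that client is often worth checking before resizing anything.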
# Example Azure CLI commands to diagnose and fix NSG issues
# 1. List all NSG rules for a specific Network Interface to spot blocks
az network nic show --resource-group MyResourceGroup --name MyNic --query "networkSecurityGroup.id" -o tsv
az network nsg rule list --resource-group MyResourceGroup --nsg-name MyNsg --output table
# 2. Add an explicit allow rule for SSH (Port 22) from a specific IP
az network nsg rule create \
--resource-group MyResourceGroup \
--nsg-name MyNsg \
--name Allow-SSH-Custom \
--priority 100 \
--source-address-prefixes 203.0.113.50 \
--destination-port-ranges 22 \
--access Allow \
--protocol Tcp
# 3. Reset the SSH configuration inside a Linux VM if locked out (uses the VMAccess extension)
az vm user reset-ssh \
--resource-group MyResourceGroup \
--name MyVm
Error Medic Editorial
Error Medic Editorial is a collective of senior Cloud Architects and DevOps engineers dedicated to solving the most frustrating infrastructure bugs. With more than 50 years of combined experience across Azure, AWS, and GCP, we break down complex network and system failures into actionable fixes.