Error Medic

Troubleshooting 'PostgreSQL Connection Refused': Resolving Crashes, OOM, and Service Failures on Linux

Fix 'psql: error: connection to server failed: Connection refused'. Learn to diagnose PostgreSQL crashes, OOM errors, max connections, and disk full issues.

Key Takeaways
  • Verify the PostgreSQL service is actively running and listening on the correct network interfaces (listen_addresses).
  • Check system logs for the Linux OOM (Out of Memory) killer terminating the postgres process due to excessive memory utilization.
  • Investigate the $PGDATA partition for 'No space left on device' errors which cause immediate database panics and crashes.
  • Resolve 'FATAL: remaining connection slots are reserved' by tuning max_connections or implementing a connection pooler like PgBouncer.
  • Audit pg_hba.conf for correct client authentication rules to prevent 'permission denied' rejections.
Diagnostic and Fix Approaches Compared
Method | When to Use | Time | Risk
------ | ----------- | ---- | ----
Service Restart (systemctl) | PostgreSQL failed to start or stopped unexpectedly | < 5 mins | Low
Tuning max_connections | Seeing 'too many connections' FATAL errors | 10 mins | Medium (increases RAM usage)
Configuring pg_hba.conf | Remote clients get 'permission denied' or IP rejections | 5 mins | Low (reload only, no downtime)
Clearing Disk Space / Archiving WAL | Database panicked due to 'disk full' in pg_wal | 30+ mins | High (never manually delete WAL files)
Adjusting shared_buffers / work_mem | Kernel OOM killer is terminating PostgreSQL instances | 15 mins | Medium (requires restart and performance testing)

Understanding the 'Connection Refused' Error in PostgreSQL

The dreaded psql: error: connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused is a symptom, not a root cause. It means the operating system rejected the TCP/IP or Unix domain socket connection attempt. This happens because the PostgreSQL daemon (postgres) is not running, because it is not listening on the expected interface or port, or because a network firewall is actively blocking the request.

When PostgreSQL is not working, the failure often cascades from underlying systemic issues: a PostgreSQL crash due to an Out of Memory (OOM) event, a disk full scenario corrupting the startup sequence, or resource exhaustion like 'too many connections'. In this comprehensive guide, we will dissect each scenario, from basic service interruptions to complex core dumps and high CPU bottlenecks.

Step 1: Validating the Service and Socket State

Before diving into complex logs, you must determine if the database process is alive.

1. Check the Service Status: Use systemd to verify the state of the PostgreSQL service on your Linux host.

systemctl status postgresql

If the service is marked as failed or inactive, PostgreSQL is not starting. You must check journalctl -xeu postgresql to find the exact startup failure reason.

2. Verify Listening Ports: If the service is active, ensure it is actually binding to the correct port.

ss -nltp | grep 5432

If you do not see postgres listening on 0.0.0.0:5432 (all IPv4) or 127.0.0.1:5432 (localhost), the server might only be listening on a Unix socket, or the listen_addresses parameter in postgresql.conf is misconfigured.
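The port check above can be wrapped in a tiny helper. This is a minimal sketch: `listening_on_5432` is an illustrative name, not a standard tool, and GNU grep is assumed.

```shell
#!/bin/bash
# Sketch: report whether anything is listening on TCP port 5432.
# listening_on_5432 is an illustrative helper, not PostgreSQL tooling.

listening_on_5432() {
    # Reads `ss -nlt`-style lines on stdin and looks for a :5432 listener
    if grep -Eq '[:.]5432\b'; then
        echo "listener found on 5432"
    else
        echo "no listener on 5432"
    fi
}

ss -nlt 2>/dev/null | listening_on_5432
```

If it reports no listener while systemctl says the service is active, the server is most likely bound only to a Unix socket, which points straight at listen_addresses.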

Step 2: Diagnosing PostgreSQL Crashes and Core Dumps

If PostgreSQL failed to start or stopped unexpectedly during operation, the system logs are your absolute source of truth.

The OOM Killer (PostgreSQL Out of Memory)

PostgreSQL is highly memory-dependent. If shared_buffers or work_mem is configured too aggressively relative to the server's physical RAM, the Linux kernel's Out of Memory (OOM) killer will forcibly terminate the postgres process to keep the operating system itself alive. The result is a sudden crash and "connection refused" for every subsequent client request. Check the kernel ring buffer for OOM events:

dmesg -T | grep -i -E 'killed process|oom'

Fix: Reduce shared_buffers (typically 25% of total RAM is the recommended starting point) and lower work_mem. Ensure your Linux system has adequate swap space configured to prevent sudden OOM termination. You can also adjust the oom_score_adj for the PostgreSQL process, though fixing the memory configuration is the permanent solution.
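The 25% rule of thumb above can be sketched as a small helper. `recommend_shared_buffers` is a hypothetical name, not a PostgreSQL tool, and /proc/meminfo is assumed (Linux only).

```shell
#!/bin/bash
# Sketch: derive a starting-point shared_buffers value (~25% of RAM).
# recommend_shared_buffers is a hypothetical helper, not a PostgreSQL tool.

recommend_shared_buffers() {
    local total_mb="$1"           # total physical RAM in MB
    echo "$(( total_mb / 4 ))MB"  # 25% rule of thumb
}

# MemTotal in /proc/meminfo is reported in kB; convert to MB first
mem_mb=$(( $(awk '/^MemTotal:/ {print $2}' /proc/meminfo) / 1024 ))
echo "shared_buffers = $(recommend_shared_buffers "$mem_mb")"
```

Pair the new value with a lower work_mem and re-test: work_mem is allocated per sort or hash operation, per connection, so it multiplies far faster than shared_buffers does.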

Segmentation Faults and Core Dumps

A PostgreSQL segfault usually indicates a bug in an extension (such as PostGIS or TimescaleDB), hardware memory corruption, or severe data corruption. Check the PostgreSQL logs (usually located in /var/log/postgresql/ or /var/lib/pgsql/data/log/, depending on your distribution):

tail -n 100 /var/log/postgresql/postgresql-14-main.log

Look for errors like: server process (PID ...) was terminated by signal 11: Segmentation fault. Fix: Disable suspected third-party extensions, run a memory test (memtester) on the host hardware, and ensure you are running the latest minor release of your PostgreSQL version to patch known C-level bugs. If a core dump is generated, you may need to use gdb to inspect the backtrace.

Step 3: Resolving 'Disk Full' and Storage Failures

PostgreSQL will immediately stop accepting writes and will eventually PANIC and crash if the partition containing its data directory ($PGDATA) or Write-Ahead Logs (pg_wal, called pg_xlog before version 10) runs out of space. The logs will show: PANIC: could not write to file "pg_wal/xlog...": No space left on device.

1. Identify the Full Mount:

df -h /var/lib/postgresql

2. Remediation: CRITICAL WARNING: Do NOT manually delete files in pg_wal/ or pg_xlog/. Doing so will corrupt your database permanently and result in unrecoverable data loss. Instead, clear temporary OS files, compress old application logs residing on the same partition, or add a new virtual disk and symlink non-critical directories. If your archive_command is failing and causing WAL logs to pile up, fix the destination storage so PostgreSQL can successfully archive and rotate the WAL files automatically.
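To catch this before the PANIC, a monitoring sketch like the one below can alert when the data-directory mount crosses a usage threshold. `check_disk` and the 90% default are illustrative assumptions, and GNU df is assumed.

```shell
#!/bin/bash
# Sketch: warn when a mount point crosses a usage threshold.
# check_disk and the 90% default are illustrative, not PostgreSQL tooling.

check_disk() {
    local mount="$1" threshold="${2:-90}"
    local pct
    # df --output=pcent prints e.g. " 42%"; keep only the digits
    pct=$(df --output=pcent "$mount" | tail -1 | tr -dc '0-9')
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $mount is at ${pct}% (threshold ${threshold}%)"
        return 1
    fi
    echo "OK: $mount is at ${pct}%"
}

check_disk /  # point this at your $PGDATA mount, e.g. /var/lib/postgresql
```

Run from cron or a monitoring agent, the non-zero exit status on breach makes it easy to wire into alerting.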

Step 4: 'Too Many Connections' and Connection Exhaustion

If you receive the error message FATAL: remaining connection slots are reserved for non-replication superuser connections, PostgreSQL is actively running, but it has exhausted its max_connections limit.

This usually occurs due to connection leaks in your application logic or the complete lack of connection pooling.

Quick Fix: Increase max_connections in postgresql.conf:

max_connections = 300

(Note: Changing this parameter requires a full service restart: systemctl restart postgresql).

Long-term Fix: Deploy a connection pooler like PgBouncer or Pgpool-II. PostgreSQL forks a new OS process for every incoming connection, so high connection counts lead to severe memory bloat, heavy CPU context switching, and eventually an OOM crash or extreme database slowness. A pooler multiplexes thousands of client connections onto a small number of actual PostgreSQL backend processes.
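As an illustration of the multiplexing idea, a minimal PgBouncer configuration might look like the fragment below; the database name, pool sizes, and file paths are placeholders, not recommendations:

```ini
; Sketch of a minimal pgbouncer.ini -- names, sizes, and paths are placeholders
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; release the backend after each transaction
default_pool_size = 20         ; real PostgreSQL connections per db/user pair
max_client_conn = 1000         ; clients PgBouncer itself will accept
```

With transaction pooling, 1,000 application clients share only 20 backend processes, though session-level features (prepared statements, advisory locks, SET) need care in this mode.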

Step 5: Fixing Permission Denied and Authentication Errors

Sometimes the network connection succeeds, but PostgreSQL actively rejects the login with FATAL: permission denied for database "mydb" or FATAL: Ident authentication failed for user "myuser".

1. Network Bindings (postgresql.conf)

By default, PostgreSQL only listens on the local loopback interface. To allow external network connections, edit postgresql.conf:

listen_addresses = '*'

2. Client Authentication (pg_hba.conf)

You must explicitly allow the incoming IP subnet and specify the authentication method in pg_hba.conf. Rules are matched top to bottom and the first match wins, so add a rule for your application servers above any broader or conflicting entries:

host    all             all             10.0.0.0/16             scram-sha-256

After editing pg_hba.conf, a reload is sufficient (no restart required). Note that changes to listen_addresses in postgresql.conf only take effect after a full restart (systemctl restart postgresql):

systemctl reload postgresql
# Or via SQL:
# SELECT pg_reload_conf();

Step 6: Addressing High CPU and Slow Performance

When PostgreSQL is excessively slow, pegging the CPU, or blocking heavily on locks, application clients may time out, which surfaces as a "connection refused" or "connection dropped" at the application layer.

Identify rogue, long-running queries using the pg_stat_activity system view:

SELECT pid, usename, state, query, extract(epoch FROM now() - query_start) AS duration_seconds
FROM pg_stat_activity
WHERE state != 'idle' 
ORDER BY duration_seconds DESC;

Kill a stuck or blocked query holding locks by passing its pid (taken from the pg_stat_activity output above) to pg_terminate_backend:

SELECT pg_terminate_backend(12345);  -- replace 12345 with the offending pid

Long-running transactions hold locks and prevent the autovacuum daemon from clearing dead tuples. This leads to massive table bloat and catastrophic performance degradation. Ensure autovacuum is enabled, running frequently, and aggressively tuned for your high-transaction tables.
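For reference, autovacuum aggressiveness is controlled in postgresql.conf; the values below are illustrative starting points for a write-heavy workload, not universal recommendations:

```ini
# Illustrative postgresql.conf settings -- tune against your own workload
autovacuum = on
autovacuum_naptime = 15s                  # check tables more often (default 1min)
autovacuum_vacuum_scale_factor = 0.05     # vacuum at 5% dead tuples (default 0.2)
autovacuum_vacuum_cost_limit = 1000       # let each vacuum cycle do more work
```

The scale factor and cost limit can also be set per table with ALTER TABLE ... SET, which is usually safer than lowering them globally.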

All-in-One Diagnostic Script

The checks from the steps above can be combined into a single triage script:

#!/bin/bash
# Diagnostic script for PostgreSQL connection and crash issues

echo "=== Checking PostgreSQL Service Status ==="
systemctl status postgresql --no-pager | grep -E 'Active:|Main PID:'

echo -e "\n=== Checking Listening Ports ==="
ss -nltp | grep 5432

echo -e "\n=== Checking for OOM Killer Events in the Kernel Log ==="
dmesg -T | grep -i -E 'killed process|oom.*postgres'

echo -e "\n=== Checking Disk Space on the PostgreSQL Data Mount ==="
df -h "$(sudo -u postgres psql -t -P format=unaligned -c 'show data_directory;')"

echo -e "\n=== Checking Max Connections Configuration ==="
sudo -u postgres psql -c "SHOW max_connections;"
sudo -u postgres psql -c "SELECT count(*) AS current_connections FROM pg_stat_activity;"

Error Medic Editorial

Error Medic Editorial is a collective of senior DevOps and Site Reliability Engineers dedicated to providing battle-tested troubleshooting guides and infrastructure solutions.
