MariaDB Crash Recovery: Diagnosing OOM, Disk Full, and InnoDB Corruption
Comprehensive SRE guide to MariaDB crash recovery. Learn how to fix out-of-memory (OOM) kills, disk-full errors, deadlocks, and 'too many connections' to restore service quickly.
- Always check the Linux kernel logs (dmesg) for Out of Memory (OOM) killer events when MariaDB stops unexpectedly without error log entries.
- Disk full errors (Errcode: 28) can corrupt binlogs or prevent crash recovery; safely purge old binlogs to restore service.
- Resolve 'ERROR 1040: Too many connections' by aggressively identifying idle sessions, tuning max_connections, and implementing connection pooling.
- Use innodb_force_recovery sequentially (levels 1 through 6) ONLY as a last resort when the database refuses to start due to severe page corruption.
| Method | When to Use | Recovery Time | Data Loss Risk |
|---|---|---|---|
| Standard Crash Recovery | Unexpected restart or power loss (InnoDB auto-replays redo logs) | Minutes (depends on redo log size) | None |
| InnoDB Force Recovery (Levels 1-3) | Minor page corruption; database loops/crashes on startup | Hours (requires mysqldump and restore) | Low |
| InnoDB Force Recovery (Levels 4-6) | Severe corruption, redo log corruption, data dictionary mismatch | Hours to Days | High |
| Point-in-Time Restore (Binlogs) | Catastrophic storage failure or accidental DROP TABLE | Varies by database size | Minimal (depends on binlog sync interval) |
Understanding MariaDB Crash Recovery
When a MariaDB instance crashes, the mysqld process terminates abnormally. Under normal operating conditions, MariaDB relies on the robust ACID compliance of the InnoDB storage engine. Upon restarting, InnoDB automatically performs MariaDB crash recovery by reading the redo logs (ib_logfile0, ib_logfile1) and applying committed transactions that were not yet flushed to the data files, while rolling back uncommitted transactions.
However, automatic crash recovery can fail. As an SRE, diagnosing why the process died and why it cannot recover is your first priority. Common culprits span the operating system, storage layers, and application misconfigurations.
Scenario 1: MariaDB Out of Memory (OOM Killer)
One of the most insidious crashes occurs when MariaDB abruptly disappears, leaving truncated or non-existent entries in /var/log/mysql/error.log. This usually points to a MariaDB out of memory event triggered by the Linux kernel's OOM Killer.
The Symptom:
The database goes offline. Trying to check the status yields: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2).
The Diagnosis: Check the kernel message buffer:
dmesg -T | grep -i oom
You will likely see an output like: Out of memory: Killed process 12345 (mysqld) total-vm:8543200kB, anon-rss:6291456kB.
The Fix:
The OOM killer targets mysqld because it is consuming too much RAM. The primary culprit is an oversized InnoDB Buffer Pool.
- Open /etc/mysql/mariadb.conf.d/50-server.cnf.
- Adjust innodb_buffer_pool_size. A rule of thumb is 60-70% of total system RAM on a dedicated database server, but less if other services are co-located.
- Check for memory leaks or excessive per-connection buffers (e.g., join_buffer_size, sort_buffer_size).
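To turn the 60-70% rule of thumb into a concrete config line, here is a minimal sketch assuming a Linux host dedicated to the database (the 70% ratio is the guideline above, not a MariaDB default):

```shell
# Compute a 70% buffer-pool target from total RAM and print a config line.
total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
pool_mb=$(( total_kb * 70 / 100 / 1024 ))
echo "innodb_buffer_pool_size = ${pool_mb}M"
```

Paste the printed line into the [mysqld] section and restart; on a shared host, scale the ratio down to leave headroom for the other services.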
Scenario 2: MariaDB Disk Full Error
Running out of disk space brings MariaDB to an immediate halt.
The Symptom:
You find the following exact error in the logs:
[ERROR] mysqld: Disk is full writing './mysql-bin.0000X' (Errcode: 28 "No space left on device")
or
InnoDB: Operating system error number 28 in a file operation.
The Diagnosis:
Run df -h to verify available storage and df -i to check for inode exhaustion. A MariaDB disk full situation often happens because binary logs (binlogs) have grown out of control, or temporary tables filled the /tmp partition.
The Fix:
Do NOT delete active ibdata1 or InnoDB redo logs. Instead, safely clean up old binary logs. If the server cannot start, you can manually remove the oldest binlogs and update the mysql-bin.index file. Once started, properly configure binlog expiration:
SET GLOBAL expire_logs_days = 7;
PURGE BINARY LOGS BEFORE '2023-10-01 00:00:00';
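If the server is down and you must free space by hand, the binlog files and the mysql-bin.index file have to stay in sync. This sketch demonstrates the procedure against a scratch directory (on a real host the files live in the data directory, e.g. /var/lib/mysql):

```shell
# Simulated data directory standing in for /var/lib/mysql.
datadir=$(mktemp -d)
touch "$datadir"/mysql-bin.000001 "$datadir"/mysql-bin.000002 "$datadir"/mysql-bin.000003
printf '%s\n' ./mysql-bin.000001 ./mysql-bin.000002 ./mysql-bin.000003 > "$datadir/mysql-bin.index"

# Remove the oldest binlog AND drop its entry from the index file --
# if the index references a missing file, mysqld may refuse to start.
oldest=$(head -n 1 "$datadir/mysql-bin.index")
rm -f "$datadir/${oldest#./}"
tail -n +2 "$datadir/mysql-bin.index" > "$datadir/mysql-bin.index.tmp" \
  && mv "$datadir/mysql-bin.index.tmp" "$datadir/mysql-bin.index"
```

Never remove the binlog the replica is still reading; check SHOW SLAVE STATUS on all replicas first.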
Scenario 3: MariaDB Too Many Connections
While not strictly a hard crash, exhausting the connection pool acts like a Denial of Service (DoS), halting application functionality.
The Symptom:
Applications and CLI attempts return: ERROR 1040 (HY000): Too many connections.
The Diagnosis & MariaDB Too Many Connections Fix:
If you cannot log in normally, connect via the extra_port (if configured); MariaDB also admits one extra connection for a user with the SUPER privilege. Once in, note that max_connections is a dynamic variable, so you can raise it immediately with SET GLOBAL max_connections = 1000; without a restart.
To apply the MariaDB too many connections fix, increase the limit in your configuration file:
[mysqld]
max_connections = 1000
However, simply raising the limit often masks the root cause. Investigate why connections are piling up. It is frequently caused by a MariaDB slow query holding locks, or applications failing to close connections. Tune wait_timeout and interactive_timeout down (e.g., to 300-600 seconds) to reap sleeping connections automatically.
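To find the sessions worth reaping, filter the processlist for long-idle connections. The sketch below runs against a captured sample (the printf line stands in for `mysql -N -e "SHOW FULL PROCESSLIST"` on a live server; hosts and ids are made up):

```shell
# Sample processlist rows, tab-separated: Id User Host db Command Time State
processlist=$(printf '42\tapp\thost1\tshop\tSleep\t1800\t\n43\tapp\thost2\tshop\tQuery\t2\texecuting\n44\tapp\thost3\tshop\tSleep\t45\t\n')

# Flag sessions sleeping longer than 600 seconds as KILL candidates.
idle_ids=$(printf '%s\n' "$processlist" | awk -F'\t' '$5=="Sleep" && $6>600 {print $1}')
echo "$idle_ids"   # prints 42
```

Each flagged id can then be terminated with `mysql -e "KILL <id>"` while you fix the application's connection handling.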
Scenario 4: MariaDB Deadlock and Timeout Issues
Heavy concurrent write traffic can cause transactions to block each other.
The Symptom:
Applications report: ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction or ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction.
The Diagnosis:
A MariaDB deadlock occurs when two transactions wait on each other to release row locks. A MariaDB timeout (Error 1205) happens when a query waits for a lock longer than the innodb_lock_wait_timeout (default 50s).
Enable deadlock logging to catch the culprits:
SET GLOBAL innodb_print_all_deadlocks = 1;
The Fix: Deadlocks must be resolved at the application layer by ensuring transactions access tables and rows in a consistent order, keeping transactions small, and optimizing queries with proper indexes to reduce the footprint of locked rows.
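As a sketch of what "consistent order" means in practice, assume a hypothetical accounts table: if every transaction that touches rows 1 and 2 locks them in ascending primary-key order, neither session can hold a lock the other is waiting on, so the cycle never forms.

```sql
-- Both transfer directions lock id = 1 before id = 2.
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
SELECT balance FROM accounts WHERE id = 2 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
```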
Scenario 5: Managing MariaDB Replication Lag
If the master crashes and recovers, or if it handles heavy bulk loads, read replicas can fall significantly behind.
The Symptom:
SHOW SLAVE STATUS\G (or SHOW REPLICA STATUS\G in newer versions) shows Seconds_Behind_Master consistently rising.
The Diagnosis & Fix:
MariaDB replication lag is often caused by the single-threaded nature of the default SQL thread applying events on the replica. If the replica's disk I/O is saturated, upgrade the storage. Otherwise, enable parallel replication by setting slave_parallel_threads (MariaDB's name for the worker count) and slave_parallel_mode = optimistic.
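A minimal replica config sketch for enabling parallel apply (the thread count is an illustrative starting point, not a recommendation; tune it to core count and workload):

```ini
[mysqld]
# Apply binlog events with several worker threads instead of one
slave_parallel_threads = 8
# Optimistic mode parallelizes aggressively and retries on conflict
slave_parallel_mode = optimistic
```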
Scenario 6: MariaDB Permission Denied Errors
Often seen after migrating a database directory or restoring from backups, incorrect filesystem permissions will completely block crash recovery.
The Symptom:
InnoDB: Operating system error number 13 in a file operation. (Permission denied)
The Fix:
A MariaDB permission denied error means the mysql user cannot read/write the data directory.
- Fix standard Linux permissions:
  chown -R mysql:mysql /var/lib/mysql
  chmod -R 700 /var/lib/mysql
- If SELinux is enforcing (RHEL/CentOS), restore the security context:
  restorecon -Rv /var/lib/mysql
- If AppArmor is enforcing (Ubuntu/Debian), ensure the new data directory is whitelisted in /etc/apparmor.d/tunables/alias.
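After fixing permissions, a quick sanity check confirms the mode and owner before restarting. The sketch below uses a scratch directory and the current user; on a real host substitute /var/lib/mysql and expect owner mysql:

```shell
# Stand-in for the real data directory and the mysql user.
datadir=$(mktemp -d)
chmod 700 "$datadir"

perms=$(stat -c '%a' "$datadir")   # GNU stat (Linux)
owner=$(stat -c '%U' "$datadir")
echo "owner=$owner perms=$perms"
```

If perms is not 700 (or owner is not mysql on a real host), mysqld will hit error 13 again on startup.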
Advanced: Forcing InnoDB Recovery
If the crash resulted in corrupted InnoDB pages and standard recovery loops infinitely (crashing immediately upon startup), you must resort to innodb_force_recovery.
Add this to your configuration under [mysqld]:
innodb_force_recovery = 1
Start the service. If it fails, increment the number up to 6.
Warning: Levels 4-6 can cause permanent data loss, and at level 4 or above InnoDB runs read-only. Once the database starts in recovery mode, immediately mysqldump all databases, wipe the /var/lib/mysql directory, re-initialize the instance, remove the innodb_force_recovery setting, start the empty database, and restore your dump.
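The "increment until it starts" loop can be scripted. This sketch edits a temp copy of the config; on a real host the file would be /etc/mysql/mariadb.conf.d/50-server.cnf (assumption), with a `systemctl restart mariadb` and an error-log check between bumps:

```shell
# Temp config standing in for 50-server.cnf.
cnf=$(mktemp)
printf '[mysqld]\ninnodb_force_recovery = 1\n' > "$cnf"

# Raise innodb_force_recovery by one level, refusing to go past 6.
bump_recovery_level() {
  local current next
  current=$(awk -F' = ' '/innodb_force_recovery/ {print $2}' "$cnf")
  next=$((current + 1))
  [ "$next" -le 6 ] || { echo "level 6 reached: restore from backup" >&2; return 1; }
  sed -i "s/innodb_force_recovery = .*/innodb_force_recovery = $next/" "$cnf"
  echo "$next"
}

bump_recovery_level   # prints 2
```

Stop at the first level that lets mysqld stay up long enough to dump; never run production traffic at any non-zero level.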
Appendix: SRE Diagnostic Triage Script
#!/bin/bash
# MariaDB SRE Diagnostic Triage Script
echo "=== 1. Checking for OOM Killer Events ==="
dmesg -T | grep -i -E 'killed process.*mysqld|oom-killer'
echo -e "\n=== 2. Checking Disk Space on Data Directory ==="
DATA_DIR=$(mysql -N -s -e "select @@datadir;" 2>/dev/null || echo "/var/lib/mysql")
df -h "$DATA_DIR"
echo -e "\n=== 3. Extracting Recent Fatal Errors from Logs ==="
ERROR_LOG=$(mysql -N -s -e "select @@log_error;" 2>/dev/null || echo "/var/log/mysql/error.log")
if [ -f "$ERROR_LOG" ]; then
tail -n 200 "$ERROR_LOG" | grep -i -E "crash|fatal|error|corrupt|denied|timeout"
else
echo "Error log not found at $ERROR_LOG"
fi
echo -e "\n=== 4. Checking Connection Saturation ==="
mysql -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'; SHOW GLOBAL VARIABLES LIKE 'max_connections';"

Error Medic Editorial
Written by our senior Site Reliability Engineering team. We specialize in high-availability database architecture, incident response, and performance tuning for mission-critical MySQL and MariaDB deployments.