systemctl failed: How to Diagnose and Fix Service Failures on Linux
Fix systemctl failed errors fast. Learn to read journalctl logs, resolve permission denied issues, and repair broken service units with step-by-step commands.
- The most common root causes are misconfigured unit files, missing dependencies, incorrect file permissions on the service binary or working directory, and resource limits being exceeded.
- 'systemctl failed' almost always has a detailed root cause in the journal — run 'journalctl -xe -u <service>' immediately after any failure to see the actual error message.
- Permission denied errors typically mean the service binary is not executable, the user specified in the unit file lacks access to a required path, or the invocation requires elevated privileges without a proper sudoers/polkit rule.
- When systemctl is not working at all (command not found, D-Bus errors), the issue is usually a broken systemd installation, a non-systemd init environment, or a degraded D-Bus session.
| Method | When to Use | Time | Risk |
|---|---|---|---|
| journalctl -xe -u <service> | First step for any failure — reads the structured log | < 1 min | None — read-only |
| systemctl reset-failed <service> | Clears a stuck 'failed' state so the unit can be restarted | < 1 min | Low — only resets state, does not change config |
| Edit unit file (ExecStart, User, paths) | Service binary path is wrong, wrong user, missing env vars | 5-15 min | Low if tested in staging first |
| chmod/chown on binary or data dir | 'Permission denied' in ExecStart or file access errors | 2-5 min | Medium — wrong permissions can break other services |
| systemctl daemon-reload | Unit file was edited on disk but systemd hasn't re-read it | < 1 min | None |
| Override with systemctl edit | Inject drop-in overrides without touching the vendor unit | 5-10 min | Low — overrides are isolated |
| Rebuild/reinstall the package | Corrupt binary, missing shared libraries (ldd failures) | 5-30 min | Medium — may change config files |
| Check and raise resource limits (LimitNOFILE, MemoryMax) | Service exits with 'Too many open files' or OOM kills | 5-10 min | Low |
Understanding 'systemctl failed'
When systemd reports a service in a failed state you will see output like this from systemctl status:
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2026-02-23 10:14:32 UTC; 3s ago
Process: 18432 ExecStartPre=/usr/sbin/nginx -t (code=exited, status=1/FAILURE)
Main PID: 18432 (code=exited, status=1/FAILURE)
CPU: 42ms
The Result field tells you the failure category:
- exit-code: the process started but returned a non-zero exit status
- signal: the process was killed by a signal (e.g., SIGSEGV, SIGKILL from the OOM killer)
- core-dump: the process crashed and produced a core dump
- timeout: the service did not become active within TimeoutStartSec
- resources: systemd could not allocate the resources specified in the unit
- start-limit-hit: the service restarted too many times and hit the StartLimitBurst threshold
Understanding the Result field narrows the search space dramatically before you even read the logs.
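The Result value can also be read on its own with systemctl show -p Result --value <service>. As a rough sketch, the mapping from Result to a first diagnostic step could be scripted as below; the function name explain_result and the hint wording are illustrative assumptions, not part of systemd.

```bash
#!/usr/bin/env bash
# explain_result: map a systemd Result value to a suggested first check.
# The function name and the hint text are illustrative assumptions.
explain_result() {
  case "$1" in
    exit-code)       echo "check the journal for the process's own error message" ;;
    signal)          echo "look for OOM-killer or segfault entries in the kernel log" ;;
    core-dump)       echo "inspect the core dump, for example with coredumpctl" ;;
    timeout)         echo "review TimeoutStartSec and how the service signals readiness" ;;
    resources)       echo "check the limits and resource settings in the unit" ;;
    start-limit-hit) echo "fix the underlying crash, then run systemctl reset-failed" ;;
    success)         echo "the unit did not fail" ;;
    *)               echo "unknown result: $1" ;;
  esac
}

# Usage on a systemd host (nginx.service is only an example unit):
#   explain_result "$(systemctl show -p Result --value nginx.service)"
```

The hints only point at where to look next; the journal remains the authoritative source for the actual error.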
Step 1: Read the Journal Immediately
The single most important command after any systemctl failure is:
journalctl -xe -u <service-name> --no-pager
Flags:
- -x: adds explanatory catalog messages that describe what the log entry means
- -e: jumps to the end of the journal (most recent entries)
- -u <service-name>: filters to only messages from that unit
For persistent failures or to see entries across reboots:
journalctl -u <service-name> --since "10 minutes ago"
journalctl -u <service-name> -b -1 # previous boot
If the service logs to its own file rather than the journal, check:
systemctl cat <service-name> # shows unit file with StandardError/StandardOutput paths
tail -100 /var/log/<service>/<service>.log
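When a unit redirects its output to a file, the path appears in the StandardOutput= or StandardError= setting with a file:, append: or truncate: prefix. A small sketch that pulls those paths out of unit-file text is shown here; the helper name log_file_targets is hypothetical.

```bash
#!/usr/bin/env bash
# log_file_targets: read unit-file text on stdin and print any file paths used
# by StandardOutput=/StandardError= (file:, append: or truncate: forms).
# The helper name is a hypothetical choice for this example.
log_file_targets() {
  sed -n -E 's/^Standard(Output|Error)=(file|append|truncate):(.+)$/\3/p' | sort -u
}

# Usage on a systemd host (myapp.service is only an example unit):
#   systemctl cat myapp.service | log_file_targets
```

If the helper prints nothing, the unit most likely logs to the journal or the service manages its own log files.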
Step 2: Identify the Root Cause
Scenario A: ExecStart binary not found or not executable
Journal will show:
Feb 23 10:14:32 host systemd[1]: nginx.service: Control process exited with error code.
Feb 23 10:14:32 host systemd[1]: Failed to start A high performance web server.
And closer inspection reveals:
nginx.service: Failed at step EXEC spawning /usr/sbin/nginx: No such file or directory
Fix:
which nginx # find actual binary path
ls -la /usr/sbin/nginx # verify it exists and is executable
chmod +x /usr/sbin/nginx # if missing execute bit
systemctl cat nginx.service # compare ExecStart path to actual binary
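The checks above can be folded into one small function. This is a sketch only: check_binary is a made-up name, and the -x test reflects the permissions of the user running the script, which may differ from the User= the service runs as.

```bash
#!/usr/bin/env bash
# check_binary: report whether a path exists and is executable.
# Note: [ -x ] is evaluated for the current user, not the unit's User=.
check_binary() {
  path=$1
  if [ ! -e "$path" ]; then
    echo "missing: $path"
  elif [ -d "$path" ]; then
    echo "is a directory: $path"
  elif [ ! -x "$path" ]; then
    echo "not executable: $path"
  else
    echo "ok: $path"
  fi
}

# Usage: check_binary /usr/sbin/nginx
```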
Scenario B: systemctl permission denied
This error appears in two distinct contexts:
- The user running systemctl lacks privileges:
Failed to connect to bus: Permission denied
Failed to disable unit: Access denied
Fix: Run with sudo systemctl ... or grant polkit rules for the specific action.
- The service process itself gets permission denied on a file or socket:
nginx: [emerg] open() "/var/run/nginx.pid" failed (13: Permission denied)
Fix:
ls -la /var/run/nginx.pid # check owner
chown root:root /var/run/nginx.pid # or the user specified in nginx.conf
systemctl show nginx.service -p User # check which user systemd starts nginx as
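A file-level permission error is often caused by a parent directory the service user cannot traverse, not by the file itself. The sketch below prints mode and ownership for every component of an absolute path; walk_path is a hypothetical helper and it assumes GNU stat (the -c option). On most distributions namei -l <path> from util-linux gives similar output.

```bash
#!/usr/bin/env bash
# walk_path: print mode and owner for a path and each of its parent directories.
# Assumes GNU coreutils stat (-c); the helper name is illustrative.
walk_path() {
  p=$1
  while :; do
    stat -c '%A %U:%G %n' "$p" 2>/dev/null || echo "(not found) $p"
    case "$p" in
      /|.) break ;;          # stop at the root (or "." for relative input)
    esac
    p=$(dirname "$p")
  done
}

# Usage: walk_path /var/run/nginx.pid
```

Every directory in the listing needs the execute (search) bit for the service user, otherwise access fails even if the file itself is readable.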
Scenario C: Missing dependency or ordered unit
Journal shows:
ConditionPathExists=/etc/myapp/config.yaml was not met
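Or a dependency such as network-online.target or a database socket was not ready when the service started. Note that an unmet Condition*= check makes systemd skip the unit rather than mark it failed; a failed Assert*= check is what produces an actual failure.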
Fix:
systemctl list-dependencies <service-name> # show full dependency tree
systemctl is-active network-online.target # check if network dependency is met
# Add to unit [Unit] section:
# Wants=network-online.target   # After= alone only orders; Wants= pulls the target in
# After=network-online.target postgresql.service
# Requires=postgresql.service
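To check each declared dependency individually, the unit names can be extracted from the unit text first. The following is a minimal sketch; declared_deps is a hypothetical helper and it only looks at the After=, Requires=, Wants= and BindsTo= settings.

```bash
#!/usr/bin/env bash
# declared_deps: read unit-file text on stdin and print the unique unit names
# listed in After=, Requires=, Wants= and BindsTo=. Helper name is illustrative.
declared_deps() {
  sed -n -E 's/^(After|Requires|Wants|BindsTo)=(.*)$/\2/p' \
    | tr ' ' '\n' \
    | sed '/^$/d' \
    | sort -u
}

# Usage on a systemd host (myapp.service is only an example unit):
#   systemctl cat myapp.service | declared_deps | while read -r unit; do
#     printf '%s: %s\n' "$unit" "$(systemctl is-active "$unit")"
#   done
```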
Scenario D: Start limit hit
Journal shows:
systemd[1]: myapp.service: Start request repeated too quickly.
systemd[1]: myapp.service: Failed with result 'start-limit-hit'.
This happens when a service crashes immediately on startup and systemd has tried restarting it beyond StartLimitBurst times within StartLimitIntervalSec.
Fix the underlying crash first, then reset the failed state:
systemctl reset-failed myapp.service
systemctl start myapp.service
To disable the start rate limit temporarily while debugging (not for production):
systemctl edit myapp.service
# Add:
# [Unit]
# StartLimitIntervalSec=0
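To make the interaction between StartLimitBurst and StartLimitIntervalSec concrete, here is a simplified model of the rate limit, not systemd's actual implementation: a window opens at the first start, and the limit trips once more than BURST starts fall inside INTERVAL seconds. The function name and the integer timestamps are assumptions for illustration; systemd's defaults are typically 5 starts per 10 seconds.

```bash
#!/usr/bin/env bash
# start_limit_hit BURST INTERVAL T1 T2 ...
# Simplified model of the start rate limit (an illustration, not systemd code):
# timestamps are seconds; the window restarts once INTERVAL has elapsed.
start_limit_hit() {
  burst=$1
  interval=$2
  shift 2
  window_start=""
  count=0
  for t in "$@"; do
    if [ -z "$window_start" ] || [ $((t - window_start)) -ge "$interval" ]; then
      window_start=$t      # open a new window
      count=0
    fi
    count=$((count + 1))
    if [ "$count" -gt "$burst" ]; then
      echo "hit at t=$t"
      return 0
    fi
  done
  echo "ok"
}

# Six crash-restart cycles one second apart trip a 5-per-10s limit:
start_limit_hit 5 10 0 1 2 3 4 5
# The same number of starts spread over 15 seconds does not:
start_limit_hit 5 10 0 3 6 9 12 15
```

This is why adding RestartSec= to a crashing service can keep it below the limit without fixing anything: the restarts are merely spaced out.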
Scenario E: Environment variables or working directory missing
myapp[18501]: Error: DATABASE_URL environment variable is required
Fix — add to unit file with systemctl edit:
systemctl edit myapp.service
# [Service]
# Environment=DATABASE_URL=postgres://user:pass@localhost/db
# EnvironmentFile=/etc/myapp/env
# WorkingDirectory=/opt/myapp
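Before restarting, it can help to confirm that the file referenced by EnvironmentFile= actually defines the variables the service needs. A minimal sketch follows; missing_env_vars is a hypothetical helper, SECRET_KEY is an invented variable name, and the check only looks for plain NAME=value lines.

```bash
#!/usr/bin/env bash
# missing_env_vars FILE NAME...
# Print each NAME that has no "NAME=" line in FILE (EnvironmentFile= style).
# Helper name is illustrative; an unreadable file reports every name as missing.
missing_env_vars() {
  file=$1
  shift
  for name in "$@"; do
    grep -q -E "^${name}=" "$file" 2>/dev/null || echo "$name"
  done
}

# Usage (SECRET_KEY is a made-up second variable):
#   missing_env_vars /etc/myapp/env DATABASE_URL SECRET_KEY
```

Because such a file usually holds credentials, restricting it to the service user and root (for example mode 600) is a sensible default.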
Step 3: Apply the Fix and Reload
After editing any unit file:
systemctl daemon-reload # re-read all unit files
systemctl start <service-name> # attempt start
systemctl status <service-name> # verify active (running)
Verify the service will survive a reboot:
systemctl is-enabled <service-name> # should return 'enabled'
systemctl enable <service-name> # if it was disabled
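The reload, start and verify sequence can be wrapped so that it stops at the first failing step. This is a sketch under assumptions: reload_and_verify and the SYSTEMCTL override variable are invented for the example (the variable exists only so the function can be exercised with a stub), and restart is used instead of start so that an already running unit also picks up the edited file.

```bash
#!/usr/bin/env bash
# SYSTEMCTL can be overridden for testing; it defaults to the real systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# reload_and_verify SERVICE: reload unit files, restart the unit, confirm it is active.
reload_and_verify() {
  svc=$1
  "$SYSTEMCTL" daemon-reload || { echo "daemon-reload failed"; return 1; }
  "$SYSTEMCTL" restart "$svc" || { echo "$svc: restart failed"; return 1; }
  if "$SYSTEMCTL" is-active --quiet "$svc"; then
    echo "$svc: active"
  else
    echo "$svc: not active"
    return 1
  fi
}

# Usage (typically as root): reload_and_verify myapp.service
```

A service that crashes a few seconds after starting can still pass the is-active check, so a follow-up look at systemctl status remains worthwhile.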
Step 4: When systemctl Is Not Working At All
If systemctl itself errors:
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
This means either:
- You are in a Docker container or chroot without systemd (use service or direct binary management)
- D-Bus is not running: systemctl start dbus or apt install dbus
- systemd is corrupted: reinstall with apt install --reinstall systemd
Check what PID 1 actually is:
ps -p 1 -o comm=
ls -la /sbin/init
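A script can detect a non-systemd environment up front by testing for the /run/systemd/system directory, which is the same check the sd_booted() library call performs. A minimal sketch is below; the function name and the optional root-prefix argument are assumptions added so the check can be exercised against a scratch directory.

```bash
#!/usr/bin/env bash
# booted_with_systemd [ROOT]: succeed if ROOT/run/systemd/system exists.
# ROOT defaults to "" (the real filesystem); the argument is a test hook.
booted_with_systemd() {
  root=${1:-}
  [ -d "$root/run/systemd/system" ]
}

if booted_with_systemd; then
  echo "systemd is the running init system"
else
  echo "not booted with systemd (container, chroot, or another init)"
fi
```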
Preventive Practices
- Always validate unit files before deploying: systemd-analyze verify /path/to/my.service
- Test configuration with the service's own check mode when available: nginx -t, apache2ctl configtest
- Use systemctl edit for overrides rather than editing vendor unit files in /lib/systemd/system/, so that your changes survive package upgrades
- Set Restart=on-failure and RestartSec=5 for long-running services so transient failures self-heal
- Capture full logs with StandardOutput=journal and StandardError=journal in the [Service] section
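For automated deployments, the restart settings can be written as a drop-in file instead of going through the interactive systemctl edit. The sketch below assumes the standard drop-in location /etc/systemd/system/<unit>.d/; the function name, the file name restart.conf, and the optional base-directory argument are illustrative choices.

```bash
#!/usr/bin/env bash
# write_restart_dropin UNIT [BASE_DIR]
# Write a drop-in that enables automatic restarts and print its path.
# BASE_DIR defaults to /etc/systemd/system; restart.conf is an arbitrary name.
write_restart_dropin() {
  unit=$1
  base=${2:-/etc/systemd/system}
  dir="$base/$unit.d"
  mkdir -p "$dir" || return 1
  cat > "$dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=5
EOF
  echo "$dir/restart.conf"
}

# Usage (as root), followed by a reload so systemd picks up the drop-in:
#   write_restart_dropin myapp.service && systemctl daemon-reload
```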
#!/usr/bin/env bash
# systemctl-diagnose.sh — comprehensive service failure diagnostic
# Usage: ./systemctl-diagnose.sh <service-name>
SVC="${1:?Usage: $0 <service-name>}"
echo "=== systemctl status ==="
systemctl status "$SVC" --no-pager -l
echo ""
echo "=== Recent journal entries (last 100 lines) ==="
journalctl -u "$SVC" --no-pager -n 100
echo ""
echo "=== Unit file ==="
systemctl cat "$SVC"
echo ""
echo "=== Unit properties (ExecStart, User, WorkingDirectory, Restart) ==="
# Note: systemctl show reports the interval under the property name StartLimitIntervalUSec
systemctl show "$SVC" -p ExecStart -p User -p Group -p WorkingDirectory \
  -p Restart -p StartLimitBurst -p StartLimitIntervalUSec \
  -p MemoryMax -p LimitNOFILE -p Environment -p EnvironmentFiles
echo ""
echo "=== Dependencies ==="
systemctl list-dependencies "$SVC" --no-pager
echo ""
echo "=== Binary existence and permissions ==="
# ExecStart is reported as "{ path=/usr/sbin/nginx ; argv[]=... }", so extract the path= field
EXEC=$(systemctl show "$SVC" -p ExecStart --value | sed -n 's/^{ path=\([^ ]*\) .*/\1/p' | head -n 1)
if [ -n "$EXEC" ]; then
  echo "ExecStart binary: $EXEC"
  ls -la "$EXEC" 2>/dev/null || echo "ERROR: binary not found at $EXEC"
  ldd "$EXEC" 2>/dev/null | grep 'not found' && echo "WARNING: missing shared libraries"
fi
echo ""
echo "=== Failed units overview ==="
systemctl --failed --no-pager
echo ""
echo "=== systemd-analyze verify ==="
systemd-analyze verify "$SVC" 2>&1 || true
# Quick fix commands (uncomment as needed):
# sudo systemctl reset-failed "$SVC" # clear stuck failed state
# sudo systemctl daemon-reload # reload after unit file edits
# sudo systemctl start "$SVC" # attempt restart
# sudo journalctl -xe -u "$SVC" --no-pager # full log with catalog messages
Error Medic Editorial
Error Medic Editorial is a team of senior DevOps engineers and SRE practitioners with combined experience across hyperscaler infrastructure, on-premise Linux fleets, and container platforms. Our guides are written from production incident post-mortems and peer-reviewed against official documentation before publication.