stale apparmor config / mysql refuses to start

stale apparmor config / mysql refuses to start

  • Written by
    Walter Doekes
  • Published on

So, recently we had an issue with a MariaDB server that refused to start. Or, actually, it would start, but before long, SystemD would kill it. But why?

# systemctl start mariadb.service
Job for mariadb.service failed because a timeout was exceeded.
See "systemctl status mariadb.service" and "journalctl -xe" for details.

After 90 seconds, it would be killed. systemctl status mariadb.service shows the immediate cause:

# systemctl status mariadb.service
systemd[1]: mariadb.service: Start operation timed out. Terminating.
systemd[1]: mariadb.service: Main process exited, code=killed, status=15/TERM
systemd[1]: mariadb.service: Failed with result 'timeout'.

Ok, a start operation timeout. That is caused by the notify type: apparently the mysqld doesn't get a chance to tell SystemD that it has succesfully completed startup.

First, a quickfix, so we can start at all:

# cat <<EOF >/etc/systemd/system/mariadb.service.d/override.conf

That fixes so we can start — because now SystemD won't require for any "started" notification anymore — but it doesn't explain what is wrong.

Second, an attempt at debugging the cause:

# cat <<EOF >/etc/systemd/system/mariadb.service.d/override.conf
ExecStart=/usr/bin/strace -fesendmsg,sendto,connect,socket -s8192 \
  /usr/sbin/mysqld $MYSQLD_OPTS

Okay, that one showed EACCESS errors on the sendmsg() call on the /run/systemd/notify unix socket:

strace[55081]: [pid 55084] socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 46
strace[55081]: [pid 55084] sendmsg(46, {msg_name={sa_family=AF_UNIX,
  sun_path="/run/systemd/notify"}, msg_namelen=22,
  msg_iov=[{iov_base="READY=1\nSTATUS=Taking your SQL requests now...\n", iov_len=47}],
  msg_iovlen=1, msg_controllen=0, msg_flags=0},
  MSG_NOSIGNAL) = -1 EACCES (Permission denied)

Permission denied? But why?

# cat <<EOF >/etc/systemd/system/mariadb.service.d/override.conf
ExecStart=/usr/bin/strace -fesendmsg,sendto,connect,socket -s8192 \
  /bin/sh -c 'printf "READY=1\nSTATUS=Taking your SQL requests now...\n" | \
    socat - UNIX-SENDTO:/run/systemd/notify; sleep 3600

This worked:

strace[54926]: [pid 54931] socket(AF_UNIX, SOCK_DGRAM, 0) = 5
strace[54926]: [pid 54931] sendto(5,
  "READY=1\nSTATUS=Taking your SQL requests now...\n", 47, 0,
  {sa_family=AF_UNIX, sun_path="/run/systemd/notify"}, 21) = 47

(Unless someone is really trying to mess with you, you can regard sendto() and sendmsg() as equivalent here. socat simply uses the other one.)

That means that there is nothing wrong with SystemD or /run/systemd/notify. So the problem must be related to /usr/sbin/mysqld.

After looking at journalctl -u mariadb.service for the nth time, I decided to peek at all of journalctl without any filters. And there it was after all: audit logs.

# journalctl -t audit
audit[1428513]: AVC apparmor="DENIED" operation="sendmsg"
  info="Failed name lookup - disconnected path" error=-13
  profile="/usr/sbin/mysqld" name="run/systemd/notify" pid=1428513
  comm="mysqld" requested_mask="w" denied_mask="w" fsuid=104 ouid=0

(Observe the -t in the journalctl invocation above which looks for the SYSLOG_IDENTIFIER=audit key-value pair.)

Okay. And fixing it?

# aa-remove-unknown
Skipping profile in /etc/apparmor.d/disable: usr.sbin.mysqld
Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
Removing '/usr/sbin/mysqld'

A-ha! Stale cruft in /var/cache/apparmor.

# /etc/init.d/apparmor restart
Restarting apparmor (via systemctl): apparmor.service.

Finally we could undo the override.conf and everything started working as expected.

Back to overview Newer post: chromium snap / wrong fonts Older post: zfs / zvol / partition does not show up