While installing oVirt hosted engine, sanlock is reconfigured by vdsm-tool as part of the process. Looking at the logs (http://ur1.ca/gmf4p), I can see:

Feb 13 15:24:53 localhost systemd-sanlock[8979]: Waiting for sanlock (3615) to stop:[FAILED]
Feb 13 15:26:23 localhost systemd[1]: sanlock.service stopping timed out. Killing.
Feb 13 15:26:23 localhost wdmd[625]: client dead ci 2 fd 9 pid 3615 renewal 6122 expire 6202 sanlock_83b03f85-5e6c-426d-8fc3-7626ff181d90:1
Feb 13 15:27:27 localhost kernel: [ 6200.608165] watchdog watchdog0: watchdog did not stop!
Feb 13 15:27:27 localhost wdmd[625]: /dev/watchdog0 closed unclean

So systemd is killing the sanlock process, which causes the watchdog to issue a reboot. This should be avoided.
OK, I'll search through the systemd documentation for an option to prevent that. It is probably worth checking if vdsm could clean up all its sanlock lockspaces before it tries to stop sanlock.
The systemd documentation here: http://www.freedesktop.org/software/systemd/man/systemd.service.html#TimeoutStopSec= seems to say that setting TimeoutStopSec=0 will do what we want. However, it doesn't work in practice: systemctl stop sanlock still eventually sends SIGKILL. I plan to add this setting to both

/usr/lib/systemd/system/sanlock.service
/usr/lib/systemd/system/wdmd.service

and file a bug against systemd.
I was testing this on RHEL7 with systemd-207-11.el7.x86_64
opened bug 1065493 against RHEL7 systemd.
I'm submitting a patch for having wdmd.service and sanlock.service be plain unit files without relying much on the sysV scripts. They'll be configured with SendSIGKILL=No so that the error above doesn't happen.
This message is a notice that Fedora 19 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 19. It is Fedora's policy to close all bug reports from releases that are no longer maintained. Approximately 4 (four) weeks from now this bug will be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 19 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
(In reply to David Teigland from comment #4)
> opened bug 1065493 against RHEL7 systemd.

This was closed as NOTABUG.

(In reply to Antoni Segura Puimedon from comment #5)
> I'm submitting a patch for having wdmd.service and sanlock.service be plain
> unit files without relying much on the sysV scripts. They'll be configured
> with SendSIGKILL=No so that the error above doesn't happen.

There's no external tracker attached to this BZ, and I did not find any such patch in git (which doesn't necessarily mean it's not there, of course).

Nir/David - what are our next steps?

(Also, tentatively moving this bug to F21 so it isn't mistakenly closed.)
I believe the addition of "SendSIGKILL=no" to the systemd unit file fixed this.
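For anyone on a build whose unit file predates that change, here is a minimal sketch of the same setting applied as a local drop-in. The drop-in path and file name are my assumptions for illustration and are not taken from the shipped package; the fix referred to above was made in the packaged unit file itself. SendSIGKILL= is documented in systemd.kill(5).

```ini
# Hypothetical drop-in: /etc/systemd/system/sanlock.service.d/no-sigkill.conf
# (path and file name are assumptions; the packaged fix edits sanlock.service directly)
[Service]
# Do not follow up with SIGKILL when the stop timeout expires, so sanlock
# is not killed while it still holds a lockspace and the watchdog is armed.
SendSIGKILL=no
```

After adding the file, run systemctl daemon-reload so systemd picks it up. An equivalent drop-in for wdmd.service would presumably be wanted as well, since both units were mentioned in comment #5.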