We want systemd to monitor vdsm and restart it if it is not responding. We would like to use systemd.daemon.notify to let systemd know that vdsm is up and running.
Adding more info from the discussion on the vdsm call.

Systemd notify provides two important mechanisms that we would like to use:

- startup completion detection: vdsmd will notify systemd when it has started and is ready to accept requests. This will help other services (e.g. mom, hosted engine agent) to communicate with vdsmd without needing to handle "connection refused" errors.
- watching vdsmd hangs: vdsmd will notify the systemd watchdog periodically. If vdsmd stops notifying the watchdog because of a deadlock, a complete process hangup, or some other critical error, systemd will restart vdsmd. If vdsmd is blocked in D state and cannot be restarted, we will have logs about it in the journal.

General solution:

1. Use Type=notify in vdsmd.service (READY=1)
2. Notify systemd via the systemd.notify python module after vdsmd has started to listen on the vdsmd port.
3. Specify WatchdogSec in vdsmd.service
4. Add a health thread, checking vdsm subsystems periodically. If all subsystems are healthy, notify the systemd watchdog using the systemd.notify python module (WATCHDOG=1). If one of the subsystems is considered unhealthy, avoid notifying systemd, triggering a vdsm restart.

Related docs:

- https://www.freedesktop.org/software/systemd/man/systemd.service.html#Type=
- https://www.freedesktop.org/software/systemd/man/systemd.service.html#WatchdogSec=
- http://man7.org/linux/man-pages/man3/sd_notify.3.html
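To make steps 2 and 4 more concrete, here is a rough sketch of how the notification and the health thread could look. This is not vdsm code. Instead of the systemd python module it talks to the socket named in NOTIFY_SOCKET directly (the protocol described in sd_notify(3)), so it needs only the standard library. The names sd_notify, HealthMonitor and the checks list are made up for illustration, and how vdsm would actually decide that a subsystem is healthy is left open.

```python
import os
import socket
import threading


def sd_notify(state, environ=os.environ):
    """Send one notification datagram to systemd.

    Returns False when NOTIFY_SOCKET is not set, i.e. the process was
    not started by systemd with Type=notify.
    """
    addr = environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        # A leading "@" denotes an abstract namespace socket, which is
        # addressed with a leading NUL byte.
        addr = b"\0" + addr[1:].encode("utf-8")
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), addr)
    return True


class HealthMonitor(threading.Thread):
    """Hypothetical health thread: pings the watchdog only when healthy."""

    def __init__(self, checks, interval, notify=sd_notify):
        super().__init__(name="health", daemon=True)
        self._checks = checks      # callables returning True when healthy
        self._interval = interval  # seconds between checks
        self._notify = notify
        self._stop_event = threading.Event()

    def run(self):
        # wait() returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self._interval):
            self.check_once()

    def check_once(self):
        if all(check() for check in self._checks):
            self._notify("WATCHDOG=1")
            return True
        # Unhealthy: stay silent so that systemd restarts the service
        # once WatchdogSec expires.
        return False

    def stop(self):
        self._stop_event.set()
```

With something like this, vdsmd would call sd_notify("READY=1") once it is listening on its port, then start HealthMonitor(checks, interval). The interval should be well below WatchdogSec; systemd exports the configured timeout to the service as WATCHDOG_USEC, and the usual recommendation is to notify at about half of that value.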
Please also consider watching supervdsmd as suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1666123#c23. And maybe other host daemons like ovirt-ha.
We didn't get to this bug for more than 2 years, and it's not being considered for the upcoming 4.4. It's unlikely that it will ever be addressed, so I suggest closing it. If you feel this needs to be addressed and want to work on it, please remove cond nack and target accordingly.
Closing old bug. Please reopen if still relevant/you want to work on it.