Red Hat Bugzilla – Bug 842635
[RHEV] When there is no connection between the host and the storage domain, the host reboots.
Last modified: 2012-10-24 11:34:20 EDT
Created attachment 599980
I have a host which is the SPM, and I am blocking the connection between this host and the Storage Domain. The host reboots within a minute or so.
According to David, when the connection to storage is lost, sanlock tries to kill all the pids using it. If it can't kill them in time, the host is reset by the watchdog. For some reason, the root sanlock helper process was killed. Sanlock has to use the helper process to do the kill(). If the helper is not there to kill the pids, the watchdog resets the host.
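To make the sequence described above easier to follow, here is a minimal Python sketch of the decision only: kill the pids before a deadline, otherwise the watchdog resets the host. This is not sanlock's code. The function name release_or_reset, the try_kill callback, and the deadline and poll values are all hypothetical and chosen for illustration.

```python
import time
from typing import Callable, Iterable


def release_or_reset(
    pids: Iterable[int],
    try_kill: Callable[[int], bool],
    deadline_s: float,
    poll_s: float = 0.1,
) -> str:
    """Model of the behaviour described in this report (illustrative only).

    try_kill(pid) is a hypothetical stand-in for the helper process: it
    returns True once the pid is gone, and False if the kill did not happen
    (for example because the helper itself is missing).
    """
    remaining = set(pids)
    start = time.monotonic()
    while remaining:
        # Deadline passed with pids still holding storage: in the real
        # system this is the point where the watchdog would reset the host.
        if time.monotonic() - start >= deadline_s:
            return "watchdog_reset"
        # Keep only the pids that could not be killed on this pass.
        remaining = {pid for pid in remaining if not try_kill(pid)}
        if remaining:
            time.sleep(poll_s)
    return "released"
```

With a try_kill that always fails, which models the missing helper, the function returns "watchdog_reset" once the deadline expires. With one that always succeeds, it returns "released".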
As I said above, all I did was block the connection between the host and the SD, so the problem appears to be that the sanlock root helper process is being killed.
vdsm and sanlock logs are attached.
After looking more closely, Federico and I found that it was not related to killing the helper process, but instead was related to an unmount from vdsm being stuck or taking too long.
Can we close this bz? I don't think this was a bug.