Created attachment 742108
vdsm.log and libvirtd.log

Description of problem:
When the connection to libvirt is broken, vdsm does not initiate prepareForShutdown. This happens when there are no running VMs on the host.

Version-Release number of selected component (if applicable):
vdsm-4.10.2-16.0.el6ev.x86_64
libvirt-0.10.2-18.el6_4.4.x86_64

How reproducible:
100%

Steps to Reproduce:
On a host that does not run any VMs:
1. Kill libvirt with SIGABRT:
   # kill -6 (libvirt_pid)
2. Run 'vdsClient -s 0 getVdsCaps'

Actual results:
The host answers getVdsCaps with 'unexpected exception'.
vdsm still answers getVdsStats, which implies that the host will not enter the 'non-responsive' state and will not initiate prepareForShutdown.

Expected results:
The host should initiate prepareForShutdown, as it does when VMs are running.

Additional info:
See the attached logs.
This issue has been with us since rhev-3.0. rhev-3.2 is now closed for such improvements. Requesting rhev-3.3.
I created a patch that makes sure prepareForShutdown is called whenever the libvirt connection is broken; this is done by killing the vdsm process. However, there is still a problem with the above scenario: the vdsm instance that comes up after prepareForShutdown does not restart libvirt. It seems that our service takes for granted that once the libvirt service has been started it will respawn itself, which it does not. This could be due to a libvirt bug or a change of behaviour, or maybe our assumption is wrong. Will investigate further.
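The approach described above (treat a broken libvirt connection as fatal on any call, whether or not VMs are running) can be sketched roughly as follows. This is only an illustration, not the actual patch: `ConnectionBroken`, `terminate_self` and `guard_connection` are hypothetical names, and the sketch assumes that sending SIGTERM to the process is what leads to prepareForShutdown and a restart by the service manager.

```python
import functools
import os
import signal


class ConnectionBroken(Exception):
    """Hypothetical stand-in for a libvirt error meaning the connection is gone."""


def terminate_self():
    # Assumption: SIGTERM makes the daemon run its normal shutdown path
    # (prepareForShutdown in vdsm's case) before it gets restarted.
    os.kill(os.getpid(), signal.SIGTERM)


def guard_connection(on_broken=terminate_self):
    """Wrap a call so that a broken connection triggers on_broken, then re-raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except ConnectionBroken:
                # The check runs on every wrapped call, so it does not
                # depend on any VM being active on the host.
                on_broken()
                raise
        return wrapper
    return decorator
```

Because the guard sits on the call itself, a request such as getVdsCaps on an idle host would hit it too, which is the case this bug is about. The `on_broken` hook is injectable only so the sketch can be exercised without killing the calling process.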
Tested downstream - libvirt re-spawns just fine. (libvirt version:0.10.2, vdsm version:4.10.2)
*** Bug 1022021 has been marked as a duplicate of this bug. ***
Works in is20.

[root@dell-r210ii-06 ~]# rpm -q vdsm
vdsm-4.13.0-0.3.beta1.el6ev.x86_64
[root@dell-r210ii-06 ~]# rpm -q libvirt
libvirt-0.10.2-29.el6.x86_64
[root@dell-r210ii-06 ~]# date && pgrep libvirt && pkill -6 libvirt && sleep 5 && date && pgrep libvirt
Tue Oct 22 16:42:45 CEST 2013
15092
Tue Oct 22 16:42:50 CEST 2013
15174
[root@dell-r210ii-06 ~]# vdsClient -s 0 getVdsCaps
Unexpected exception
[root@dell-r210ii-06 ~]# date
Tue Oct 22 16:43:10 CEST 2013
[root@dell-r210ii-06 ~]# vdsClient -s 0 getVdsCaps
        HBAInventory = {'FC': [], 'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:d5e5b4cb74d'}]}
        ISCSIInitiatorName = 'iqn.1994-05.com.redhat:d5e5b4cb74d'
        bondings = {'bond0': {'addr': '', ....
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-0040.html