Bug 958367

Summary: [vdsm] prepareForShutdown is not called when connection to libvirt is broken [with no running vms]
Product: Red Hat Enterprise Virtualization Manager Reporter: Elad <ebenahar>
Component: vdsm Assignee: Mooli Tayer <mtayer>
Status: CLOSED ERRATA QA Contact: Martin Pavlik <mpavlik>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.1.3 CC: acathrow, bazulay, danken, gklein, iheim, jkt, lpeer, mpavlik, mtayer, pstehlik, yeylon
Target Milestone: --- Keywords: Triaged
Target Release: 3.3.0   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: is8 Doc Type: Bug Fix
Doc Text:
When VDSM's connection to libvirt was broken, VDSM did not initiate the shutting down of the hosts when there were no running virtual machines. This update removes the clientIF instance dependency from libvirtconnection, and sends a SIGTERM to the VDSM process which triggers the host shutdown.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-01-21 16:06:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 981974    
Bug Blocks:    
Attachments:
Description Flags
vdsm.log and libvirtd.log none

Description Elad 2013-05-01 07:47:15 UTC
Created attachment 742108
vdsm.log and libvirtd.log

Description of problem:

When the connection to libvirt is broken, vdsm does not initiate prepareForShutdown. This happens when there are no running VMs on that host.

Version-Release number of selected component (if applicable):

vdsm-4.10.2-16.0.el6ev.x86_64
libvirt-0.10.2-18.el6_4.4.x86_64


How reproducible:
100%

Steps to Reproduce:

On a host that does not run any VMs:

1. Kill libvirt with SIGABRT:      # kill -6 (libvirt_pid)
2. Run 'vdsClient -s 0 getVdsCaps'
  
Actual results:

The host answers getVdsCaps with 'unexpected exception'.
vdsm still answers getVdsStats, which implies that the host won't enter 'non-responsive' and will not initiate prepareForShutdown.


Expected results:

The host should initiate prepareForShutdown, as it does when there are running VMs.


Additional info: see logs attached

Comment 1 Dan Kenigsberg 2013-05-01 12:02:25 UTC
This issue has been with us since rhev-3.0. rhev-3.2 is now closed for such improvements. Requesting rhev-3.3.

Comment 3 Mooli Tayer 2013-07-03 11:38:57 UTC
I created a patch that makes sure prepareForShutdown is called whenever the libvirt connection is broken. This is done by killing the vdsm process.

However, there is still a problem with the above scenario:
when vdsm comes back up after prepareForShutdown, it does not restart libvirt.

It seems that in our service we take for granted that once we start the libvirt service it should respawn itself, which it does not. This could be due to a libvirt bug or a change of behaviour, or maybe our assumption is wrong.

Will investigate further.
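
For illustration only, the approach described in this comment (send SIGTERM to the vdsm process when the libvirt connection breaks, so that the regular signal handler runs prepareForShutdown) could be sketched roughly as follows. This is not the actual patch: LibvirtConnectionBroken, guard_libvirt_call and install_shutdown_handler are hypothetical names invented for this sketch, and the real code would have to detect the broken connection from the errors libvirt reports.

```python
import os
import signal


class LibvirtConnectionBroken(Exception):
    """Hypothetical stand-in for the error seen when libvirt is unreachable."""


def guard_libvirt_call(func, kill=os.kill):
    """Wrap func so that a broken connection sends SIGTERM to this process.

    kill is injectable so the behaviour can be exercised without
    actually signalling the running process.
    """
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except LibvirtConnectionBroken:
            # No reference to a clientIF-like object is needed here:
            # the SIGTERM handler owns the shutdown logic.
            kill(os.getpid(), signal.SIGTERM)
            raise
    return wrapper


def install_shutdown_handler(prepare_for_shutdown):
    """Call prepare_for_shutdown when the process receives SIGTERM."""
    def handler(signum, frame):
        prepare_for_shutdown()
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGTERM, handler)
```

With this shape the shutdown path does not depend on whether any VMs are running: any guarded libvirt call that hits a broken connection triggers the same SIGTERM path.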

Comment 4 Mooli Tayer 2013-07-03 13:23:16 UTC
Tested downstream - libvirt re-spawns just fine.  
(libvirt version:0.10.2, vdsm version:4.10.2)

Comment 5 Martin Pavlik 2013-10-22 14:10:04 UTC
*** Bug 1022021 has been marked as a duplicate of this bug. ***

Comment 7 Martin Pavlik 2013-10-22 15:00:26 UTC
works in is20

[root@dell-r210ii-06 ~]# rpm -q vdsm
vdsm-4.13.0-0.3.beta1.el6ev.x86_64

[root@dell-r210ii-06 ~]# rpm -q libvirt
libvirt-0.10.2-29.el6.x86_64

[root@dell-r210ii-06 ~]# date && pgrep libvirt && pkill -6 libvirt && sleep 5 && date && pgrep libvirt 
Tue Oct 22 16:42:45 CEST 2013
15092
Tue Oct 22 16:42:50 CEST 2013
15174

[root@dell-r210ii-06 ~]# vdsClient -s 0 getVdsCaps
Unexpected exception

[root@dell-r210ii-06 ~]# date
Tue Oct 22 16:43:10 CEST 2013

[root@dell-r210ii-06 ~]# vdsClient -s 0 getVdsCaps
	HBAInventory = {'FC': [], 'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:d5e5b4cb74d'}]}
	ISCSIInitiatorName = 'iqn.1994-05.com.redhat:d5e5b4cb74d'
	bondings = {'bond0': {'addr': '',
....
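
The respawn check done by hand above (record the libvirt pid, kill it, wait, and look for a different pid) could be scripted along these lines. This is only a sketch: wait_for_respawn is a hypothetical helper, and the get_pid callable is an assumption; on a real host it might wrap a pgrep call.

```python
import time


def wait_for_respawn(get_pid, old_pid, timeout=10.0, interval=0.5,
                     sleep=time.sleep):
    """Poll get_pid() until it returns a pid different from old_pid.

    Returns the new pid, or None if no new pid shows up within timeout
    seconds. get_pid should return None while the process is absent.
    """
    waited = 0.0
    while waited <= timeout:
        pid = get_pid()
        if pid is not None and pid != old_pid:
            return pid
        sleep(interval)
        waited += interval
    return None
```

A None result would correspond to the case suspected in comment 3, where libvirt does not come back after being killed.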

Comment 8 Charlie 2013-11-28 00:33:50 UTC
This bug is currently attached to errata RHBA-2013:15291. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.
* Consequence: What happens when the bug presents.
* Fix: What was done to fix the bug.
* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes 

Thanks in advance.

Comment 9 errata-xmlrpc 2014-01-21 16:06:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html