Bug 746163

Summary: virt-who loses reading guest info upon libvirtd restarts
Product: Red Hat Enterprise Linux 6
Reporter: Keqin Hong <khong>
Component: virt-who
Assignee: Radek Novacek <rnovacek>
Status: CLOSED ERRATA
QA Contact: Entitlement Bugs <entitlement-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 6.3
CC: apevec, cshao, dallan, gouyang, jsefler, leiwang, mkhusid, moli, ovasik, rvokal, ycui
Target Milestone: beta   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: virt-who-0.6-1.el6
Doc Type: Bug Fix
Doc Text:
Cause: When the libvirt daemon stops, virt-who no longer gets information about guest state.
Consequence: The list of guest UUIDs is not accurate.
Fix: Use polling to check the connection to libvirtd.
Result: The interval during which the list of UUIDs is wrong is minimized.
Story Points: ---
Clone Of:
: 756380
Environment:
Last Closed: 2012-06-20 14:14:24 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:    
Bug Blocks: 756380    

Description Keqin Hong 2011-10-14 07:43:28 UTC
Description of problem:
The virt-who daemon stops receiving any info from libvirt after libvirtd restarts.

Version-Release number of selected component (if applicable):
virt-who-0.3-3.el6.noarch

How reproducible:
always

Steps to Reproduce:
1. ensure there are three VMs installed via virt-manager/libvirt
2. enable virt-who debug
# vi /etc/sysconfig/virt-who
VIRTWHO_BACKGROUND=1
VIRTWHO_DEBUG=1
3. start virt-who daemon
# service virt-who restart
Virtual machine found: sn2-s-64: cf3abef2-4b92-aab0-1b04-e2d774f73494
Virtual machine found: 1005.1-s-64: 063a46e5-1e50-81e5-9ae4-94f45f41d79e
Virtual machine found: 1005.1-s-64-clone: d2464bf3-b0c4-e2f7-1462-31f2ca63706b
Sending update to updateConsumer: ['063a46e5-1e50-81e5-9ae4-94f45f41d79e', 'cf3abef2-4b92-aab0-1b04-e2d774f73494', 'd2464bf3-b0c4-e2f7-1462-31f2ca63706b']
Entering infinite loop
4. restart libvirtd service
# service libvirtd restart
5. destroy/add a VM

  
Actual results:
No guest info changes are reported by virt-who after step 5.

Expected results:
virt-who survives a libvirtd restart and keeps reporting guest changes.

Additional info:

Comment 2 Radek Novacek 2011-10-14 13:22:17 UTC
I can confirm this behavior. The problem is that the libvirt API doesn't support events about its state, so virt-who is not able to detect that libvirtd has died (or restarted).

Running virt-who with interval checks is affected too, but the fix is simple in that case.

Comment 3 RHEL Program Management 2011-10-18 18:40:31 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 5 Alan Pevec 2011-10-27 19:25:42 UTC
(In reply to comment #2)
> Problem is that libvirt API doesn't support events
> about it state, so virt-who is not able to detect that libvirtd dies (or
> restarts).

Adding Dave Allan to comment on this from the libvirt perspective.

But Radek, don't you get an exception when the libvirt connection is lost?
In a simple test, after a libvirtd restart I got:
libvir: RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2858, in listDomainsID
    if ret is None: raise libvirtError ('virConnectListDomainsID() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe
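
The exception in the traceback above can be caught to notice a lost connection. What follows is only a minimal sketch, assuming Python 3 and a connection object that exposes listDomainsID() as seen in the traceback. The helper name list_domain_ids_or_none and its error_type parameter are hypothetical; with the real bindings one would presumably pass libvirt.libvirtError.

```python
def list_domain_ids_or_none(conn, error_type=Exception):
    """Return the running domain IDs, or None if the connection looks dead.

    conn is any object with a listDomainsID() method (for example a
    libvirt.virConnect). error_type is the exception class treated as a
    lost connection; with the real bindings that would be
    libvirt.libvirtError (assumption: libvirt is not imported here).
    """
    try:
        return conn.listDomainsID()
    except error_type as exc:
        # e.g. "Cannot write data: Broken pipe" after a libvirtd restart
        print("libvirt connection appears to be lost: %s" % exc)
        return None
```

A caller that gets None back could then reopen the connection instead of continuing to wait for events that will never arrive.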

Comment 6 Dave Allan 2011-10-27 20:09:34 UTC
I'm guessing that virt-who is totally event driven and thus no API calls are being made ordinarily, is that right?  To confirm that your connection is alive, just use it: as Alan points out in comment 5, making an API call will tell you immediately whether the daemon you connected to is still there.  Would that work for you?

Comment 7 Dave Allan 2011-10-27 20:11:36 UTC
BTW, I don't disagree with sending events notifying clients of daemon shutdown, but if the daemon dies really suddenly (e.g., is KILL'd) it may not have the opportunity to do so.

Comment 9 Radek Novacek 2011-10-31 09:04:55 UTC
Dave is correct, virt-who in background mode (-b argument) waits for events from libvirt and doesn't make any API calls itself. When libvirtd is restarted, no more events are received.

I think the best solution for this case is to do both event-driven guest change checking and interval checking (with a longer period). Event-driven checking will ensure that a guest change is reported immediately, but if something goes wrong, the interval checking will make sure that virt-who won't break permanently.
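
The combination described above (react to events immediately, fall back to a periodic check) could look roughly like the sketch below. It is not the virt-who implementation: it assumes Python 3, and the names hybrid_loop, events, list_guests and send_update are hypothetical stand-ins for the event source, the guest listing and the update call.

```python
import queue


def hybrid_loop(events, list_guests, send_update, interval, iterations):
    """Report guests on every event, and at least once per interval.

    events      -- queue.Queue that an event callback puts items into
    list_guests -- callable returning the current list of guest UUIDs
    send_update -- callable taking (uuids, reason)
    interval    -- seconds to wait for an event before checking anyway
    iterations  -- number of loop rounds (bounded so the sketch terminates)
    """
    for _ in range(iterations):
        try:
            events.get(timeout=interval)
            reason = "event"
        except queue.Empty:
            # No event arrived in time. The event source may be dead,
            # so refresh the guest list ourselves.
            reason = "interval"
        send_update(list_guests(), reason)
```

With this shape a missed or lost event delays an update by at most one interval instead of breaking reporting permanently.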

How long should the default interval be in this case? One hour?

Comment 10 Dave Allan 2011-10-31 13:24:08 UTC
(In reply to comment #9)
> How long should be the default interval in this case? One hour?

Events will give you accurate information about whether guests have changed, so really all you have to do is to call any API to verify that the connection is alive.  virConnectGetLibVersion is pretty lightweight and would be a good candidate.  You can call that pretty much as often as you want, once a minute would be negligible load.
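
A rough sketch of such a keepalive check, assuming Python 3 and a connection object with a getLibVersion() method (the Python binding for virConnectGetLibVersion). The names ensure_connection, keepalive_loop, reconnect and error_type are hypothetical; with the real bindings error_type would presumably be libvirt.libvirtError and reconnect would reopen the connection.

```python
import time


def ensure_connection(conn, reconnect, error_type=Exception):
    """Return conn if it still answers, otherwise a fresh connection."""
    try:
        conn.getLibVersion()  # cheap call, used only as a liveness probe
        return conn
    except error_type:
        # If reconnect() fails too (daemon still down), the error propagates.
        return reconnect()


def keepalive_loop(conn, reconnect, interval=60.0, rounds=None,
                   error_type=Exception, sleep=time.sleep):
    """Probe the connection every interval seconds.

    rounds bounds the loop for testing; None means run forever.
    """
    done = 0
    while rounds is None or done < rounds:
        conn = ensure_connection(conn, reconnect, error_type)
        sleep(interval)
        done += 1
    return conn
```

After a reconnect, any event callbacks would presumably need to be registered again on the new connection, since they were tied to the old one.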

Comment 12 Radek Novacek 2011-11-30 12:59:02 UTC
I have made the suggested changes, see upstream commit:

https://fedorahosted.org/virt-who/changeset/15d839c36e9aedd4d653833d6a713acd6a2f60ba

virt-who should now handle various errors (network problems, wrong configuration, etc.) more gracefully.

Comment 14 Radek Novacek 2012-03-01 15:04:43 UTC
virt-who version 0.6-1.el6 fixes this issue.

Comment 17 Radek Novacek 2012-05-02 06:18:33 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: When the libvirt daemon stops, virt-who no longer gets information about guest state.

Consequence: The list of guest UUIDs is not accurate.

Fix: Use polling to check the connection to libvirtd.

Result: The interval during which the list of UUIDs is wrong is minimized.

Comment 18 errata-xmlrpc 2012-06-20 14:14:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0900.html