Description of problem:
The virt-who daemon stops receiving any information from libvirt after libvirtd restarts.

Version-Release number of selected component (if applicable):
virt-who-0.3-3.el6.noarch

How reproducible:
always

Steps to Reproduce:
1. Ensure there are three VMs installed by virt-manager/libvirt.
2. Enable virt-who debug:
   # vi /etc/sysconfig/virt-who
   VIRTWHO_BACKGROUND=1
   VIRTWHO_DEBUG=1
3. Start the virt-who daemon:
   # service virt-who restart
   Virtual machine found: sn2-s-64: cf3abef2-4b92-aab0-1b04-e2d774f73494
   Virtual machine found: 1005.1-s-64: 063a46e5-1e50-81e5-9ae4-94f45f41d79e
   Virtual machine found: 1005.1-s-64-clone: d2464bf3-b0c4-e2f7-1462-31f2ca63706b
   Sending update to updateConsumer: ['063a46e5-1e50-81e5-9ae4-94f45f41d79e', 'cf3abef2-4b92-aab0-1b04-e2d774f73494', 'd2464bf3-b0c4-e2f7-1462-31f2ca63706b']
   Entering infinite loop
4. Restart the libvirtd service:
   # service libvirtd restart
5. Destroy or add a VM.

Actual results:
No guest info change is reported by virt-who after step 5.

Expected results:
virt-who survives a libvirtd restart and keeps reporting guest changes.

Additional info:
I can confirm this behavior. The problem is that the libvirt API doesn't provide events about its own state, so virt-who is not able to detect that libvirtd has died (or restarted). Running virt-who with interval checks is affected too, but the fix is simple in that case.
Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
(In reply to comment #2)
> Problem is that libvirt API doesn't support events
> about it state, so virt-who is not able to detect that libvirtd dies (or
> restarts).

Adding Dave Allan to comment on this from the libvirt perspective.

But Radek, don't you get an exception when the libvirt connection is lost? In a simple test, after a libvirtd restart I got:

libvir: RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2858, in listDomainsID
    if ret is None: raise libvirtError ('virConnectListDomainsID() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe
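A minimal sketch of how a client could treat that exception as a "connection lost" signal. This is not virt-who's code: the helper name list_domain_ids is hypothetical, and the stand-in exception class exists only so the sketch runs where the libvirt bindings are not installed.

```python
try:
    from libvirt import libvirtError
except ImportError:
    # Assumption: libvirt bindings are unavailable; use a stand-in so the
    # sketch still runs. With the real bindings this branch is skipped.
    class libvirtError(Exception):
        """Stand-in for libvirt.libvirtError."""


def list_domain_ids(conn):
    """Return the IDs of running domains, or None if the call failed.

    `conn` is any object with a listDomainsID() method, such as the
    connection returned by libvirt.open().
    """
    try:
        return conn.listDomainsID()
    except libvirtError:
        # After a libvirtd restart the old connection fails with an error
        # such as "Cannot write data: Broken pipe".
        return None
```

A caller that receives None would then know it has to open a new connection before it can trust any further results.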
I'm guessing that virt-who is totally event driven and thus no API calls are being made ordinarily, is that right? To confirm that your connection is alive, using it, as Alan points out in comment 5, will tell you instantly whether the daemon that you connected to is still there. Would that work for you?
BTW, I don't disagree with sending events notifying clients of daemon shutdown, but if the daemon dies really suddenly (e.g., is KILL'd) it may not have the opportunity to do so.
Dave is correct: virt-who in background mode (-b argument) waits for events from libvirt and doesn't make any API calls itself. When libvirtd is restarted, no more events are received. I think the best solution for this case is to do both event-driven guest change checking and interval checking (with a longer period). Event-driven checking will ensure that a guest change is reported immediately, and if something goes wrong, the interval checking will make sure that virt-who won't break permanently. How long should the default interval be in this case? One hour?
(In reply to comment #9)
> How long should be the default interval in this case? One hour?

Events will give you accurate information about whether guests have changed, so really all you have to do is call any API to verify that the connection is alive. virConnectGetLibVersion is pretty lightweight and would be a good candidate. You can call that pretty much as often as you want; once a minute would be negligible load.
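The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not the change that went into virt-who: the names connection_alive and keepalive, the connect and on_reconnect callbacks, and the rounds and sleep parameters are all hypothetical. The only libvirt call used is getLibVersion(), the Python binding for virConnectGetLibVersion.

```python
import time

try:
    from libvirt import libvirtError
except ImportError:
    # Assumption: libvirt bindings are unavailable; use a stand-in so the
    # sketch still runs. With the real bindings this branch is skipped.
    class libvirtError(Exception):
        """Stand-in for libvirt.libvirtError."""


def connection_alive(conn):
    """Probe the connection with a cheap API call."""
    try:
        conn.getLibVersion()
        return True
    except libvirtError:
        return False


def keepalive(connect, on_reconnect, interval=60, rounds=None, sleep=time.sleep):
    """Poll the connection every `interval` seconds and reconnect if it died.

    connect      -- callable returning a new connection (hypothetical hook)
    on_reconnect -- called with the new connection, e.g. to re-register
                    event callbacks and resend the guest list
    rounds       -- number of polls to run; None means loop forever
    """
    conn = connect()
    done = 0
    while rounds is None or done < rounds:
        sleep(interval)
        done += 1
        if connection_alive(conn):
            continue
        try:
            conn = connect()
        except libvirtError:
            continue  # daemon still down; try again on the next tick
        on_reconnect(conn)
    return conn
```

With the real bindings, connect might be something like `lambda: libvirt.open("qemu:///system")` (the URI here is an assumption). Events still deliver guest changes immediately; the poll only bounds how long a dead connection can go unnoticed.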
I have made the suggested changes; see upstream commit: https://fedorahosted.org/virt-who/changeset/15d839c36e9aedd4d653833d6a713acd6a2f60ba Virt-who should now handle various errors (network problems, wrong configuration, etc.) more correctly.
virt-who version 0.6-1.el6 fixes this issue.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: When the libvirt daemon stops, virt-who no longer gets information about guest state.
Consequence: The list of guest UUIDs is not accurate.
Fix: Use polling to check the connection to libvirtd.
Result: The interval during which the list of UUIDs is wrong is minimized.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0900.html