| Summary: | virt-who loses reading guest info upon libvirtd restarts | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Keqin Hong <khong> | |
| Component: | virt-who | Assignee: | Radek Novacek <rnovacek> | |
| Status: | CLOSED ERRATA | QA Contact: | Entitlement Bugs <entitlement-bugs> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 6.3 | CC: | apevec, cshao, dallan, gouyang, jsefler, leiwang, mkhusid, moli, ovasik, rvokal, ycui | |
| Target Milestone: | beta | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | virt-who-0.6-1.el6 | Doc Type: | Bug Fix | |
| Doc Text: |
Cause: when libvirt daemon stops virt-who no longer gets information about guest state
Consequence: List of guest UUIDs is not accurate
Fix: Use polling to check connection to libvirtd
Result: The interval when the list of UUIDs is wrong is minimized
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 756380 (view as bug list) | Environment: | ||
| Last Closed: | 2012-06-20 14:14:24 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 756380 | |||
|
Description
Keqin Hong
2011-10-14 07:43:28 UTC
I can confirm this behavior. The problem is that the libvirt API doesn't support events about its state, so virt-who is not able to detect that libvirtd dies (or restarts). Running virt-who with interval checks is affected too, but the fix is simple in that case.

Since RHEL 6.2 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

(In reply to comment #2)
> Problem is that libvirt API doesn't support events
> about it state, so virt-who is not able to detect that libvirtd dies (or
> restarts).

Adding Dave Allan to comment on this from the libvirt perspective. But Radek, don't you get an exception when the libvirt connection is lost? In a simple test, after a libvirtd restart I got:

libvir: RPC error : Cannot write data: Broken pipe
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2858, in listDomainsID
    if ret is None: raise libvirtError ('virConnectListDomainsID() failed', conn=self)
libvirt.libvirtError: Cannot write data: Broken pipe

I'm guessing that virt-who is totally event driven and thus no API calls are being made ordinarily, is that right? To confirm that your connection is alive, using it, as Alan points out in comment 5, will tell you instantly whether the daemon that you connected to is still there. Would that work for you? BTW, I don't disagree with sending events notifying clients of daemon shutdown, but if the daemon dies really suddenly (e.g., is KILL'd) it may not have the opportunity to do so.

Dave is correct, virt-who in background mode (-b argument) waits for events from libvirt and doesn't make any API calls itself. When libvirtd is restarted, no more events are received. I think the best solution for this case is to do both event-driven guest change checking and interval checking (with a longer period). Event-driven checking will ensure that a guest change is reported immediately, but if something goes wrong, the interval checking will make sure that virt-who won't break permanently. How long should the default interval be in this case? One hour?

(In reply to comment #9)
> How long should be the default interval in this case? One hour?

Events will give you accurate information about whether guests have changed, so really all you have to do is to call any API to verify that the connection is alive. virConnectGetLibVersion is pretty lightweight and would be a good candidate. You can call that pretty much as often as you want; once a minute would be negligible load.

I have made the suggested changes, see upstream commit: https://fedorahosted.org/virt-who/changeset/15d839c36e9aedd4d653833d6a713acd6a2f60ba
Virt-who should now handle various errors (like network problems, wrong configuration etc.) more correctly.

virt-who version 0.6-1.el6 fixes this issue.
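
The liveness check suggested in the discussion above (keep the event-driven updates, and periodically call a cheap API such as virConnectGetLibVersion to notice a dead connection) could look roughly like the sketch below. This is an illustration only, not the code from the upstream commit: the ConnectionMonitor class, its parameters, and the on_reconnect callback are hypothetical names, and the connection object is assumed only to offer a getLibVersion() method, as libvirt-python's virConnect does.

```python
import time


class ConnectionMonitor(object):
    """Poll a libvirt-like connection and reopen it when a call fails.

    Hypothetical helper for illustration; not taken from virt-who.
    """

    def __init__(self, connect, on_reconnect, interval=60.0):
        self.connect = connect            # callable returning a new connection
        self.on_reconnect = on_reconnect  # called with each reopened connection
        self.interval = interval          # seconds between liveness checks
        self.conn = connect()

    def check(self):
        """Return True if the connection answered, False otherwise."""
        try:
            # Any lightweight call will do; in libvirt-python this wraps
            # virConnectGetLibVersion.
            self.conn.getLibVersion()
            return True
        except Exception:
            # Assumption: with real libvirt this would be libvirt.libvirtError
            # (e.g. "Cannot write data: Broken pipe").
            try:
                self.conn = self.connect()
            except Exception:
                # Daemon is still down; try again on the next check.
                return False
            # Let the caller re-register event handlers and refresh the
            # guest list on the new connection.
            self.on_reconnect(self.conn)
            return False

    def run(self, iterations=None, sleep=time.sleep):
        """Check the connection every `interval` seconds."""
        count = 0
        while iterations is None or count < iterations:
            self.check()
            sleep(self.interval)
            count += 1
```

With libvirt-python installed, connect could be something like lambda: libvirt.openReadOnly(None), and on_reconnect would be the place to register domain event callbacks again and resend the guest UUID list. A one-minute interval matches the "once a minute would be negligible load" estimate from the thread.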
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: when libvirt daemon stops virt-who no longer gets information about guest state
Consequence: List of guest UUIDs is not accurate
Fix: Use polling to check connection to libvirtd
Result: The interval when the list of UUIDs is wrong is minimized
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0900.html |