Bug 1174226 - Reset agent error state flag after guest agent restarts, or guest restarts
Summary: Reset agent error state flag after guest agent restarts, or guest restarts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
: ---
Assignee: Peter Krempa
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-12-15 12:37 UTC by Daniel Berrangé
Modified: 2015-11-19 05:59 UTC (History)

Fixed In Version: libvirt-1.2.16-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-19 05:59:02 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2202 0 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2015-11-19 08:17:58 UTC

Description Daniel Berrangé 2014-12-15 12:37:00 UTC
Description of problem:
When libvirt has a problem parsing the reply from the QEMU guest agent, it sets an error flag ('priv->agentError = true;'). Any attempt to use the agent after that point results in:


error: Guest agent is not responding: QEMU guest agent is not available due to an error

The flag is only reset when the guest powers off. This is overly pessimistic.

QEMU issues an event whenever the guest agent channel is restarted. Libvirt should reset the error flag when it sees this event. If the guest OS does a warm reboot (e.g. 'shutdown -r now' inside the guest), libvirt should reset the error flag when it sees the RESET event from QEMU.
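
To make the requested behaviour concrete, here is a minimal sketch in Python of the flag's life cycle. It is a toy model, not libvirt code: the class and method names are hypothetical, and only the 'agent_error' field stands in for 'priv->agentError'.

```python
from dataclasses import dataclass


@dataclass
class AgentState:
    """Toy model of per-domain agent bookkeeping (all names hypothetical)."""

    agent_error: bool = False  # stands in for priv->agentError
    connected: bool = True

    def on_parse_failure(self) -> None:
        # A reply that cannot be parsed marks the agent as unusable.
        self.agent_error = True

    def on_channel_event(self, connected: bool) -> None:
        # The agent channel was opened or closed inside the guest,
        # so any earlier parse failure no longer applies.
        self.connected = connected
        self.agent_error = False

    def on_guest_reset(self) -> None:
        # Warm reboot (RESET event): the old agent process is gone.
        self.agent_error = False

    def on_power_off(self) -> None:
        # The only reset point in the behaviour reported here.
        self.connected = False
        self.agent_error = False

    def can_use_agent(self) -> bool:
        return self.connected and not self.agent_error
```

In this model a parse failure followed by either a channel event or a guest reset leaves the agent usable again, which is the expected result of this report.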

Version-Release number of selected component (if applicable):
libvirt-1.2.8-10.el7

How reproducible:
Somewhat

Steps to Reproduce:
1. Do something to the guest agent that causes libvirt to set 'agentError = true'. It is unclear exactly how to do this reliably, but I think I managed it by killing and restarting the guest agent process while it was in the middle of handling a command
2. Attempt to send a guest agent command

$ virsh -c qemu:///system  qemu-agent-command f21x86_64 '{ "execute": "guest-get-fsinfo" }'
error: Guest agent is not responding: QEMU guest agent is not available due to an error

3. Run 'shutdown -r now' in the guest
4. Wait for it to finish restarting
5. Attempt to send a guest agent command

$ virsh -c qemu:///system  qemu-agent-command f21x86_64 '{ "execute": "guest-get-fsinfo" }'
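
For anyone scripting steps 2 and 5, the following is a rough sketch of a wrapper around the same virsh invocation. The function names are made up for illustration, and the domain name and URI are simply the ones used above.

```python
import json
import subprocess


def agent_command_argv(domain, execute, uri="qemu:///system"):
    """Build the argv for 'virsh qemu-agent-command' (helper name is hypothetical)."""
    payload = json.dumps({"execute": execute})
    return ["virsh", "-c", uri, "qemu-agent-command", domain, payload]


def run_agent_command(domain, execute, uri="qemu:///system"):
    """Run the command and return (exit code, stdout, stderr)."""
    proc = subprocess.run(
        agent_command_argv(domain, execute, uri),
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Calling run_agent_command("f21x86_64", "guest-get-fsinfo") before and after the guest reboot gives the two results to compare.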


Actual results:
Still get same error

Expected results:
The guest agent is usable again.

Additional info:

Comment 1 Peter Krempa 2015-05-15 08:50:23 UTC
Upstream fixes the issue in:

commit a03e2d3a9927eb2d207dc83e3e45c2e403732330
Author: Peter Krempa <pkrempa@redhat.com>
Date:   Fri Apr 24 16:48:26 2015 +0200

    qemu: Connect to guest agent after channel hotplug
    
    If a user hot-attaches the guest agent channel libvirt would ignore it
    until the restart of libvirtd or shutdown/destroy and start of the VM
    itself.
    
    This patch adds code that opens or closes the guest agent connection
    according to the state of the guest agent channel according to
    connect/disconnect events.
    
    To allow opening the channel from the event handler qemuConnectAgent
    needed to be exported.

Comment 4 zhenfeng wang 2015-06-30 08:50:41 UTC
Hi Peter,
The patch described in comment 1 seems to address guest agent hotplug rather than the issue in the bug description. I'm not sure about this, so could you check whether I have misunderstood it, so that I can verify this bug properly? Thanks in advance.

Comment 5 Peter Krempa 2015-06-30 11:35:55 UTC
Well, the patch that I've posted in comment 1 fixes agent hotplug in such a way that it actually triggers on the guest-side connection and disconnection events on the agent socket. Thus the code catches all agent connections and disconnections regardless of how they were triggered. Since the disconnection event clears the error flag, the patch should fix this bug as well.

Comment 6 zhenfeng wang 2015-07-16 06:45:31 UTC
Hi Peter,
I found that libvirt still waits for the command to time out when the libvirtd service is restarted after the guest agent service has become unreachable. Is this acceptable? Could you check? Thanks.

pkginfo
libvirt-1.2.17-1.el7.x86_64 

steps
1. Start a guest with the agent configured
2. Stop the guest agent service inside the guest, then execute a command that depends on qemu-guest-agent. The command reports an error right away.

guest# systemctl stop qemu-guest-agent

# time virsh domtime virt-tests-vm1
error: Guest agent is not responding: QEMU guest agent is not connected


real	0m0.016s
user	0m0.007s
sys	0m0.005s


3. Restart the libvirtd service. libvirt now waits for the command to time out.
# systemctl restart libvirtd

# time virsh domtime virt-tests-vm1
error: Guest agent is not responding: Guest agent not available for now


real	0m5.018s
user	0m0.009s
sys	0m0.002s
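
A small sketch of how a test script might tell these failure modes apart by the message virsh prints. The message strings are the ones quoted in this bug; the labels and the function name are invented for illustration.

```python
# Error messages quoted in this report, mapped to made-up labels.
AGENT_ERRORS = {
    "QEMU guest agent is not connected": "disconnected",
    "Guest agent not available for now": "timed_out",
    "QEMU guest agent is not available due to an error": "error_flag_set",
}


def classify_agent_failure(stderr):
    """Return a label for a known agent error message, else 'unknown'."""
    for message, label in AGENT_ERRORS.items():
        if message in stderr:
            return label
    return "unknown"
```

With this, the output of step 2 maps to "disconnected" and the output of step 3 maps to "timed_out".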

Comment 7 Peter Krempa 2015-07-17 13:18:57 UTC
That is a cosmetic issue since both errors in the described cases are technically correct. If you want, please open a new bug to track that sub-issue.

Comment 8 zhenfeng wang 2015-07-20 02:50:19 UTC
Thanks for the reply, Peter. I filed a separate bug 1244567 to track the sub-issue and am marking this bug verified according to comment 6.

Comment 10 errata-xmlrpc 2015-11-19 05:59:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html

