Bug 1090550

Summary: QEMU guest agent is not available due to an error
Product: [Community] Virtualization Tools Reporter: Wangpan <hzwangpan>
Component: libvirt Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED DUPLICATE QA Contact:
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: unspecified CC: acathrow, pkrempa
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-04-23 15:21:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Wangpan 2014-04-23 15:02:03 UTC
Description of problem:
Sending guest-ping as quickly as possible to one VM or to many VMs (20+ in my test) causes a VM to report the error "QEMU guest agent is not available due to an error" after a few minutes. Restarting libvirtd fixes the issue, after which the guest-ping agent command completes correctly again.

Version-Release number of selected component (if applicable):
The upstream master branch, with top commit ID 22a92eb0a79f398c88bec4bb5e59804029ad7ca0.

How reproducible:


Steps to Reproduce:
Here is my reproduction script:
https://github.com/aspirer/study/blob/master/qemu-guest-agent/test2.py
You can shorten the sleep time at line 63 to speed up reproduction:
https://github.com/aspirer/study/blob/master/qemu-guest-agent/test2.py#L63
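
For readers who cannot open the script, the general shape of such a ping loop can be sketched as follows. This is a minimal sketch and not the contents of test2.py: the function name `ping_until_failure` and the `send` callable are hypothetical, and `send(vm_name, command)` is assumed to return the agent's raw JSON reply as a string or raise on failure.

```python
import json
import time

# The guest agent command being hammered in this report.
GUEST_PING = json.dumps({"execute": "guest-ping"})


def ping_until_failure(send, vm_names, rounds, delay=0.0):
    """Ping every VM in turn and stop at the first failure.

    `send` is a hypothetical callable (assumption): it takes a VM name and
    a JSON command string and returns the raw JSON reply, or raises.
    Returns None if every ping succeeded, otherwise a tuple
    (round_number, vm_name, detail) describing the first failure.
    """
    for round_no in range(rounds):
        for name in vm_names:
            try:
                reply = send(name, GUEST_PING)
            except Exception as exc:
                # e.g. "Guest agent is not responding: ..."
                return (round_no, name, str(exc))
            # A successful guest-ping carries an empty object in "return".
            if json.loads(reply) != {"return": {}}:
                return (round_no, name, reply)
        # A shorter delay should make the problem appear sooner.
        time.sleep(delay)
    return None
```

With the libvirt Python bindings, `send` could plausibly wrap `libvirt_qemu.qemuAgentCommand(dom, command, timeout, 0)`; that wiring is an assumption here and is left out so the sketch runs without a hypervisor.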

Actual results:
A VM reports: libvirt: QEMU Driver error : Guest agent is not responding: QEMU guest agent is not available due to an error

Expected results:
All guest-ping commands to all VMs complete correctly (returning the JSON data '{"return":{}}').

Additional info:
qemuAgentIOProcessLine returns -1, which propagates up through qemuAgentIOProcessData and qemuAgentIOProcess to qemuAgentIO. The error-handling code in qemuAgentIO, shown below, then calls the callback function qemuProcessHandleAgentError. The guest agent is marked as errored and cannot be used until libvirtd is restarted:
} else if (error) {
        void (*errorNotify)(qemuAgentPtr, virDomainObjPtr)
            = mon->cb->errorNotify;
        virDomainObjPtr vm = mon->vm;

        /* Make sure anyone waiting wakes up now */
        virCondSignal(&mon->notify);
        virObjectUnlock(mon);
        virObjectUnref(mon);
        VIR_DEBUG("Triggering error callback");
        (errorNotify)(mon, vm);

Breakpoint 2, qemuAgentIOProcessLine (mon=0x7f17d8000e00, line=0x7f182e6958c0 "{\"return\": {}}", msg=0x0) at qemu/qemu_agent.c:341
341                 if (virJSONValueObjectGetNumberUlong(obj, "return", &id) == 0) {
(gdb) bt
#0  qemuAgentIOProcessLine (mon=0x7f17d8000e00, line=0x7f182e6958c0 "{\"return\": {}}", msg=0x0) at qemu/qemu_agent.c:341
#1  0x00007f1822ab76b4 in qemuAgentIOProcessData (mon=0x7f17d8000e00, data=0x7f182e6958c0 "{\"return\": {}}", len=15, msg=0x0) at qemu/qemu_agent.c:384
#2  0x00007f1822ab77ca in qemuAgentIOProcess (mon=0x7f17d8000e00) at qemu/qemu_agent.c:426
#3  0x00007f1822ab7fe0 in qemuAgentIO (watch=63, fd=42, events=0, opaque=0x7f17d8000e00) at qemu/qemu_agent.c:626
#4  0x00007f182b738bed in virEventPollDispatchHandles (nfds=53, fds=0x7f182e695670) at util/vireventpoll.c:510
#5  0x00007f182b739424 in virEventPollRunOnce () at util/vireventpoll.c:660
#6  0x00007f182b737324 in virEventRunDefaultImpl () at util/virevent.c:308
#7  0x00007f182ceb7440 in virNetServerRun (srv=0x7f182e68c2b0) at rpc/virnetserver.c:1139
#8  0x00007f182ce6bb24 in main (argc=2, argv=0x7fff3bc9cb68) at libvirtd.c:1536

(gdb) l
336                  * which is now processing our previous
337                  * guest-sync commands. Check if this is
338                  * the case and don't report an error but
339                  * return silently.
340                  */
341                 if (virJSONValueObjectGetNumberUlong(obj, "return", &id) == 0) {
342                     VIR_DEBUG("Ignoring delayed reply to guest-sync: %llu", id);
343                     ret = 0;
344                     goto cleanup;
345                 }
(gdb) p virJSONValueObjectGetNumberUlong(obj, "return", &id)
$1 = -1
(gdb) p id
$2 = 5
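
The check at line 341 can be modelled outside libvirt to show why a late guest-ping reply is not treated like a late guest-sync reply. The following is a simplified Python sketch, not libvirt code: the function name `classify_unsolicited_reply` is hypothetical, and the assumption (based on the report above) is that any reply arriving with no pending message (msg=0x0) whose "return" value is not a number ends in the -1 error path.

```python
import json


def classify_unsolicited_reply(line):
    """Model the handling of a reply that arrives with no pending message.

    Returns 0 when the reply looks like a delayed guest-sync answer
    (a numeric "return" value) and is silently ignored, and -1 otherwise,
    mirroring the error return described in this report.
    """
    obj = json.loads(line)
    if not isinstance(obj, dict):
        return -1
    value = obj.get("return")
    # Stand-in for virJSONValueObjectGetNumberUlong(obj, "return", &id) == 0:
    # only a non-negative integer counts (bool is excluded explicitly,
    # since Python treats it as an int subclass).
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return 0
    # guest-ping replies carry an empty object, so they end up here.
    return -1
```

In this model `{"return": 1234}` is ignored quietly, while `{"return": {}}`, the reply seen in the backtrace, is classified as an error, which matches the -1 printed by gdb for the number lookup.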

Comment 1 Peter Krempa 2014-04-23 15:21:30 UTC

*** This bug has been marked as a duplicate of bug 1090551 ***