Description of problem:
When guest-ping is issued as quickly as possible to one VM or to many VMs (20+ in my test), a VM reports the error "QEMU guest agent is not available due to an error" after a few minutes. Restarting libvirtd fixes the issue, after which the guest-ping agent command works correctly again.

Version-Release number of selected component (if applicable):
The upstream master branch with top commit id 22a92eb0a79f398c88bec4bb5e59804029ad7ca0.

How reproducible:

Steps to Reproduce:
Here is my reproduce script:
https://github.com/aspirer/study/blob/master/qemu-guest-agent/test2.py
You can shorten the sleep time at line 63 to speed up reproduction:
https://github.com/aspirer/study/blob/master/qemu-guest-agent/test2.py#L63

Actual results:
A VM reports:
libvirt: QEMU Driver error : Guest agent is not responding: QEMU guest agent is not available due to an error

Expected results:
All guest-ping commands to all VMs complete correctly (returning the JSON data '{"return":{}}').

Additional info:
qemuAgentIOProcessLine returns -1, which propagates up through qemuAgentIOProcessData and qemuAgentIOProcess to qemuAgentIO. The error-check code in qemuAgentIO shown below then calls the callback function qemuProcessHandleAgentError, which marks the guest agent as errored, and it cannot be used again until libvirtd is restarted:

    } else if (error) {
        void (*errorNotify)(qemuAgentPtr, virDomainObjPtr)
            = mon->cb->errorNotify;
        virDomainObjPtr vm = mon->vm;

        /* Make sure anyone waiting wakes up now */
        virCondSignal(&mon->notify);
        virObjectUnlock(mon);
        virObjectUnref(mon);
        VIR_DEBUG("Triggering error callback");
        (errorNotify)(mon, vm);

Breakpoint 2, qemuAgentIOProcessLine (mon=0x7f17d8000e00, line=0x7f182e6958c0 "{\"return\": {}}", msg=0x0) at qemu/qemu_agent.c:341
341             if (virJSONValueObjectGetNumberUlong(obj, "return", &id) == 0) {
(gdb) bt
#0  qemuAgentIOProcessLine (mon=0x7f17d8000e00, line=0x7f182e6958c0 "{\"return\": {}}", msg=0x0) at qemu/qemu_agent.c:341
#1  0x00007f1822ab76b4 in qemuAgentIOProcessData (mon=0x7f17d8000e00, data=0x7f182e6958c0 "{\"return\": {}}", len=15, msg=0x0) at qemu/qemu_agent.c:384
#2  0x00007f1822ab77ca in qemuAgentIOProcess (mon=0x7f17d8000e00) at qemu/qemu_agent.c:426
#3  0x00007f1822ab7fe0 in qemuAgentIO (watch=63, fd=42, events=0, opaque=0x7f17d8000e00) at qemu/qemu_agent.c:626
#4  0x00007f182b738bed in virEventPollDispatchHandles (nfds=53, fds=0x7f182e695670) at util/vireventpoll.c:510
#5  0x00007f182b739424 in virEventPollRunOnce () at util/vireventpoll.c:660
#6  0x00007f182b737324 in virEventRunDefaultImpl () at util/virevent.c:308
#7  0x00007f182ceb7440 in virNetServerRun (srv=0x7f182e68c2b0) at rpc/virnetserver.c:1139
#8  0x00007f182ce6bb24 in main (argc=2, argv=0x7fff3bc9cb68) at libvirtd.c:1536
(gdb) l
336              * which is now processing our previous
337              * guest-sync commands. Check if this is
338              * the case and don't report an error but
339              * return silently.
340              */
341             if (virJSONValueObjectGetNumberUlong(obj, "return", &id) == 0) {
342                 VIR_DEBUG("Ignoring delayed reply to guest-sync: %llu", id);
343                 ret = 0;
344                 goto cleanup;
345             }
(gdb) p virJSONValueObjectGetNumberUlong(obj, "return", &id)
$1 = -1
(gdb) p id
$2 = 5
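The gdb session above suggests why the reply is rejected: the line arrives with msg=0x0, so no command appears to be waiting for it, and the numeric lookup on "return" fails because the value is an object ({}), not a number. Below is a minimal Python sketch of that branch, assuming only the behaviour visible in the excerpt. The function name process_unsolicited_reply and its 0/-1 return codes are illustrative stand-ins, not libvirt's actual code.

```python
import json


def process_unsolicited_reply(line):
    """Model of the msg == NULL branch seen in the gdb listing (hypothetical helper).

    Returns 0 when the reply looks like a delayed guest-sync answer
    (a numeric "return" value) and -1 otherwise, mirroring the values
    observed in the debugger.
    """
    obj = json.loads(line)
    value = obj.get("return")
    # A delayed guest-sync reply carries a non-negative integer id.
    # bool is excluded explicitly because it is a subclass of int in Python.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return 0
    # Anything else, including the guest-ping reply {"return": {}},
    # is treated as an error in this model.
    return -1


if __name__ == "__main__":
    print(process_unsolicited_reply('{"return": 1234}'))
    print(process_unsolicited_reply('{"return": {}}'))
```

In this model a late guest-ping reply is indistinguishable from a malformed message, which matches the -1 seen at qemu/qemu_agent.c:341 when no caller is waiting for the reply.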
*** This bug has been marked as a duplicate of bug 1090551 ***