Red Hat Bugzilla – Bug 868575
libvirt is often failing to show qemu stderr/stdout when startup fails
Last modified: 2013-10-31 16:39:01 EDT
One of the most common classes of bugs we get is errors launching qemu, and at least 95% of the time the only useful info is qemu's stderr/stdout. However, we aren't doing a good job of getting that info to the user right now. Detail-less errors like 'handshake failed' or 'couldn't connect to monitor' suck for users and libvirt devs alike.
There are 2 problems:
1) The conditions in qemuWaitForMonitor for scraping the logs don't always trigger.
I can consistently reproduce an issue here on libvirt.git, by sticking <readonly/> in an IDE disk block, which correctly causes qemu to bail out. When we reach the kill(2) check in the cleanup: block, the guest is still running, so we don't scrape the log output.
A possible solution would be a short wait loop that gives the VM time to exit (we used to do this, but I'm not sure what happened to it). I'm sure there's a more c
2) Anything that errors between qemuProcessStart:virCommandRun and qemuProcessStart:qemuProcessWaitForMonitor won't report log output.
I was consistently seeing an issue here when hitting #809910, but it can be artificially reproduced quite easily by using an XML config that upsets qemu and sticking a sleep before each function call after virCommandRun.
All the bits here that can error depending on the qemu process state (particularly the virCommandHandshake bit) need to show log output when they fail.
(That's what the long-dead '#if 0' block in the code was trying to accomplish, but it was turned off without an alternative being provided.)
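To make the two points above concrete, here is a rough sketch of the general idea: give the process a short window to exit, then pull whatever it wrote to its log since launch so it can be attached to the error. This is not libvirt code. The helper names wait_for_exit and read_log_from, the 10 ms poll step, and the idea of remembering a log offset before launch are all assumptions made for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not a libvirt function): poll until `pid` is gone
 * or roughly `timeout_ms` milliseconds have passed.
 * Returns 1 if the process no longer exists, 0 if it is still around. */
int wait_for_exit(pid_t pid, int timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited_ms = 0;

    for (;;) {
        /* Signal 0 only checks for existence; nothing is delivered.
         * EPERM means the process exists, so only ESRCH counts as gone. */
        if (kill(pid, 0) < 0 && errno == ESRCH)
            return 1;
        if (waited_ms >= timeout_ms)
            return 0;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
}

/* Hypothetical helper: copy what was written to the log at `path` after
 * byte `offset` into `buf`, NUL-terminated and truncated to fit.
 * Returns the number of bytes copied, or -1 on error. */
long read_log_from(const char *path, long offset, char *buf, size_t buflen)
{
    FILE *fp;
    size_t n;

    if (buflen == 0 || (fp = fopen(path, "r")) == NULL)
        return -1;
    if (fseek(fp, offset, SEEK_SET) < 0) {
        fclose(fp);
        return -1;
    }
    n = fread(buf, 1, buflen - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return (long)n;
}
```

The intent would be for every failure path after the process is spawned to go through the same two steps: call wait_for_exit with a small timeout, then call read_log_from with the offset recorded before launch and append the result to the reported error. Note that a kill(pid, 0) check keeps succeeding for an unreaped direct child, so this only fits a daemonized process or one whose exit status is collected elsewhere.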
This isn't specific to F18 but it would be nice if we could get fixes queued here before the testday.
dallan, can this be prioritized? Pretty much every failure I'm hitting on F19 gives me the useless startup error 'connection reset by peer'.
*** Bug 922425 has been marked as a duplicate of this bug. ***
This should already be fixed in F20, so closing