Bug 741217
| Summary: | KVM guest install fails with libvirtError: cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Kristijan <kbesu> |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.1 | CC: | acathrow, dallan, jdenemar, jeder, jwest, linuxdev-ofuku, mjenner, mmahudha, moshiro, msvoboda, myamazak, mzhan, rwu, weizhan |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-0.9.4-16.el6 | Doc Type: | Bug Fix |
| Doc Text: | If the QEMU driver failed to update information about currently allocated memory, installing a new virtual machine could have failed with the following error message: ERROR cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer. With this update, the driver has been modified to not consider this behavior as fatal. Installation now proceeds and finishes as expected. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-12-06 11:34:32 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 708393, 747120, 747358 | | |
| Attachments: | | | |

Description
Kristijan
2011-09-26 09:57:10 UTC
The error

ERROR cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

most likely means qemu-kvm process crashed. Can you send us /var/log/libvirt/qemu/csms02.zone31.log file, which contains the output of qemu-kvm process started for domain "csms02.zone31"?

Created attachment 524879
file

Please find the csms02.zone31.log file in the attachment.

That's interesting, there is nothing about any kind of failure in the qemu log. It just starts and after a few minutes (looks like not more than 4) it shuts down normally. Also, from the virt-install output

Starting install...
Creating domain... | 0 B 00:00
Domain installation still in progress. Waiting for installation to complete.
Domain has shutdown. Continuing.
Starting domain... | 0 B 00:00
Domain installation still in progress. Waiting for installation to complete.
ERROR cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

it seems the domain installation was started, virt-install was waiting for reboot and wanted to start the installed domain, which failed. It looks like the "Connection reset by peer" may be caused by qemu shutting down early (and not crashing as I had thought), perhaps because no bootable disk was found. Do you have any idea what happened inside the guest OS between "Domain installation still in progress." and "Domain has shutdown"? Perhaps something failed during guest OS installation and the guest was automatically rebooted (from inside, by the install process), which made virt-install think the installation was complete. Can you check that?

BTW, I'm a bit confused by the second occurrence of

| 0 B 00:00
Domain installation still in progress. Waiting for installation to complete.

after "Starting domain...". Was it really there, or is that just a copy&paste mistake?

Anyway, can you reproduce the issue with libvirt debug logs enabled and send us the libvirtd.log file? To enable libvirt debug logs, set

log_filters="1:qemu"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

in /etc/libvirt/libvirtd.conf and restart the libvirtd service.

Ok, I accidentally reproduced a similar issue with virt-manager and understood what's going on here. The monitor error doesn't come from starting an already installed domain. It happens at the end of installation, but not as a result of a qemu crash.
During the installation phase, virt-install polls every two seconds for domain shutdown using virDomainGetInfo, which tries to update balloon info. If virDomainGetInfo is called just at the moment when the domain is shutting down, the query-balloon monitor command may fail because qemu just exits instead of giving us any reply. Virt-install doesn't tolerate the error coming from virDomainGetInfo and aborts. A similar thing happens to virt-manager.

This is now fixed upstream by

commit fcd2bd55d714b8e939c5bfd44fd2ddc32a09c141
Author: Jiri Denemark <jdenemar>
Date:   Wed Oct 5 16:07:36 2011 +0200

    qemu: Don't fail virDomainGetInfo if we can't update balloon info

    Qemu driver tries to update balloon data in virDomainGetInfo and if it can't do so because there is another monitor job running, it just reports what's known in domain def. However, if there was no job running but getting the data from qemu fails, we would fail the whole API. This doesn't make sense. Let's make the failure nonfatal.

verify pass on
kernel-2.6.32-206.el6.x86_64
qemu-kvm-0.12.1.2-2.195.el6.x86_64
libvirt-0.9.4-16.el6.x86_64

According to comment 10, reproduce on rhel6.1 with steps
1. start a win2003-x64 guest
2. on console 1, do
# while true; do virsh dominfo win2008r2-x64 >> /tmp/dominfo_log ; done
3. on console 2, do
# virsh shutdown win2003-x64

at the end of guest shutdown, on console 1, there will be an error line output:
error: cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

on libvirt-0.9.4-16.el6.x86_64, there is no error, so verify pass

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
If the QEMU driver failed to update information about currently allocated memory, installing a new virtual machine could have failed with the following error message:

ERROR cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

With this update, the driver has been modified to not consider this behavior as fatal. Installation now proceeds and finishes as expected.

*** Bug 747555 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html
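
For readers who want to see the behavior change in isolation, here is a minimal Python sketch of the idea behind the fix. It is not the libvirt code (the real change is in the C QEMU driver) and it does not use the libvirt bindings. `FakeMonitor`, `MonitorError`, `get_info_strict` and `get_info_tolerant` are hypothetical names made up for this illustration, and the fallback to the value from the domain definition is an assumption based on what the commit message says about the busy-monitor case.

```python
class MonitorError(Exception):
    """Hypothetical stand-in for the 'Connection reset by peer' failure."""


class FakeMonitor:
    """Hypothetical monitor: answers query-balloon until qemu has exited."""

    def __init__(self, balloon_kib):
        self.balloon_kib = balloon_kib
        self.alive = True

    def query_balloon(self):
        if not self.alive:
            # qemu exited without replying, as happens during guest shutdown
            raise MonitorError(
                "cannot send monitor command "
                "'{\"execute\":\"query-balloon\"}': Connection reset by peer"
            )
        return self.balloon_kib


def get_info_strict(defined_kib, monitor):
    """Behavior before the fix: a monitor failure fails the whole call."""
    return {"memory_kib": monitor.query_balloon()}


def get_info_tolerant(defined_kib, monitor):
    """Behavior after the fix (assumed): report the last known value."""
    try:
        memory_kib = monitor.query_balloon()
    except MonitorError:
        # Non-fatal: fall back to what the domain definition already says.
        memory_kib = defined_kib
    return {"memory_kib": memory_kib}
```

With the tolerant variant, a caller that polls for shutdown (as virt-install does with virDomainGetInfo) still gets an answer when the poll races with the guest exiting, instead of an exception that aborts the installation.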