Bug 741217

Summary: KVM guest install fails with libvirtError: cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer
Product: Red Hat Enterprise Linux 6 Reporter: Kristijan <kbesu>
Component: libvirt Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.1 CC: acathrow, dallan, jdenemar, jeder, jwest, linuxdev-ofuku, mjenner, mmahudha, moshiro, msvoboda, myamazak, mzhan, rwu, weizhan
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-0.9.4-16.el6 Doc Type: Bug Fix
Doc Text:
If the QEMU driver failed to update information about currently allocated memory, installing a new virtual machine could have failed with the following error message:

     ERROR    cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

With this update, the driver has been modified to not consider this behavior as fatal. Installation now proceeds and finishes as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 11:34:32 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 708393, 747120, 747358    
Attachments:
Description Flags
file none

Description Kristijan 2011-09-26 09:57:10 UTC
Description of problem:

Installation of a Windows guest failed.

I tried to install a Windows 2003 virtual machine and got the following error from "virt-install":

/usr/bin/virt-install --cdrom=/WINDOWS2003SP2_KVM_E08.00.00.01.iso -r 1000 -n csms02.zone31 --network bridge=br0,model=virtio --disk path=/dev/VX/csms02.zone31hddC,cache=writeback,bus=virtio --disk path=/dev/VX/csms02.zone31_setup,cache=writeback,bus=virtio --os-variant=win2k3 --vcpus=1,maxvcpus=1 --cpuset=7 --cpu=host --noautoconsole --graphics vnc

Starting install...
Creating domain...                                                                                                    |    0 B     00:00
Domain installation still in progress. Waiting for installation to complete.
Domain has shutdown. Continuing.
Starting domain...                                                                                                    |    0 B     00:00
Domain installation still in progress. Waiting for installation to complete.
ERROR    cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer
Domain installation does not appear to have been successful.
If it was, you can restart your domain by running:
  virsh --connect qemu:///system start csms02.zone31
otherwise, please restart your installation.
ERROR: virt-install  \
                          --cdrom=/WINDOWS2003SP2_KVM_E08.00.00.01.iso \
                          -r 1000  \
                          -n csms02.zone31 \
                           --network bridge=br0,model=virtio   \
                          --disk path=/dev/VX/csms02.zone31hddC,cache=writeback,bus=virtio --disk path=/dev/VX/csms02.zone31_setup,cache=writeback,bus=virtio   \
                          --os-variant=win2k3 \
                          --vcpus=1,maxvcpus=1  \
                          --cpuset=7\
                          --cpu=host \
                          --noautoconsole \
                          --graphics vnc failed for csms02.zone31.


Version-Release number of selected component (if applicable):

libvirt-client-0.8.7-18.el6.x86_64
Linux 2.6.32-131.6.1.el6.x86_64 #1 SMP Mon Jun 20 14:15:38 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Execute the following command:

/usr/bin/virt-install --cdrom=/WINDOWS2003SP2_KVM_E08.00.00.01.iso -r 1000 -n csms02.zone31 --network bridge=br0,model=virtio --disk path=/dev/VX/csms02.zone31hddC,cache=writeback,bus=virtio --disk path=/dev/VX/csms02.zone31_setup,cache=writeback,bus=virtio --os-variant=win2k3 --vcpus=1,maxvcpus=1 --cpuset=7 --cpu=host --noautoconsole --graphics vnc

  
Actual results:

Installation of Windows 2003 guest fails. 

Expected results:

Installation of Windows 2003 guest should pass without problems.


Additional info:

1. We first thought the issue was related to memory. The customer tried stopping some guests running on the system and clearing the caches:

echo 1 > /proc/sys/vm/drop_caches

This did not help.

2. The customer has seen the same issue when installing Linux guests (however, only 2 times).
   They haven't tried installing with virt-manager/virt-viewer because they are preparing an unattended (silent) installation of Windows, so they need to rely on "virt-install" (using a GUI to install guests is not an option in their situation).

3. The Windows ISO they use for installing Windows guests already includes the "virtio" drivers for both HDD and network (the drivers were taken from /usr/share/virtio-win/virtio-win-1.2.0.iso, delivered with virtio-win-1.2.0-1.el6.noarch.rpm). Even when they installed a Windows guest without virtio support (bus=ide for HDD, type=e1000 or "rtl8139" for network devices), the installation crashed after several iterations (please see examples below).

4. After the installation of the Windows machine (which crashed with the '{"execute":"query-balloon"}' message), the directory /var/lib/libvirt/qemu/save is empty, so there was no .save file to remove.

5. The following change was made to the customer's /etc/libvirt/qemu.conf.

Locate this in the conf file:
#cgroup_device_acl = [
#    "/dev/null", "/dev/full", "/dev/zero",
#    "/dev/random", "/dev/urandom",
#    "/dev/ptmx", "/dev/kvm", "/dev/kqemu",
#    "/dev/rtc", "/dev/hpet", "/dev/net/tun",
#]

Then insert the following line:

cgroup_controllers = [ ]

Afterwards, restart the libvirtd service and try the install once more.
This didn't help either.

6. The command specifies --os-variant but not --os-type. The virt-install man page says --os-type is not required, but perhaps it is. Adding it also didn't help.

Comment 2 Jiri Denemark 2011-09-26 10:16:31 UTC
The error

    ERROR    cannot send monitor command '{"execute":"query-balloon"}': 
    Connection reset by peer

most likely means the qemu-kvm process crashed. Can you send us the /var/log/libvirt/qemu/csms02.zone31.log file, which contains the output of the qemu-kvm process started for domain "csms02.zone31"?

Comment 3 Kristijan 2011-09-26 10:28:22 UTC
Created attachment 524879 [details]
file

Comment 4 Kristijan 2011-09-26 10:29:40 UTC
Please find the csms02.zone31.log file attached.

Comment 5 Jiri Denemark 2011-09-28 16:51:54 UTC
That's interesting: there is nothing about any kind of failure in the qemu log. The guest just starts and, after a few minutes (no more than 4, it seems), shuts down normally.

Also from virt-install output

Starting install...
Creating domain...                                                                                                    |    0 B     00:00
Domain installation still in progress. Waiting for installation to complete.
Domain has shutdown. Continuing.
Starting domain...                                                                                                    |    0 B     00:00
Domain installation still in progress. Waiting for installation to complete.
ERROR    cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

it seems the domain installation was started, virt-install was waiting for reboot and wanted to start the installed domain, which failed. It looks like the "Connection reset by peer" may be caused by qemu shutting down early (and not crashing as I had thought) perhaps because no bootable disk was found. Do you have any idea what happened inside the guest OS between "Domain installation still in progress." and "Domain has shutdown"? Perhaps something during guest OS installation failed and it was automatically rebooted (from inside, by the install process), which made virt-install think installation is complete. Can you check that?

BTW, I'm a bit confused with the second occurrence of
                                      |    0 B     00:00
Domain installation still in progress. Waiting for installation to complete.

after "Starting domain...". Was it really there, or is that just a copy&paste mistake?

Anyway, can you reproduce the issue with libvirt debug logs enabled and send us the libvirtd.log file? To enable libvirt debug logs, set
    log_filters="1:qemu"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"
in /etc/libvirt/libvirtd.conf and restart the libvirtd service.

Comment 10 Jiri Denemark 2011-10-05 14:32:16 UTC
Ok, I accidentally reproduced a similar issue with virt-manager and understood what's going on here. The monitor error doesn't come from starting an already installed domain. It happens at the end of installation, but not as a result of a qemu crash. During the installation phase, virt-install polls every two seconds for domain shutdown using virDomainGetInfo, which tries to update balloon info. If virDomainGetInfo is called just at the moment when the domain is shutting down, the query-balloon monitor command may fail because qemu just exits instead of giving us any reply. Virt-install doesn't tolerate the error coming from virDomainGetInfo and aborts. A similar thing happens to virt-manager.
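
For illustration only, here is a minimal Python sketch of a client-side polling loop that tolerates this race. It is not virt-install's actual code. The function name wait_for_shutdown, the max_errors limit and the injectable sleep parameter are assumptions made for the example; the only libvirt API it relies on is virDomain.info() from the Python bindings, whose first element is the domain state.

```python
import time

try:
    import libvirt  # Python bindings; optional for this sketch
    LibvirtError = libvirt.libvirtError
except ImportError:
    class LibvirtError(Exception):
        """Stand-in used when the libvirt bindings are not installed."""

VIR_DOMAIN_SHUTOFF = 5  # numeric value of libvirt.VIR_DOMAIN_SHUTOFF


def wait_for_shutdown(dom, interval=2.0, max_errors=3, sleep=time.sleep):
    """Poll dom.info() until the domain reports the shut-off state.

    A few consecutive libvirt errors are tolerated, because a query that
    races with qemu exiting may fail even though nothing is wrong.
    max_errors is an assumed limit, not something taken from virt-install.
    """
    errors = 0
    while True:
        try:
            state = dom.info()[0]  # info() -> [state, maxMem, memory, nrVirtCpu, cpuTime]
        except LibvirtError:
            errors += 1
            if errors > max_errors:
                raise  # the failure persists, so report it
            sleep(interval)
            continue
        errors = 0
        if state == VIR_DOMAIN_SHUTOFF:
            return
        sleep(interval)
```

With a real connection this would be called as wait_for_shutdown(conn.lookupByName("csms02.zone31")), where conn comes from libvirt.open("qemu:///system").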

Comment 11 Jiri Denemark 2011-10-05 14:46:08 UTC
This is now fixed upstream by

commit fcd2bd55d714b8e939c5bfd44fd2ddc32a09c141
Author: Jiri Denemark <jdenemar>
Date:   Wed Oct 5 16:07:36 2011 +0200

    qemu: Don't fail virDomainGetInfo if we can't update balloon info
    
    Qemu driver tries to update balloon data in virDomainGetInfo and if it
    can't do so because there is another monitor job running, it just
    reports what's known in domain def. However, if there was no job running
    but getting the data from qemu fails, we would fail the whole API. This
    doesn't make sense. Let's make the failure nonfatal.
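
The behavior described in the commit message can be modeled roughly as follows. This is a simplified Python model, not the C code from the commit: the function report_memory, its parameters and the use of OSError to represent a failed monitor query are all assumptions made for the illustration.

```python
def report_memory(def_memory, job_available, query_balloon):
    """Return the current-memory value a GetInfo-style call would report.

    def_memory    -- value already known from the domain definition
    job_available -- False if another monitor job is running
    query_balloon -- callable that asks qemu for the balloon size and
                     raises OSError if the monitor connection fails
    """
    if not job_available:
        # Another job holds the monitor: report what is already known.
        return def_memory
    try:
        return query_balloon()
    except OSError:
        # Before the fix this failure aborted the whole API call.
        # After the fix it is nonfatal and the known value is reported.
        return def_memory
```

In this model a query that fails with ConnectionResetError (a subclass of OSError) yields the same result as the case where a monitor job is already running.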

Comment 18 weizhang 2011-10-08 08:13:05 UTC
Verification passed on
kernel-2.6.32-206.el6.x86_64
qemu-kvm-0.12.1.2-2.195.el6.x86_64
libvirt-0.9.4-16.el6.x86_64

Following comment 10, the issue can be reproduced on RHEL 6.1 with these steps:
1. start a win2003-x64 guest
2. on console 1, do 
# while true; do virsh dominfo win2003-x64 >> /tmp/dominfo_log ; done
3. on console 2, do
# virsh shutdown win2003-x64

At the end of guest shutdown, an error line is printed on console 1:
error: cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

With libvirt-0.9.4-16.el6.x86_64 there is no error, so verification passes.
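
The check on console 1 could also be scripted. The sketch below is only an illustration of the same idea: it runs "virsh dominfo" a bounded number of times instead of looping forever and collects any monitor errors printed on stderr. The function name collect_monitor_errors and the injectable run parameter are assumptions made for the example.

```python
import subprocess

MONITOR_ERROR = "cannot send monitor command"


def collect_monitor_errors(domain, iterations, run=subprocess.run):
    """Run `virsh dominfo DOMAIN` repeatedly and return the matching error lines.

    The `run` parameter exists only so the function can be exercised
    without a real virsh binary; by default it is subprocess.run.
    """
    hits = []
    for _ in range(iterations):
        result = run(["virsh", "dominfo", domain],
                     capture_output=True, text=True)
        # virsh prints failures on stderr, one "error: ..." line each.
        hits.extend(line for line in result.stderr.splitlines()
                    if MONITOR_ERROR in line)
    return hits
```

Running collect_monitor_errors("win2003-x64", 1000) on console 1 while the guest is shut down from console 2 would be expected to return a non-empty list on an affected build and an empty list on a fixed one, although hitting the race is timing dependent.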

Comment 22 Miroslav Svoboda 2011-10-20 13:53:12 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
If the QEMU driver failed to update information about currently allocated memory, installing a new virtual machine could have failed with the following error message:

     ERROR    cannot send monitor command '{"execute":"query-balloon"}':
     Connection reset by peer

 With this update, the driver has been modified to not consider this behavior as fatal. Installation now proceeds and finishes as expected.

Comment 23 Jiri Denemark 2011-10-21 10:32:13 UTC
*** Bug 747555 has been marked as a duplicate of this bug. ***

Comment 24 errata-xmlrpc 2011-12-06 11:34:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html