Bug 1142290

Summary: guest is stuck when setting balloon memory with large guest-stats-polling-interval
Product: Red Hat Enterprise Linux 7
Component: qemu-kvm
Version: 7.1
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Luiz Capitulino <lcapitulino>
Assignee: Luiz Capitulino <lcapitulino>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dyuan, hhuang, huding, jiahu, jmiao, juli, juzhang, knoel, lcapitulino, mzhan, rbalakri, virt-bugs, virt-maint, xfu
Target Milestone: rc
Fixed In Version: qemu-kvm-1.5.3-75.el7
Doc Type: Bug Fix
Type: Bug
Clone Of: 1140997
Bug Depends On: 1140997
Last Closed: 2015-03-05 08:12:32 UTC

Description Luiz Capitulino 2014-09-16 13:54:58 UTC
qemu-kvm is also affected by this bug, so I am cloning the BZ.

+++ This bug was initially created as a clone of Bug #1140997 +++

Description of problem:
The guest hangs when balloon memory is set while guest-stats-polling-interval is large.

Version-Release number of selected component (if applicable):
libvirt-1.2.8-1.el7.x86_64
qemu-kvm-rhev-2.1.0-3.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set the memballoon stats period to '21474836'.

# virsh edit r7a
...
    <memballoon model='virtio'>
      <stats period='21474836'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
...

2. Start the guest.
# virsh start r7a
[hung]

# ps -ef | grep qemu
qemu     28339     1 99 16:04 ?        00:00:18 /usr/libexec/qemu-kvm -name r7a -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c4fa19e8-e8c9-49ab-b6bf-0427ed4e750e -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/r7a.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2014-09-12T08:04:13 -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/r7.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on


QMP exchange between libvirt and qemu (the final 'balloon' command gets no reply):
6102.143 > 0x7fd9e8008510 {"execute":"qom-list","arguments":{"path":"//machine/i440fx/pci.0/child[9]"},"id":"libvirt-87"}
6102.144 < 0x7fd9e8008510 {"return": [{"name": "virtio-pci[0]", "type": "child<qemu:memory-region>"}, {"name": "virtio-bus", "type": "child<virtio-pci-bus>"}, {"name": "bus master[0]", "type": "child<qemu:memory-region>"}, {"name": "guest-stats-polling-interval", "type": "int"}, {"name": "guest-stats", "type": "guest statistics"}, {"name": "virtio-backend", "type": "child<virtio-balloon-device>"}, {"name": "parent_bus", "type": "link<bus>"}, {"name": "command_serr_enable", "type": "bool"}, {"name": "multifunction",
6102.144 > 0x7fd9e8008510 {"execute":"qom-set","arguments":{"path":"//machine/i440fx/pci.0/child[9]","property":"guest-stats-polling-interval","value":21474836},"id":"libvirt-88"}
6102.145 < 0x7fd9e8008510 {"return": {}, "id": "libvirt-88"}
6102.146 > 0x7fd9e8008510 {"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-89"}
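
The log above shows the `period` from step 1's `<stats period='21474836'/>` arriving unchanged as the `value` of the `qom-set` on `guest-stats-polling-interval`. A minimal Python sketch of that mapping follows. It is an illustration only, not libvirt's actual code; the function name `stats_period_to_qom_set` is hypothetical, and the QOM path and command id are copied from the log.

```python
import json
import xml.etree.ElementTree as ET

# The <memballoon> fragment from step 1 of the reproducer.
DOMAIN_FRAGMENT = """
<memballoon model='virtio'>
  <stats period='21474836'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</memballoon>
"""


def stats_period_to_qom_set(xml_text, qom_path, cmd_id):
    """Build the qom-set command for the stats period, or None if no period is set.

    Hypothetical helper: it only mirrors the shape of the command seen in the log.
    """
    balloon = ET.fromstring(xml_text)
    stats = balloon.find("stats")
    if stats is None or "period" not in stats.attrib:
        return None
    return {
        "execute": "qom-set",
        "arguments": {
            "path": qom_path,
            "property": "guest-stats-polling-interval",
            "value": int(stats.attrib["period"]),
        },
        "id": cmd_id,
    }


command = stats_period_to_qom_set(
    DOMAIN_FRAGMENT, "//machine/i440fx/pci.0/child[9]", "libvirt-88"
)
print(json.dumps(command))
```

The printed command carries the same path, property and value as the `libvirt-88` request in the log.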

--- Additional comment from Jincheng Miao on 2014-09-14 23:41:46 EDT ---

Actually, I also hit this problem using qemu-kvm-rhev alone;
the important part is setting guest-stats-polling-interval to '21474836'.

Start the guest with virtio-balloon:
# /usr/libexec/qemu-kvm -name r7a -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid c4fa19e8-e8c9-49ab-b6bf-0427ed4e750e -no-user-config -nodefaults -qmp tcp:0:5555,server,nowait -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/r7a.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2014-09-15T03:33:57 -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/r7.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0 -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on

Execute these QMP commands:
# telnet 127.0.0.1 5555
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 2}, "package": " (qemu-kvm-rhev-2.1.0-3.el7)"}, "capabilities": []}}

{"execute":"qmp_capabilities"}
{"return": {}}

{"execute":"qom-list","arguments":{"path":"//machine/i440fx/pci.0/child[9]"},"id":"libvirt-87"}
{"return": [{"name": "virtio-pci[0]", "type": "child<qemu:memory-region>"}, {"name": "virtio-bus", "type": "child<virtio-pci-bus>"}, {"name": "bus master[0]", "type": "child<qemu:memory-region>"}, {"name": "guest-stats-polling-interval", "type": "int"}, {"name": "guest-stats", "type": "guest statistics"}, {"name": "virtio-backend", "type": "child<virtio-balloon-device>"}, {"name": "parent_bus", "type": "link<bus>"}, {"name": "command_serr_enable", "type": "bool"}, {"name": "multifunction", "type": "bool"}, {"name": "rombar", "type": "uint32"}, {"name": "romfile", "type": "str"}, {"name": "addr", "type": "int32"}, {"name": "legacy-addr", "type": "str"}, {"name": "event_idx", "type": "bool"}, {"name": "indirect_desc", "type": "bool"}, {"name": "class", "type": "uint32"}, {"name": "hotplugged", "type": "bool"}, {"name": "hotpluggable", "type": "bool"}, {"name": "realized", "type": "bool"}, {"name": "type", "type": "string"}], "id": "libvirt-87"}

{"execute":"qom-set","arguments":{"path":"//machine/i440fx/pci.0/child[9]","property":"guest-stats-polling-interval","value":21474836},"id":"libvirt-88"}
{"return": {}, "id": "libvirt-88"}

{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-89"}

The 'balloon' command then never returns.

So I think this bug belongs to qemu-kvm-rhev. Could you check it again?
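
The telnet session above can be scripted so that a missing 'balloon' reply shows up as a timeout rather than a stuck terminal. The following is a rough sketch using only the Python standard library. The helper names `qmp_call` and `balloon_returns` are hypothetical, the 5-second timeout is an arbitrary choice, and the QOM path and values are copied from the session above.

```python
import json
import socket

# QOM path copied from the QMP session above.
BALLOON_PATH = "//machine/i440fx/pci.0/child[9]"


def qmp_call(sock, reader, execute, arguments=None, timeout=5.0):
    """Send one QMP command; return its reply dict, or None if none arrives in time."""
    request = {"execute": execute}
    if arguments is not None:
        request["arguments"] = arguments
    sock.settimeout(timeout)
    sock.sendall(json.dumps(request).encode() + b"\n")
    try:
        while True:
            line = reader.readline()
            if not line:            # peer closed the connection
                return None
            if not line.strip():    # ignore blank lines
                continue
            reply = json.loads(line)
            # Skip the QMP greeting and asynchronous events.
            if "return" in reply or "error" in reply:
                return reply
    except socket.timeout:
        return None


def balloon_returns(host, port, interval=21474836, target=1073741824, timeout=5.0):
    """Replay the reproducer; True if 'balloon' is answered within the timeout."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("rb")
        qmp_call(sock, reader, "qmp_capabilities", timeout=timeout)
        qmp_call(sock, reader, "qom-set", {
            "path": BALLOON_PATH,
            "property": "guest-stats-polling-interval",
            "value": interval,
        }, timeout)
        reply = qmp_call(sock, reader, "balloon", {"value": target}, timeout)
        return reply is not None
```

Against the guest started above with `-qmp tcp:0:5555,server,nowait`, `balloon_returns("127.0.0.1", 5555)` would be expected to return False on an affected build, since the 'balloon' command gets no reply.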

--- Additional comment from Luiz Capitulino on 2014-09-16 09:52:12 EDT ---

Thanks. This is an integer overflow in the virtio-balloon driver. I've posted the fix upstream and will backport it as soon as it's merged.
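
This comment does not say where the overflow occurs. As an illustration only, the sketch below assumes that the interval in seconds is scaled to milliseconds and then stored in a signed 32-bit integer; this report does not confirm that assumption. The helper names `wrap_int32` and `overflows_int32_ms` are hypothetical.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce an integer to the value a signed 32-bit variable would hold."""
    return (value + 2**31) % 2**32 - 2**31


def overflows_int32_ms(interval_seconds):
    """True if the interval, expressed in milliseconds, exceeds INT32_MAX.

    Assumption: seconds are converted to milliseconds before being stored.
    """
    return interval_seconds * 1000 > INT32_MAX


interval_seconds = 21474836              # value used in the reproducer
interval_ms = interval_seconds * 1000    # 21474836000
print(overflows_int32_ms(interval_seconds))  # True
print(wrap_int32(interval_ms))               # -480
```

Under that assumption, the reproducer's 21474836 s becomes 21474836000 ms, which does not fit in 32 bits and wraps to -480. A negative timer interval would be one plausible cause of the busy qemu process seen in the `ps` output.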

Comment 3 Miroslav Rezanina 2014-10-08 10:07:19 UTC
Fix included in qemu-kvm-1.5.3-75.el7

Comment 5 Jun Li 2014-10-15 10:27:20 UTC
Reproduce:
Version of components:
qemu-kvm-1.5.3-68.el7.x86_64

Steps:
1. Boot the guest with -S and "-device virtio-balloon-pci,id=balloon0,bus=pci.0", and leave it in the stopped state.
# /usr/libexec/qemu-kvm -m 1G -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -smp 2 -monitor stdio -spice port=5931,disable-ticketing -qmp tcp::8888,server,nowait -boot menu=on -drive file=/home/RHEL-Server-7.0-64-virtio.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=0,wwn=0x5000c50015ea71bb,drive=drive-virtio-disk,id=virtio-disk,bootindex=1 -netdev tap,id=tap0,script=/etc/qemu-ifup,vhost=on -device virtio-net-pci,netdev=tap0,id=net0,mac=24:be:05:14:11:11,mq=on -drive file=/home/kernel-3.10.0-184.el7.iso,if=none,id=hd,format=raw,media=cdrom,readonly=on,cache=none,werror=stop,rerror=stop -device ide-drive,bus=ide.0,unit=1,wwn=0x5000c50015ea71ad,drive=hd,id=cdrom \
-device virtio-balloon-pci,id=balloon0,bus=pci.0 -S

2. Run the following commands over QMP.
$ telnet 10.66.82.225 8888
Trying 10.66.82.225...
Connected to 10.66.82.225.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 3, "minor": 5, "major": 1}, "package": " (qemu-kvm-1.5.3-68.el7)"}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}

{"execute":"qom-list","arguments":{"path":"//machine/i440fx/pci.0/child[6]"},"id":"libvirt-87"}

{"execute":"qom-set","arguments":{"path":"//machine/i440fx/pci.0/child[6]","property":"guest-stats-polling-interval","value":21474836},"id":"libvirt-88"}
{"return": {}, "id": "libvirt-88"}

{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-89"}

Results:
After step 2, qemu-kvm hangs.

As shown above, this BZ has been reproduced.
===================
Verify:
Version of components:
qemu-kvm-1.5.3-75.el7.x86_64

Same steps as above. After step 2:
{"execute":"qom-set","arguments":{"path":"//machine/i440fx/pci.0/child[6]","property":"guest-stats-polling-interval","value":21474836},"id":"libvirt-88"}
{"return": {}, "id": "libvirt-88"}

{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-89"}
{"return": {}, "id": "libvirt-89"}

qemu-kvm works correctly, so this BZ has been verified.
===================
Also verified on qemu-kvm-rhev-2.1.2-3.el7.x86_64: with the same steps, qemu-kvm and QMP keep working after step 2. QMP output:
# telnet 10.66.8.240 8888
Trying 10.66.8.240...
Connected to 10.66.8.240.
Escape character is '^]'.
{"QMP": {"version": {"qemu": {"micro": 2, "minor": 1, "major": 2}, "package": " (qemu-kvm-rhev-2.1.2-3.el7)"}, "capabilities": []}}
{"execute":"qmp_capabilities"} 
{"return": {}}

{"execute":"qom-set","arguments":{"path":"//machine/i440fx/pci.0/child[6]","property":"guest-stats-polling-interval","value":21474836},"id":"libvirt-88"}
{"return": {}, "id": "libvirt-88"}
{"execute":"balloon","arguments":{"value":1073741824},"id":"libvirt-89"}
{"return": {}, "id": "libvirt-89"}

Comment 8 errata-xmlrpc 2015-03-05 08:12:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0349.html