Bug 1469981

Summary: [qmp] Decrease guest's balloon value, and then shutdown guest via QMP, but QMP cannot quit normally
Product: Red Hat Enterprise Linux 7 Reporter: yilzhang
Component: qemu-kvm Assignee: David Gibson <dgibson>
Status: CLOSED DUPLICATE QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.4-Alt CC: dgibson, knoel, qzhang, rbalakri, virt-maint, xuma, yilzhang
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-07-14 06:12:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description yilzhang 2017-07-12 08:10:00 UTC
Description of problem:
After decreasing the guest's balloon value and then shutting the guest down with the QMP "system_powerdown" command, QMP still has not quit after a very long time.
x86 and Power8 don't have this issue.


Version-Release number of selected component (if applicable):
Host:  kernel: 4.11.0-10.el7a.ppc64le
       qemu-kvm-2.9.0-16.el7a.ppc64le
Guest: 4.11.0-10.el7a.ppc64le

How reproducible: 100%


Steps to Reproduce:
1. Boot guest with balloon device and QMP
2. From any box with telnet client, run
   #telnet $HostIP 9999
   {"execute": "qmp_capabilities"}
3. Use the balloon to decrease the guest's memory to 1G (the guest's original memory is 8G):
   {"execute":"balloon","arguments":{"value":1073741824}}
4. Shut down the guest via the QMP "system_powerdown" command:
   {"execute":"system_powerdown"}
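
The telnet session in steps 2-4 could also be scripted. The following is only a minimal sketch, not part of the original report: the helper names (`qmp_command`, `repro_commands`, `read_reply`, `run_repro`) are hypothetical. It assumes the QMP socket from the command line in "Additional info" (TCP port 9999) and assumes QEMU sends each message as a single JSON line, with asynchronous event messages possibly interleaved between command replies.

```python
import json
import socket

MIB = 1024 * 1024  # the balloon "value" argument is expressed in bytes


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a single-line JSON string."""
    msg = {"execute": name}
    if arguments is not None:
        msg["arguments"] = arguments
    return json.dumps(msg)


def repro_commands(balloon_mib=1024):
    """Return the commands from steps 2-4, in the order they are sent."""
    return [
        qmp_command("qmp_capabilities"),
        qmp_command("balloon", {"value": balloon_mib * MIB}),
        qmp_command("system_powerdown"),
    ]


def read_message(stream):
    """Read one JSON message; raise if the peer closed the connection."""
    line = stream.readline()
    if not line:
        raise ConnectionError("QMP connection closed")
    return json.loads(line)


def read_reply(stream):
    """Return the next command reply, skipping asynchronous events."""
    while True:
        msg = read_message(stream)
        if "return" in msg or "error" in msg:
            return msg


def run_repro(host, port=9999, balloon_mib=1024, timeout=30.0):
    """Send the reproduction sequence and collect one reply per command."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        read_message(stream)  # discard the greeting sent on connect
        replies = []
        for cmd in repro_commands(balloon_mib):
            stream.write(cmd + "\n")
            stream.flush()
            replies.append(read_reply(stream))
        return replies
```

Calling `run_repro("<host address>")` would replay the steps with the 1G value. Passing `balloon_mib=512` would send the lower value mentioned later in this report for x86.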



Actual results:
In step 4, QMP still has not quit after a very long time (maybe 30 minutes), although the guest itself is powered off immediately.

Expected results:
QMP quits after issuing {"execute":"system_powerdown"}

Additional info:
1. Only Power9 has this problem; x86 and Power8 work well
2. QEMU command line used to boot the guest:
/usr/libexec/qemu-kvm \
 -name yilzhang_vm \
 -serial unix:/tmp/myserial.log,server,nowait \
 -smp 8,sockets=2,cores=4,threads=1 -m 8192 \
 -monitor stdio \
 -qmp tcp:0:9999,server,nowait \
 -nodefaults -enable-kvm \
 -device virtio-balloon-pci,id=balloon0 \
 -device virtio-scsi-pci,bus=pci.0,addr=0x3,id=scsi0 \
 -drive file=/home/yilzhang/rhel7.4-alt-20170626.4.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
 -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
 -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=52:54:00:c3:e7:84,bus=pci.0,addr=0x4,ioeventfd=off

Comment 2 David Gibson 2017-07-13 02:47:11 UTC
This looks like it might be a duplicate of bug 1382914.

Can you talk to Xianxian and see if that seems to be the case?

Comment 3 yilzhang 2017-07-13 10:19:40 UTC
Hi David,
I think the two should be the same issue.

This bug can also be reproduced on x86 when using an even lower balloon value (e.g. 512M).

But neither this bug nor Bug 1382914 could be reproduced on Power8 with rhel7.4 when using a balloon value of 1024M. Why is that?

Comment 4 yilzhang 2017-07-14 03:27:23 UTC
Besides, Bug 1382914 can be reproduced on Power9 with rhel7.4-alt.

Comment 5 David Gibson 2017-07-14 06:12:40 UTC
I think this has appeared just because newer versions of the software (kernel, systemd and others) are now consuming more memory.  That's making it easier to trigger the problem with a larger balloon value.

I think the problem could be reproduced on rhel7.4 (both power8 and x86), but it would require a lower balloon value.

*** This bug has been marked as a duplicate of bug 1382914 ***