Bug 1942011

Summary: can't unplug balloon device on Windows guest under q35
Product: Red Hat Enterprise Linux 9 Reporter: xiagao
Component: qemu-kvm Assignee: Virtualization Maintenance <virt-maint>
qemu-kvm sub component: PCI QA Contact: xiagao
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: high CC: ailan, coli, demeng, jinzhao, jusual, lijin, qizhu, virt-maint, yuhuang
Version: unspecified Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Windows   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
can't unplug balloon device on win2019 guest under q35
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-25 01:36:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1744438    

Description xiagao 2021-03-23 12:28:15 UTC
Description of problem:
Unplugging and re-plugging the balloon device in a loop fails on the fifth iteration.

Version-Release number of selected component (if applicable):
qemu-kvm-5.2.0-13.module+el8.4.0+10369+fd280775.x86_64
kernel-4.18.0-298.el8.x86_64
seabios-bin-1.14.0-1.module+el8.4.0+8855+a9e237a9.noarch
virtio-win-prewhql-0.1-196

How reproducible:
100%

Steps to Reproduce:
1. Start a win2019 guest with the q35 machine type.
2. Install the balloon driver, version 196.
3. Unplug and re-plug the balloon device in a loop:

#!/bin/bash
# Simple script that hot-unplugs and hotplugs the balloon device in a loop.
i=0
# NOTE: change 4455 to the QMP port of your guest.
exec 3<>/dev/tcp/localhost/4455
echo -e "{ 'execute': 'qmp_capabilities' }" >&3
read -r response <&3
echo "$response"
while [ "$i" -lt 100 ]
do
    # Unplug the same device id that is re-added below.
    echo -e "{ 'execute': 'device_del', 'arguments': {'id': 'balloon0' }}" >&3
    sleep 2
    read -r response <&3
    echo "$i: $response"
    sleep 2
    # Assumes root ports named pcie-root-port-<i> exist in the guest configuration.
    echo -e "{'execute':'device_add','arguments':{'id':'balloon0','driver':'virtio-balloon-pci','bus':'pcie-root-port-$i','addr':'0x0'}}" >&3
    sleep 2
    read -r response <&3
    echo "$i: $response"
    i=$((i + 1))
done

Actual results:
After waiting 30 minutes, QEMU still had not sent the QMP "DEVICE_DELETED" events, which should look like the following:
{"timestamp": {"seconds": 1616500535, "microseconds": 292559}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/balloon0/virtio-backend"}}
{"timestamp": {"seconds": 1616500535, "microseconds": 343703}, "event": "DEVICE_DELETED", "data": {"device": "balloon0", "path": "/machine/peripheral/balloon0"}}

Sending the hotplug QMP command again then fails:
{'execute':'device_add','arguments':{'id':'balloon0','driver':'virtio-balloon-pci','bus':'pci.6'}}
{"error": {"class": "GenericError", "desc": "Duplicate ID 'balloon0' for device"}}

Sending the hot-unplug QMP command also fails:
{ 'execute': 'device_del', 'arguments': {'id': 'balloon0'}}
{"error": {"class": "GenericError", "desc": "Device balloon0 is already in the process of unplug"}}


Expected results:
The balloon device is deleted successfully.

Additional info:
1. The test passes on a RHEL 8.4.0 guest.
2. virtio-win-prewhql-0.1-194 also hits this issue, so it is not a regression.
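
Note on the reproducer: the fixed "sleep 2" after device_del cannot tell a slow unplug from a hung one. A minimal sketch of a helper that instead waits for the DEVICE_DELETED event is shown here. The function name wait_device_deleted and its arguments are hypothetical, and the pattern match assumes the JSON spacing seen in the events quoted under "Actual results"; it is not part of the original test.

```shell
#!/bin/bash
# wait_device_deleted ID TIMEOUT_SECONDS
# Reads QMP messages line by line from stdin and returns 0 as soon as a
# DEVICE_DELETED event naming device ID is seen. Returns 1 on timeout or EOF.
# Hypothetical helper; the match relies on QEMU's "key": "value" spacing.
wait_device_deleted() {
    local id=$1 timeout=$2 line
    local deadline=$((SECONDS + timeout))
    while [ "$SECONDS" -lt "$deadline" ]; do
        if IFS= read -r -t 1 line; then
            case $line in
                *'"event": "DEVICE_DELETED"'*"\"device\": \"$id\""*)
                    return 0 ;;
            esac
        else
            # A status above 128 means read timed out; anything else is EOF.
            [ $? -gt 128 ] || return 1
        fi
    done
    return 1
}
```

In the loop above it could be called as "wait_device_deleted balloon0 60 <&3" after sending device_del, so that a stuck unplug is reported after a bounded wait instead of being masked by the next device_add.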

Comment 1 xiagao 2021-03-23 14:32:06 UTC
Adding the QEMU command line:
[root@dell-per440-01 home]# cat balloon.sh 
CLI="/usr/libexec/qemu-kvm -name win2019 -enable-kvm -m 8G -smp 4,maxcpus=12,cores=4,threads=1,sockets=3 -rtc base=localtime,driftfix=none -boot order=cd,menu=on -monitor stdio -qmp tcp:0:1235,server,nowait -M q35 -vnc :6 \
-cpu Skylake-Server,hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0xfff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,+kvm_pv_unhalt \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x3.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x3.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x3.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x3.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x3.0x6 \


-drive file=win2019.qcow2,if=none,id=drive_system_disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device ide-drive,bus=ide.0,unit=0,drive=drive_system_disk,id=ide0-0-0 \


-netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet0,vhost=on,queues=4 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:52:11:36:4d:1b,bus=pci.2,mq=on,vectors=10 \

-blockdev driver=file,cache.direct=off,cache.no-flush=on,filename=/home/data-disk1.qcow2,node-name=data_file \
-blockdev driver=qcow2,node-name=drive_data_disk,file=data_file \
-device virtio-scsi-pci,id=scsi0,num_queues=4,bus=pci.4 -device scsi-hd,bus=scsi0.0,drive=drive_data_disk,id=scsi-disk0 \

-device virtio-balloon-pci,id=balloon0,bus=pci.6 \

-device virtio-keyboard-pci,id=kbd0,serial=virtio-keyboard,bus=pci.7 -device virtio-mouse-pci,id=mouse0,serial=virtio-mouse,bus=pci.8 -device virtio-tablet-pci,id=tablet0,serial=virtio-tablet,bus=pci.9 \

-cdrom /home/kvm_autotest_root/iso/windows/virtio-win-prewhql-0.1-196.iso \

"

echo $CLI > boot.sh 

sh boot.sh
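
Note: the command line above defines root ports pci.1 to pci.7 by hand, while the input devices reference pci.8 and pci.9. A minimal sketch of a generator for the root-port options is shown here. It follows the numbering pattern used above (port=0x10+i, chassis=i+1, id=pci.<i+1>, functions of slot 0x3). The function name gen_root_ports is hypothetical, and continuing into slot 0x4 after eight functions is an assumption, not something taken from the original setup.

```shell
#!/bin/bash
# gen_root_ports COUNT
# Prints one "-device pcie-root-port,..." option per line, following the
# pattern of the hand-written command line: eight functions per PCI slot,
# starting at slot 0x3, with multifunction=on set on function 0.
# Hypothetical helper; slots beyond 0x3 are an assumption.
gen_root_ports() {
    local count=$1 i slot fn opts
    for ((i = 0; i < count; i++)); do
        slot=$((3 + i / 8))
        fn=$((i % 8))
        opts=$(printf 'pcie-root-port,port=0x%x,chassis=%d,id=pci.%d,bus=pcie.0' \
            $((0x10 + i)) $((i + 1)) $((i + 1)))
        if [ "$fn" -eq 0 ]; then
            opts+=$(printf ',multifunction=on,addr=0x%x' "$slot")
        else
            opts+=$(printf ',addr=0x%x.0x%x' "$slot" "$fn")
        fi
        printf -- '-device %s\n' "$opts"
    done
}
```

For example, "gen_root_ports 9" prints nine options, the first seven of which match the hand-written ones for pci.1 to pci.7.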

Comment 2 Yumei Huang 2021-03-24 08:30:30 UTC
It seems this is not 100% reproducible. I ran an automated case that repeats 'hotplug -> balloon -> unplug' 100 times and did not hit the issue.

Test env: 
qemu-kvm-5.2.0-14.module+el8.4.0+10425+ad586fa5
kernel-4.18.0-295.el8.x86_64
virtio-win-prewhql-0.1-196

Comment 3 Yumei Huang 2021-03-24 08:33:37 UTC
BTW, we have another bug, 1690256, about unplugging the balloon device with a Windows 2019 guest. I'm not sure whether the two have the same root cause.

Comment 4 xiagao 2021-03-24 08:57:35 UTC
Based on comment 3, I changed the bug's component and added it to the tracker for similar bugs: https://bugzilla.redhat.com/show_bug.cgi?id=1744438

Comment 7 xiagao 2021-05-12 01:02:05 UTC
Reproducible as described in comment 0 with qemu-kvm-6.0.
Packages:
qemu-kvm-6.0.0-16.module+el8.5.0+10848+2dccc46d.x86_64
kernel-4.18.0-305.1.el8.x86_64
win2019 guest

Comment 8 Yumei Huang 2021-06-07 04:23:37 UTC
Hit the same issue with Win10 guests (both 32-bit and 64-bit).

qemu-kvm-6.0.50-17.scrmod+el8.5.0+11228+e900a9b5.wrb210602
kernel-4.18.0-312.el8.x86_64

Comment 9 Yumei Huang 2021-06-22 06:43:35 UTC
Reproduced on rhel9 with win2022 guest.

qemu-kvm-6.0.0-5.el9
kernel-5.13.0-0.rc3.25.el9.x86_64
virtio-win-prewhql-0.1-201.iso

Comment 12 xiagao 2021-08-09 01:33:00 UTC
Reproducible on RHEL 8.5.0 with the latest qemu-kvm.
guest: win2019, win8.1-32, win2012
host:
kernel-4.18.0-325.el8.x86_64
qemu-img-6.0.0-27.module+el8.5.0+12121+c40c8708.x86_64
virtio-win-prewhql-207

Not reproducible with the pc machine type.

Comment 13 xiagao 2021-08-09 01:33:55 UTC
(In reply to xiagao from comment #12)
> Can reproduce it on rhel850 with latest qemu-kvm.
> guest:win2019,win8.1-32,win2012
> host:
> kernel-4.18.0-325.el8.x86_64
> qemu-img-6.0.0-27.module+el8.5.0+12121+c40c8708.x86_64
qemu-kvm-6.0.0-27.module+el8.5.0+12121+c40c8708.x86_64

> virtio-win-prewhql-207
> 
> Can't reproduce it with pc machine type.

Comment 14 John Ferlan 2021-09-09 13:55:17 UTC
Bulk update: Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

Comment 15 ChenNana 2021-11-08 03:22:49 UTC
Reproduced on rhel9 with balloon-win2022 guest.

qemu-kvm-6.1.0-5.el9.x86_64
kernel-5.14.0-10.el9.x86_64
virtio-win-prewhql-0.1-214.iso

Comment 16 ChenNana 2021-11-08 03:24:04 UTC
(In reply to ChenNana from comment #15)
> Reproduced on rhel9 with balloon-win2022 guest.
> 
> qemu-kvm-6.1.0-5.el9.x86_64
> kernel-5.14.0-10.el9.x86_64
> virtio-win-prewhql-0.1-214.iso

Sorry, correction: this was reproduced on rhel9 with a balloon-win11 guest.

Comment 19 xiagao 2022-08-25 01:36:17 UTC
I recently tested win10-64, win2022, and win2019 guests by hotplugging and unplugging the balloon device 100 times, and all runs passed. I therefore consider this issue resolved and am closing it. Feel free to reopen it if you have any concerns.

Test packages:
qemu-kvm-7.0.0-11.el9.x86_64
kernel-5.14.0-145.el9.x86_64
seabios-bin-1.16.0-4.el9.noarch
virtio-win-prewhql-0.1-225

Thanks,
Xiaoling