Bug 1743098 - QEMU core dumped after unplug balloon device under q35 with Win2019 guest
Summary: QEMU core dumped after unplug balloon device under q35 with Win2019 guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Julia Suvorova
QA Contact: Yumei Huang
Docs Contact: Jiri Herrmann
URL:
Whiteboard:
Depends On: 1690256
Blocks: 1744438 1746622 1771318 1897024 1948357
 
Reported: 2019-08-19 05:57 UTC by Yumei Huang
Modified: 2023-03-14 14:31 UTC (History)
CC List: 16 users

Fixed In Version: qemu-kvm-6.1.0-3.module+el8.6.0+12952+612d1b20
Doc Type: Bug Fix
Doc Text:
.Hot-unplugging a balloon device from a Windows Server 2019 guest now works correctly
Previously, attempting to detach a memory balloon device from a running Q35 Windows Server 2019 guest operating system in some cases caused the guest to terminate unexpectedly. With this update, detaching balloon devices in the described circumstances works correctly.
Clone Of: 1690256
Environment:
Last Closed: 2022-05-10 13:18:34 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links:
System ID: Red Hat Product Errata RHSA-2022:1759
Last Updated: 2022-05-10 13:20:07 UTC

Description Yumei Huang 2019-08-19 05:57:46 UTC
Reproduced with qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.

+++ This bug was initially created as a clone of Bug #1690256 +++

Description of problem:
Boot a win2019 guest, hot-plug a balloon device, balloon down to evict guest memory, then unplug the device; QEMU core dumps.

Version-Release number of selected component (if applicable):
qemu-kvm-3.1.0-20.module+el8+2888+cdc893a8
kernel-4.18.0-80.el8.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot guest with q35 machine type

2. Run the following script to hot-plug the balloon device, balloon the guest down, and unplug the balloon device.

# cat balloon-hotplug.sh 
# Repeatedly hot-plug a virtio-balloon device, balloon the guest down to 4096 MB,
# then hot-unplug it, driving the QEMU HMP monitor over the UNIX socket /tmp/monitor3.
for i in `seq 10`;
do
	echo "=========================== round $i ============"
	echo "info balloon" | nc -U  /tmp/monitor3
	echo "device_add virtio-balloon-pci,id=balloon0,bus=pcie.0-root-port-5,addr=0x0" | nc -U  /tmp/monitor3
	echo "info balloon" | nc -U  /tmp/monitor3
	echo "balloon 4096" | nc -U  /tmp/monitor3
	echo "info balloon" | nc -U  /tmp/monitor3
	sleep 30
	echo "info balloon" | nc -U  /tmp/monitor3
	sleep 20    # might need to wait longer for the balloon to take effect
	echo "info balloon" | nc -U  /tmp/monitor3
	echo "device_del balloon0" | nc -U /tmp/monitor3
	sleep 10
done

Actual results:
QEMU core dumped:
(qemu) ./win2019-pci.sh: line 24: 23692 Segmentation fault      (core dumped) /usr/libexec/qemu-kvm -name 'avocado-vt-vm1' -machine q35 -nodefaults -device VGA,bus=pcie.0,addr=0x1 -device pvpanic,ioport=0x505,id=id5SK4co -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2 -device scsi-hd,id=image1,drive=drive_image1 -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 -device virtio-net-pci,mac=9a:39:3a:3b:3c:3d,id=idzyzw7g,vectors=4,netdev=idhia6GM,bus=pcie.0-root-port-4,addr=0x0 -netdev tap,id=idhia6GM -m 8192 -smp 16,maxcpus=16,cores=8,threads=1,sockets=2 -cpu 'IvyBridge',+kvm_pv_unhalt -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -enable-kvm -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 -monitor stdio -serial tcp:0:4445,server,nowait -monitor unix:/tmp/monitor3,server,nowait

Expected results:
The balloon device gets deleted and the guest keeps working.

Additional info:
1. QEMU cli:
# /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -machine q35  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x1  \
    -device pvpanic,ioport=0x505,id=id5SK4co  \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0,addr=0x3 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=/home/kvm_autotest_root/images/win2019-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:39:3a:3b:3c:3d,id=idzyzw7g,vectors=4,netdev=idhia6GM,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idhia6GM \
    -m 8192  \
    -smp 16,maxcpus=16,cores=8,threads=1,sockets=2  \
    -cpu 'IvyBridge',+kvm_pv_unhalt \
    -vnc :1  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -monitor stdio \
    -serial tcp:0:4445,server,nowait \
    -monitor unix:/tmp/monitor3,server,nowait

--- Additional comment from Yumei Huang on 2019-03-19 08:57:13 UTC ---

Hit the same issue with a win2016 guest as well, but it's harder to reproduce since https://bugzilla.redhat.com/show_bug.cgi?id=1553633#c10. After ballooning, the unplug may only trigger the crash after a "system_reset", and even then it only reproduces sometimes (see the sketch below).
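
A minimal sketch of that extra step, assuming the same HMP monitor socket at /tmp/monitor3 as in the script above:

echo "balloon 4096"        | nc -U /tmp/monitor3
echo "system_reset"        | nc -U /tmp/monitor3
sleep 60    # give the guest time to boot back up after the reset
echo "device_del balloon0" | nc -U /tmp/monitor3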

--- Additional comment from Yumei Huang on 2019-03-20 03:01:14 UTC ---

The balloon driver version for the win2019 guest is virtio-win-prewhql-163 (and 169), and the balloon service is not installed.

--- Additional comment from Yumei Huang on 2019-03-20 05:41:40 UTC ---

Only hit the issue with q35; it works fine with pc.

Can reproduce with qemu-kvm-core-2.12.0-34.el8+2018+8f9f13ec.

--- Additional comment from Yumei Huang on 2019-07-01 09:32:45 UTC ---

Hit same issue with qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee and qemu-kvm-2.12.0-78.module+el8.1.0+3434+46ed87c2. 

Guest: Win10.x86_64, Win2016, Win2019
balloon driver: virtio-win-prewhql-0.1-172
host kernel: 4.18.0-107.el8.x86_64

--- Additional comment from Yumei Huang on 2019-08-19 05:19:17 UTC ---

Reproduced with qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.

(gdb) bt
#0  0x000055a4b8b28fdd in virtio_pci_notify_write
    (opaque=0x55a4bb06a060, addr=0, val=<optimized out>, size=<optimized out>)
    at hw/virtio/virtio-pci.c:1306
#1  0x000055a4b895b053 in memory_region_write_accessor
    (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...)
    at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/memory.c:508
#2  0x000055a4b8959266 in access_with_adjusted_size
    (addr=addr@entry=0, value=value@entry=0x7f25373fe548, size=size@entry=2, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=
    0x55a4b895b000 <memory_region_write_accessor>, mr=0x55a4bb062bc0, attrs=...)
    at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/memory.c:574
#3  0x000055a4b895d200 in memory_region_dispatch_write
    (mr=0x55a4bb062bc0, addr=0, data=<optimized out>, size=2, attrs=...)
    at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/memory.c:1502
#4  0x000055a4b890a2f3 in flatview_write_continue
    (fv=0x7f25300a1ea0, addr=4244647936, attrs=..., buf=0x7f2551fda028 <error: Cannot access memory at address 0x7f2551fda028>, len=2, addr1=<optimized out>, l=<optimized out>, mr=0x55a4bb062bc0)
    at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/exec.c:3337
#5  0x000055a4b890a516 in flatview_write
    (fv=0x7f25300a1ea0, addr=4244647936, attrs=..., buf=0x7f2551fda028 <error: Cannot access memory at address 0x7f2551fda028>, len=2)
    at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/exec.c:3376
#6  0x000055a4b890e73f in address_space_write
    (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/exec.c:3466
--Type <RET> for more, q to quit, c to continue without paging--c
#7  0x000055a4b896be9a in kvm_cpu_exec (cpu=<optimized out>) at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/accel/kvm/kvm-all.c:2298
#8  0x000055a4b8950f3e in qemu_kvm_cpu_thread_fn (arg=0x55a4bb11a650) at /usr/src/debug/qemu-kvm-4.1.0-1.module+el8.1.0+3966+4a23dca1.x86_64/cpus.c:1285
#9  0x000055a4b8c70174 in qemu_thread_start (args=0x55a4bb13df00) at util/qemu-thread-posix.c:502
#10 0x00007f254cba82de in start_thread (arg=<optimized out>) at pthread_create.c:486
#11 0x00007f254c8d9133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment 1 Yumei Huang 2019-08-19 06:02:30 UTC
(gdb) bt
#0  0x000055b6c5bbdf0d in virtio_pci_notify_write
    (opaque=0x55b6c6dff170, addr=0, val=<optimized out>, size=<optimized out>)
    at hw/virtio/virtio-pci.c:1360
#1  0x000055b6c59fe596 in memory_region_write_accessor
    (mr=<optimized out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...)
    at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/memory.c:530
#2  0x000055b6c59fc9e6 in access_with_adjusted_size
    (addr=addr@entry=0, value=value@entry=0x7f1ad1d72628, size=size@entry=2, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=
    0x55b6c59fe550 <memory_region_write_accessor>, mr=0x55b6c6df7cd0, attrs=...)
    at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/memory.c:597
#3  0x000055b6c5a0084a in memory_region_dispatch_write
    (mr=0x55b6c6df7cd0, addr=0, data=<optimized out>, size=2, attrs=...)
    at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/memory.c:1474
#4  0x000055b6c59aebbc in flatview_write
    (fv=0x7f1abc0407c0, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>)
    at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/exec.c:3099
#5  0x000055b6c59b3243 in address_space_write
    (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/exec.c:3265
#6  0x000055b6c5a0f878 in kvm_cpu_exec (cpu=<optimized out>)
    at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/accel/kvm/kvm-all.c:2004
#7  0x000055b6c59ec21e in qemu_kvm_cpu_thread_fn (arg=0x55b6c6bdddd0)
    at /usr/src/debug/qemu-kvm-2.12.0-84.module+el8.1.0+3980+a02d9447.x86_64/cpus.c:1215
#8  0x00007f1adb4732de in start_thread (arg=<optimized out>) at pthread_create.c:486
#9  0x00007f1adb1a4133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Comment 10 Ademar Reis 2020-02-05 23:03:26 UTC
QEMU has recently been split into sub-components, and as a one-time operation to avoid breaking tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

Comment 14 dehanmeng 2021-02-09 04:12:45 UTC
Hi Julia,
Recently I hit a similar issue with a win2019-64 (q35) guest on a RHEL 9 host, but it does not hit a core dump after unplugging the balloon device; instead, no "DEVICE_DELETED" event is emitted.
The plan is to file the same driver issue on one BZ as a tracker, so I am updating it here. If you don't think they are the same issue, please feel free to let me know, thanks in advance.
One more question: when will we fix this issue? We just need to plan our test progress. Thanks, good day.

BRs
Dehan Meng

Comment 15 ybendito 2021-02-14 06:30:42 UTC
(In reply to dehanmeng from comment #14)
> Hi Julia,
> Recently I hit a similar issue with a win2019-64 (q35) guest on a RHEL 9
> host, but it does not hit a core dump after unplugging the balloon device;
> instead, no "DEVICE_DELETED" event is emitted.
If this is not a qemu core dump, it is a different issue by definition (this BZ is for the qemu core dump).
Additionally, if you reproduced the problem you described under an avocado setup, it might be a completely different issue.
Please open a new BZ, mention the avocado setup, provide the qemu command line, and refer to the tracker BZ.
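
One minimal way to watch for the "DEVICE_DELETED" event is to drive the unplug over QMP instead of HMP. This is only a sketch and assumes the guest was started with an extra "-qmp unix:/tmp/qmp-sock,server,nowait" option (the socket path is illustrative):

{ echo '{"execute":"qmp_capabilities"}'
  echo '{"execute":"device_del","arguments":{"id":"balloon0"}}'
  sleep 30    # keep the connection open long enough for the event to arrive
} | nc -U /tmp/qmp-sock | grep DEVICE_DELETED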

Comment 16 RHEL Program Management 2021-02-19 07:30:16 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 19 ybendito 2021-04-05 09:46:29 UTC
The fix was merged upstream:
https://github.com/qemu/qemu/commit/c3fd706165e9875a10606453ee2785dd51e987a5
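
To check which upstream release first contains this commit (assuming a full qemu.git clone with its tags available; the checkout path "qemu" is illustrative):

git -C qemu describe --contains c3fd706165e9875a10606453ee2785dd51e987a5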

Comment 21 Danilo de Paula 2021-06-17 14:28:27 UTC
Hi 

Is this fixed in current qemu in RHEL?
Is it expected to be fixed automatically by the rebase on RHEL-8.6?

Please clarify this. Then set ITR accordingly.

Comment 22 Julia Suvorova 2021-06-23 16:06:47 UTC
(In reply to Danilo Cesar Lemes de Paula from comment #21)
> Hi 
> 
> Is this fixed in current qemu in RHEL?
> Is it expected to be fixed automatically by the rebase on RHEL-8.6?
> 
> Please clarify this. Then set ITR accordingly.

The fix is not in RHEL, but it's merged upstream. According to Amnon,
it should be in RHEL 8.6 with rebase.

Comment 27 Yanan Fu 2021-10-13 02:55:58 UTC
Setting 'Verified:Tested,SanityOnly', as the gating test with qemu-kvm-6.0.0-29.module+el8.5.0+12386+43574bac passed.

Comment 28 Yanan Fu 2021-10-15 03:22:53 UTC
Hi Julia, Danilo, 

If I understand correctly, the Fixed In Version 'qemu-kvm-6.0.0-29.module+el8.5.0+12386+43574bac' is incorrect.
From the errata, the qemu-kvm version should be qemu-kvm-15:6.1.0-3.module+el8.6.0+12952+612d1b20.
Could you help confirm ? Thanks!

Best regards
Yanan Fu

Comment 29 Yanan Fu 2021-10-15 03:24:00 UTC
Correcting a typo: from the errata, the qemu-kvm version should be qemu-kvm-6.1.0-3.module+el8.6.0+12952+612d1b20.

Comment 30 Yumei Huang 2021-10-15 10:09:56 UTC
Tested with qemu-kvm-6.1.0-3.module+el8.6.0+12952+612d1b20 / kernel-4.18.0-348.1.el8.x86_64; the issue is not reproduced, and Win2019 & Win2022 guests work well after repeated hot-plug/unplug of the balloon device.
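
For reference, a minimal re-verification sketch (package names as used in this BZ; it re-uses the reproducer from the Steps to Reproduce above):

# confirm the host is on the fixed builds
rpm -q qemu-kvm kernel
# then repeat the hot-plug/balloon/unplug loop and confirm QEMU stays up
sh balloon-hotplug.sh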

Comment 33 Danilo de Paula 2021-10-25 12:44:23 UTC
Thank you for spotting the issue, Yanan.
Corrected on our part.

Comment 34 Yanan Fu 2021-10-26 02:27:05 UTC
(In reply to Yanan Fu from comment #27)
> Setting 'Verified:Tested,SanityOnly', as the gating test with
> qemu-kvm-6.0.0-29.module+el8.5.0+12386+43574bac passed.

The gating test with the updated 'Fixed In Version' qemu-kvm-6.1.0-3.module+el8.6.0+12952+612d1b20 passes too.
So the 'Verified' field remains valid, thanks. Pre-verification passed.

Comment 35 Yumei Huang 2021-10-26 02:37:19 UTC
Moving to verified per comments 30 and 34.

Comment 38 errata-xmlrpc 2022-05-10 13:18:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759

