Bug 1432382
| Summary: | Hot-unplug "device_del dimm1" induce qemu-kvm coredump (hotplug at guest boot up stage) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Min Deng <mdeng> |
| Component: | qemu-kvm-rhev | Assignee: | Laurent Vivier <lvivier> |
| Status: | CLOSED ERRATA | QA Contact: | Min Deng <mdeng> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.4 | CC: | dgibson, hannsj_uhl, knoel, lvivier, mdeng, michen, mrezanin, qzhang, virt-maint, yuhuang, zhengtli |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | ppc64le | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.9.0-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-08-02 03:39:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1448344 | ||
For the x86 test, QE will update the bug as soon as the result is available. Both the host and the guest run kernel-3.10.0-600.el7.ppc64le.

Tested on an x86 host and couldn't reproduce.
qemu-kvm-rhev-2.8.0-6.el7
kernel-3.10.0-610.el7.x86_64

This is a real bug. However, I know this code has been changed in qemu-2.9, so I'm not sure there's much point debugging this in detail until we have the qemu-2.9 rebase. Are you able to try this with the preliminary qemu-2.9 based packages?

Hi David,
QE tried the bug on the preliminary qemu-2.9. Unfortunately, QE can still *reproduce* the issue. Thanks.

Build info:
kernel-3.10.0-600.el7.ppc64le (guest)
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.ppc64le
SLOF-20160223-6.gitdbbfda4.el7.noarch

For the detailed steps, please refer to comment 0.

Error messages:
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
(qemu) device_del dimm1
(qemu) qemu-kvm: used ring relocated for ring 2
qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/hw/virtio/vhost.c:651: vhost_commit: Assertion `r >= 0' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x3fffb3baeaa0 (LWP 11470)]
0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
#1 0x00003fffb6f30f4c in abort () from /lib64/libc.so.6
#2 0x00003fffb6f24b44 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00003fffb6f24c34 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000046f93838 in vhost_commit (listener=0x48340288) at /usr/src/debug/qemu-2.9.0/hw/virtio/vhost.c:651
#5 0x0000000046f3a628 in memory_region_transaction_commit () at /usr/src/debug/qemu-2.9.0/memory.c:931
#6 0x00000000471149dc in pc_dimm_memory_unplug (dev=0x482f1290, hpms=0x48350530, mr=0x481e43c0) at hw/mem/pc-dimm.c:125
#7 0x0000000046f9fc60 in spapr_memory_unplug (errp=0x479482a8 <error_abort>, dev=0x482f1290, hotplug_dev=0x48350340) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr.c:2606
#8 spapr_machine_device_unplug (hotplug_dev=0x48350340, dev=0x482f1290, errp=0x479482a8 <error_abort>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr.c:2867
#9 0x0000000047102280 in hotplug_handler_unplug (plug_handler=0x48350340, plugged_dev=0x482f1290, errp=0x479482a8 <error_abort>) at hw/core/hotplug.c:56
#10 0x0000000046f9fa48 in spapr_lmb_release (dev=0x482f1290, opaque=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr.c:2566
#11 0x0000000046fb6ca0 in detach (drc=0x48240840, d=<optimized out>, detach_cb=0x46f9f9f0 <spapr_lmb_release>, detach_cb_opaque=0x48c82610, errp=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_drc.c:447
#12 0x0000000046fb72d0 in set_allocation_state (drc=0x48240840, state=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_drc.c:145
#13 0x0000000046fadf54 in rtas_set_indicator (cpu=<optimized out>, spapr=0x48350340, token=<optimized out>, nargs=<optimized out>, args=<optimized out>, nret=<optimized out>, rets=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_rtas.c:460
#14 0x0000000046faf0bc in spapr_rtas_call (cpu=<optimized out>, spapr=<optimized out>, token=<optimized out>, nargs=<optimized out>, args=<optimized out>, nret=<optimized out>, rets=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_rtas.c:666
#15 0x0000000046fa9c24 in h_rtas (cpu=0x488c0000, spapr=0x48350340, opcode=<optimized out>, args=<optimized out>) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_hcall.c:663
#16 0x0000000046fac3f8 in spapr_hypercall (cpu=0x488c0000, opcode=61440, args=0x3fffb3390030) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_hcall.c:1055
#17 0x0000000047065ab4 in kvm_arch_handle_exit (cs=0x488c0000, run=0x3fffb3390000) at /usr/src/debug/qemu-2.9.0/target/ppc/kvm.c:1688
#18 0x0000000046f353d8 in kvm_cpu_exec (cpu=0x488c0000) at /usr/src/debug/qemu-2.9.0/kvm-all.c:2113
#19 0x0000000046f1a980 in qemu_kvm_cpu_thread_fn (arg=0x488c0000) at /usr/src/debug/qemu-2.9.0/cpus.c:1087
#20 0x00003fffb70e8728 in start_thread () from /lib64/libpthread.so.0
#21 0x00003fffb70113d0 in clone () from /lib64/libc.so.6

Thanks
Min

I think this is crashing because the memory we hot-unplug is in use by vhost.
But I'm not able to reproduce.
As I understand it, you hotplug the memory before the kernel has finished booting and hot-unplug it once it has booted.
Could you:
- try defining the memory on the command line instead of hotplugging it
manually: "... -object memory-backend-ram,id=mem1,size=1G \
-device pc-dimm,id=dimm1,memdev=mem1 ..."
- try with the latest built kernel in the guest (I'm testing with -612)
Thanks
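The first suggestion above (defining the DIMM at startup rather than via the monitor) could be scripted roughly as follows. This is only a sketch: the helper name `with_coldplugged_dimm` and the shortened base command line are illustrative assumptions, not part of the original test setup; the `-object` and `-device` strings are the ones quoted in the comment.

```python
import shlex


def with_coldplugged_dimm(base_args, mem_id="mem1", dimm_id="dimm1", size="1G"):
    """Return a copy of base_args with the DIMM defined on the command line.

    Hypothetical helper: it only appends the two options quoted in the
    comment, so the memory is present from startup instead of hotplugged.
    """
    return list(base_args) + [
        "-object", f"memory-backend-ram,id={mem_id},size={size}",
        "-device", f"pc-dimm,id={dimm_id},memdev={mem_id}",
    ]


# Shortened, illustrative base command line; the full one is in comment 0.
base = [
    "/usr/libexec/qemu-kvm",
    "-machine", "pseries-rhel7.4.0",
    "-m", "4G,slots=4,maxmem=8G",
]

args = with_coldplugged_dimm(base)
print(" ".join(shlex.quote(a) for a in args))
```

The `-m ...,slots=4,maxmem=8G` part is kept because pc-dimm devices need free memory slots and headroom below `maxmem`, whether they are defined at startup or hotplugged.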
OK... I'm able to reproduce if I connect via ssh to the guest before unplugging the memory.

I see a strange message in the guest kernel log while I'm unplugging the memory:
(qemu) device_del dimm1
[ 39.422692] pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)
2017-03-24T10:12:26.396259Z qemu-system-ppc64: used ring relocated for ring 2
qemu-system-ppc64: /home/lvivier/Projects/qemu/hw/virtio/vhost.c:651: vhost_commit: Assertion `r >= 0' failed.

This problem can be reproduced with upstream qemu.

The crash happens because the guest kernel is still answering the hotplug event when we have already started to hot-unplug the memory, so QEMU's internal state is inconsistent with the information sent by the kernel.

Some details to reproduce the problem:
- start QEMU with:
-S -serial mon:stdio \
-netdev tap,script=/etc/qemu-ifup,\
downscript=/etc/qemu-down,id=hostnet0,vhost=on \
-device virtio-net-pci,netdev=hostnet0
- switch to the monitor and execute:
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
(qemu) continue
- once the OS is started, start an ssh connection to the guest
- switch to the monitor and execute:
(qemu) device_del dimm1
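For anyone who wants to script the steps above over QMP rather than typing them into the human monitor, the message sequence might look like the sketch below. It only builds the JSON and does not open a socket. Some assumptions: the `props` wrapper for `object-add` matches the QEMU 2.9-era schema (later releases flatten it), the QMP session has already completed `qmp_capabilities`, and the 256 MiB LMB size is the pseries memory block size. The helper names `qmp` and `repro_sequence` are hypothetical.

```python
import json

# Assumption: pseries guests manage memory in 256 MiB logical memory blocks.
LMB_SIZE = 256 * 1024 * 1024


def qmp(execute, **arguments):
    """Build one QMP command message as a plain dict."""
    msg = {"execute": execute}
    if arguments:
        msg["arguments"] = arguments
    return msg


def repro_sequence(size_bytes=1 << 30):
    """Return (plug, unplug) message lists mirroring the monitor steps.

    plug is sent while the guest is still paused (-S); unplug is sent
    only after the guest has booted and an ssh session is open.
    """
    plug = [
        qmp("object-add", **{
            "qom-type": "memory-backend-ram",
            "id": "mem1",
            "props": {"size": size_bytes},  # QMP takes bytes, not "1G"
        }),
        qmp("device_add", driver="pc-dimm", id="dimm1", memdev="mem1"),
        qmp("cont"),
    ]
    unplug = [qmp("device_del", id="dimm1")]
    return plug, unplug


plug, unplug = repro_sequence()
for msg in plug + unplug:
    print(json.dumps(msg))

# A 1G DIMM spans this many LMBs, matching the "4 LMB(s)" guest message.
print("LMBs in the DIMM:", (1 << 30) // LMB_SIZE)
```

Keeping the plug and unplug phases separate mirrors the manual procedure: the crash depends on the guest booting and using vhost networking between the two phases.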
Merged upstream: commit fe6824d ("spapr: fix memory hot-unplugging")
Laurent,
Last I heard we may not be getting rebases to the later 2.9 rcs, so can you post the relevant patch downstream as well, please?

Verified the bug on the following builds:
kernel-3.10.0-655.el7.ppc64le (guest and host)
qemu-kvm-rhev-2.9.0-1.el7.ppc64le
SLOF-20170303-1.git66d250e.el7.noarch

For the detailed steps, please refer to comment 0 and comment 14.

Expected results:
The original issue has already been fixed.

Actual results:
The issue has been fixed already.

So moving the bug to verified status. Thanks for everyone's effort.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:2392
Description of problem:
Hot-unplug "device_del dimm1" induce qemu-kvm coredump

Version-Release number of selected component (if applicable):
ppc64le
kernel-3.10.0-600.el7.ppc64le
qemu-kvm-rhev-2.8.0-6.el7.ppc64le
SLOF-20160223-6.gitdbbfda4.el7.noarch

How reproducible:
2/3

Steps to Reproduce:
1. Boot up the guest with the following command line:
/usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off \
-machine pseries-rhel7.4.0 -nodefaults -vga std \
-chardev socket,id=hmp_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20151207-185515-CKlGrjUv,server,nowait \
-mon chardev=hmp_id_humanmonitor1,mode=readline \
-chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20151207-185515-CKlGrjUv,server,nowait \
-mon chardev=qmp_id_qmp1,mode=control \
-chardev socket,id=hmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20151207-185515-CKlGrjUv,server,nowait \
-mon chardev=hmp_id_catch_monitor,mode=readline \
-chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20151207-185515-CKlGrjUv,server,nowait \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on \
-drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=rhel74-ppc64le-virtio-scsi-latest.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-numa node \
-qmp tcp:0:4444,server,nowait \
-vnc :1 \
-rtc base=utc,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-enable-kvm \
-monitor stdio \
-device pci-ohci,id=usb1 \
-device usb-kbd,id=input0 \
-device usb-mouse,id=input1 \
-device usb-tablet,id=input2 \
-netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on \
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:00 \
-m 4G,slots=4,maxmem=8G \
-numa node

2. Hotplug memory for the guest while it is still booting up. *This is a must*.
(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

3. Then try to unplug it:
(qemu) device_del dimm1

Actual results:
(qemu) device_del dimm1
(qemu) qemu-kvm: used ring relocated for ring 2
qemu-kvm: /builddir/build/BUILD/qemu-2.8.0/hw/virtio/vhost.c:622: vhost_commit: Assertion `r >= 0' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x3fffb3bbeab0 (LWP 48326)]
0x00003fffb6f3eb98 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.1.3-3.el7.ppc64le bzip2-libs-1.0.6-13.el7.ppc64le cyrus-sasl-lib-2.1.26-21.el7.ppc64le cyrus-sasl-md5-2.1.26-21.el7.ppc64le cyrus-sasl-plain-2.1.26-21.el7.ppc64le dbus-libs-1.6.12-17.el7.ppc64le elfutils-libelf-0.168-5.el7.ppc64le elfutils-libs-0.168-5.el7.ppc64le flac-libs-1.3.0-5.el7_1.ppc64le glib2-2.46.2-4.el7.ppc64le glibc-2.17-171.el7.ppc64le gmp-6.0.0-12.el7_1.ppc64le gnutls-3.3.26-6.el7.ppc64le gperftools-libs-2.4-8.el7.ppc64le gsm-1.0.13-11.el7.ppc64le keyutils-libs-1.5.8-3.el7.ppc64le krb5-libs-1.15-2.el7.ppc64le libICE-1.0.9-5.el7.ppc64le libSM-1.2.2-2.el7.ppc64le libX11-1.6.4-4.el7.ppc64le libXau-1.0.8-2.1.el7.ppc64le libXext-1.3.3-3.el7.ppc64le libXi-1.7.9-1.el7.ppc64le libXtst-1.2.3-1.el7.ppc64le libaio-0.3.109-13.el7.ppc64le libasyncns-0.8-7.el7.ppc64le libattr-2.4.46-12.el7.ppc64le libcap-2.22-9.el7.ppc64le libcom_err-1.42.9-9.el7.ppc64le libcurl-7.29.0-39.el7.ppc64le libdb-5.3.21-19.el7.ppc64le libfdt-1.4.0-2.el7.ppc64le libffi-3.0.13-18.el7.ppc64le libgcc-4.8.5-11.el7.ppc64le libgcrypt-1.5.3-14.el7.ppc64le libgpg-error-1.12-3.el7.ppc64le libibverbs-12-2.el7.ppc64le libidn-1.28-4.el7.ppc64le libiscsi-1.9.0-7.el7.ppc64le libnl3-3.2.28-3.el7_3.ppc64le libogg-1.3.0-7.el7.ppc64le libpng-1.5.13-7.el7_2.ppc64le librdmacm-12-2.el7.ppc64le libseccomp-2.3.1-2.el7.ppc64le libselinux-2.5-9.el7.ppc64le libsndfile-1.0.25-10.el7.ppc64le libssh2-1.4.3-10.el7_2.1.ppc64le libstdc++-4.8.5-11.el7.ppc64le libtasn1-4.10-1.el7.ppc64le libusbx-1.0.20-1.el7.ppc64le libuuid-2.23.2-33.el7.ppc64le libvorbis-1.3.3-8.el7.ppc64le libxcb-1.12-1.el7.ppc64le lzo-2.06-8.el7.ppc64le nettle-2.7.1-8.el7.ppc64le nspr-4.13.1-1.0.el7.ppc64le nss-3.28.3-2.el7.ppc64le nss-softokn-freebl-3.28.3-2.el7.ppc64le nss-util-3.28.3-2.el7.ppc64le numactl-libs-2.0.9-6.el7_2.ppc64le openldap-2.4.44-1.el7.ppc64le openssl-libs-1.0.2k-3.el7.ppc64le p11-kit-0.23.5-1.el7.ppc64le pcre-8.32-17.el7.ppc64le pixman-0.34.0-1.el7.ppc64le pulseaudio-libs-10.0-2.el7.ppc64le snappy-1.1.0-3.el7.ppc64le systemd-libs-219-32.el7.ppc64le tcp_wrappers-libs-7.6-77.el7.ppc64le xz-libs-5.2.2-1.el7.ppc64le zlib-1.2.7-17.el7.ppc64le
(gdb) bt
#0 0x00003fffb6f3eb98 in raise () from /lib64/libc.so.6
#1 0x00003fffb6f40d1c in abort () from /lib64/libc.so.6
#2 0x00003fffb6f34924 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00003fffb6f34a14 in __assert_fail () from /lib64/libc.so.6
#4 0x0000000059ba1a58 in vhost_commit (listener=0x5adf0000) at /usr/src/debug/qemu-2.8.0/hw/virtio/vhost.c:622
#5 0x0000000059b4a658 in memory_region_transaction_commit () at /usr/src/debug/qemu-2.8.0/memory.c:929
#6 0x0000000059d19c4c in pc_dimm_memory_unplug (dev=0x5af60f30, hpms=0x5ae501e0, mr=0x5adb5b20) at hw/mem/pc-dimm.c:125
#7 0x0000000059bac870 in spapr_memory_unplug (errp=0x5a513e00 <error_abort>, dev=0x5af60f30, hotplug_dev=0x5ae50000) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr.c:2421
#8 spapr_machine_device_unplug (hotplug_dev=0x5ae50000, dev=0x5af60f30, errp=0x5a513e00 <error_abort>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr.c:2523
#9 0x0000000059d08880 in hotplug_handler_unplug (plug_handler=0x5ae50000, plugged_dev=0x5af60f30, errp=0x5a513e00 <error_abort>) at hw/core/hotplug.c:56
#10 0x0000000059bac668 in spapr_lmb_release (dev=0x5af60f30, opaque=<optimized out>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr.c:2381
#11 0x0000000059bc2d80 in detach (drc=0x5ae10600, d=<optimized out>, detach_cb=0x59bac610 <spapr_lmb_release>, detach_cb_opaque=0x5ad5fa88, errp=<optimized out>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr_drc.c:442
#12 0x0000000059bc33b0 in set_allocation_state (drc=0x5ae10600, state=<optimized out>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr_drc.c:145
#13 0x0000000059bba274 in rtas_set_indicator (cpu=<optimized out>, spapr=0x5ae50000, token=<optimized out>, nargs=<optimized out>, args=<optimized out>, nret=<optimized out>, rets=<optimized out>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr_rtas.c:459
#14 0x0000000059bbb3dc in spapr_rtas_call (cpu=<optimized out>, spapr=<optimized out>, token=<optimized out>, nargs=<optimized out>, args=<optimized out>, nret=<optimized out>, rets=<optimized out>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr_rtas.c:665
#15 0x0000000059bb6164 in h_rtas (cpu=0x5b480000, spapr=0x5ae50000, opcode=<optimized out>, args=<optimized out>) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr_hcall.c:666
#16 0x0000000059bb8738 in spapr_hypercall (cpu=0x5b480000, opcode=61440, args=0x3fffb33a0030) at /usr/src/debug/qemu-2.8.0/hw/ppc/spapr_hcall.c:1081
#17 0x0000000059c672b4 in kvm_arch_handle_exit (cs=0x5b480000, run=0x3fffb33a0000) at /usr/src/debug/qemu-2.8.0/target-ppc/kvm.c:1757
#18 0x0000000059b45458 in kvm_cpu_exec (cpu=0x5b480000) at /usr/src/debug/qemu-2.8.0/kvm-all.c:2038
#19 0x0000000059b2baf0 in qemu_kvm_cpu_thread_fn (arg=<optimized out>) at /usr/src/debug/qemu-2.8.0/cpus.c:998
#20 0x00003fffb70e8728 in start_thread () from /lib64/libpthread.so.0
#21 0x00003fffb701de50 in clone () from /lib64/libc.so.6

Expected results:
The operation is successful.

Additional info: