Bug 1437337
| Summary: | Hotplug cpu cores with invalid nr_threads causes qemu-kvm coredump | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Min Deng <mdeng> |
| Component: | qemu-kvm-rhev | Assignee: | David Gibson <dgibson> |
| Status: | CLOSED ERRATA | QA Contact: | Min Deng <mdeng> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.4 | CC: | dgibson, knoel, michen, mrezanin, qzhang, virt-maint, zhengtli |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | ppc64le | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-rhev-2.9.0-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-08-02 04:35:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
It is a ppc64le-specific issue since it is not reproduced on x86.
Additional information about this bug:
Tried it with nr_threads = 1 and the hotplug succeeded. Does it work as expected?
Steps:
1.{"execute":"qmp_capabilities"}
{"return": {}}
2.{"execute": "query-hotpluggable-cpus"}
{"return": [{"props": {"core-id": 2}, "vcpus-count": 2, "type": "host-spapr-cpu-core"}, {"props": {"core-id": 0}, "vcpus-count": 2, "qom-path": "/machine/unattached/device[0]", "type": "host-spapr-cpu-core"}]}
3.{"execute": "device_add", "arguments": {"driver": "host-spapr-cpu-core", "core-id": 2, "nr-threads": 1, "id": "core1"}}
{"return": {}}
4.{"execute": "query-hotpluggable-cpus"}
{"return": [{"props": {"core-id": 2}, "vcpus-count": 2, "qom-path": "/machine/peripheral/core1", "type": "host-spapr-cpu-core"}, {"props": {"core-id": 0}, "vcpus-count": 2, "qom-path": "/machine/unattached/device[0]", "type": "host-spapr-cpu-core"}]}
5.{"execute": "query-cpus"}
{"return": [{"arch": "ppc", "current": true, "CPU": 0, "nip": -4611686018426750380, "qom_path": "/machine/unattached/device[0]/thread[0]", "halted": false, "thread_id": 47258}, {"arch": "ppc", "current": false, "CPU": 1, "nip": -4611686018426750380, "qom_path": "/machine/unattached/device[0]/thread[1]", "halted": false, "thread_id": 47259}, {"arch": "ppc", "current": false, "CPU": 2, "nip": -4611686018426750380, "qom_path": "/machine/peripheral/core1/thread[0]", "halted": false, "thread_id": 47391}]}
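The QMP exchange in these steps can also be scripted. The following is a minimal Python sketch, not part of the original report: it picks the unplugged core slots out of a query-hotpluggable-cpus reply (entries without a "qom-path") and builds the matching device_add command. The helper names free_cores and build_device_add are hypothetical, and the reply is copied from step 2. Using "vcpus-count" as the nr-threads value is an assumption of this sketch.

```python
import json

# Reply copied from step 2 of the comment above.
REPLY = json.loads(
    '{"return": [{"props": {"core-id": 2}, "vcpus-count": 2, '
    '"type": "host-spapr-cpu-core"}, '
    '{"props": {"core-id": 0}, "vcpus-count": 2, '
    '"qom-path": "/machine/unattached/device[0]", '
    '"type": "host-spapr-cpu-core"}]}'
)


def free_cores(reply):
    """Return the entries that have no qom-path, i.e. slots with no core plugged."""
    return [entry for entry in reply["return"] if "qom-path" not in entry]


def build_device_add(entry, dev_id, nr_threads=None):
    """Build a device_add command for one slot (hypothetical helper).

    nr_threads is optional; when given it is passed through unchanged,
    so an invalid value can still be built for negative testing.
    """
    args = {"driver": entry["type"], "id": dev_id}
    args.update(entry["props"])
    if nr_threads is not None:
        args["nr-threads"] = nr_threads
    return {"execute": "device_add", "arguments": args}


slot = free_cores(REPLY)[0]
cmd = build_device_add(slot, "core1", nr_threads=slot["vcpus-count"])
print(json.dumps(cmd))
```

Passing nr_threads=1 instead reproduces the command sent in step 3.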
Problem also exists upstream.
Upstream patch sent for review.
Karen, I'm about to send a patch upstream, and it's pretty straightforward. Can you give this a devel_ack please?
Fix is merged upstream for 2.9, so we should get it in the rebase.
The bug can be reproduced on the previous build.
QE verified the bug on the following builds:
kernel-3.10.0-657.el7.ppc64le
qemu-kvm-rhev-2.9.0-1.el7.ppc64le
SLOF-20170303-1.git66d250e.el7.noarch
Steps:
1. Boot up the guest with a similar cli - ..."-m 4G,slots=4,maxmem=8G -smp 2,maxcpus=4,cores=2,threads=2,sockets=1"
2. Did the following steps - "nr-threads is 2" - (based on comment 0 and comment 3)
2.1 {"execute": "device_add", "arguments": {"driver": "host-spapr-cpu-core", "core-id": 2, "nr-threads": 3, "id": "core1"}}
{"error": {"class": "GenericError", "desc": "invalid nr-threads 3, must be 2"}}
2.2 {"execute": "device_add", "arguments": {"driver": "host-spapr-cpu-core", "core-id": 2, "nr-threads": 1, "id": "core1"}}
{"error": {"class": "GenericError", "desc": "invalid nr-threads 1, must be 2"}}
Expected results: Invalid nr-threads should not be added.
Actual results: Invalid nr-threads values can no longer be added.
Based on the above test results, the bug has been fixed. Thanks for everyone's help. Moving it to status VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:2392
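The error replies seen during verification follow a simple rule: a hotplugged core is rejected unless its nr-threads equals the threads value given to -smp. A minimal Python model of that rule is sketched below. It is an illustration only, not the QEMU patch itself; the function name check_nr_threads is hypothetical, and only the error text is taken from the replies quoted in this bug.

```python
def check_nr_threads(nr_threads, smp_threads):
    """Model of the post-fix behaviour: reject any nr-threads != -smp threads.

    Returns a dict shaped like the QMP replies quoted in this bug.
    """
    if nr_threads != smp_threads:
        return {
            "error": {
                "class": "GenericError",
                "desc": "invalid nr-threads %d, must be %d" % (nr_threads, smp_threads),
            }
        }
    return {"return": {}}


# The guest was started with threads=2, so 3 and 1 are rejected and 2 is accepted.
for requested in (3, 1, 2):
    print(requested, check_nr_threads(requested, smp_threads=2))
```

Note that this also rejects nr-threads = 1, which the earlier comment showed being accepted before the fix.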
Description of problem:
Hotplug cpu cores with invalid nr_threads causes qemu-kvm coredump
Version-Release number of selected component (if applicable):
kernel-3.10.0-628.el7.ppc64le
qemu-kvm-rhev-2.9.0-0.el7.patchwork201703291116.ppc64le
SLOF-20170303-1.git66d250e.el7.noarch
How reproducible:
2/2
Steps to Reproduce:
1. Boot up the guest with
/usr/libexec/qemu-kvm \
 -name virt-tests-vm1 \
 -sandbox off \
 -machine pseries-rhel7.4.0 \
 -nodefaults \
 -vga std \
 -chardev socket,id=hmp_id_humanmonitor1,path=/tmp/monitor-humanmonitor1-20151207-185515-CKlGrjUv,server,nowait \
 -mon chardev=hmp_id_humanmonitor1,mode=readline \
 -chardev socket,id=qmp_id_qmp1,path=/tmp/monitor-qmp1-20151207-185515-CKlGrjUv,server,nowait \
 -mon chardev=qmp_id_qmp1,mode=control \
 -chardev socket,id=hmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20151207-185515-CKlGrjUv,server,nowait \
 -mon chardev=hmp_id_catch_monitor,mode=readline \
 -chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20151207-185515-CKlGrjUv,server,nowait \
 -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=off \
 -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=rhel74-ppc64le-virtio-scsi-latest.qcow2 \
 -device scsi-hd,id=image1,drive=drive_image1 \
 -numa node \
 -qmp tcp:0:4444,server,nowait \
 -vnc :1 \
 -rtc base=utc,clock=host,driftfix=slew \
 -boot order=cdn,once=c,menu=off,strict=off \
 -enable-kvm \
 -monitor stdio \
 -device nec-usb-xhci,id=usb1 \
 -device usb-kbd,id=input0 \
 -device usb-mouse,id=input1 \
 -device usb-tablet,id=input2 \
 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on \
 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:00 \
 -m 4G,slots=4,maxmem=8G \
 -smp 2,maxcpus=4,cores=2,threads=2,sockets=1
2. Hotplug cores with invalid nr_threads
3. telnet xx.xx.xx.xx port
{"execute":"qmp_capabilities"}
{"execute": "device_add", "arguments": {"driver": "host-spapr-cpu-core", "core-id": 2, "nr-threads": 3, "id": "core1"}}
Actual results:
After hotplugging
(qemu) [New Thread 0x3ffeabd2eaa0 (LWP 44083)]
[Thread 0x3ffeabd2eaa0 (LWP 44083) exited]
[New Thread 0x3ffeabd2eaa0 (LWP 44116)]
[Thread 0x3ffeabd2eaa0 (LWP 44116) exited]
qemu-kvm: /builddir/build/BUILD/qemu-2.9.0/numa.c:580: numa_get_node_for_cpu: Assertion `idx < max_cpus' failed.
Program received signal SIGABRT, Aborted.
0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install alsa-lib-1.1.3-3.el7.ppc64le bzip2-libs-1.0.6-13.el7.ppc64le cyrus-sasl-lib-2.1.26-21.el7.ppc64le cyrus-sasl-plain-2.1.26-21.el7.ppc64le dbus-libs-1.6.12-17.el7.ppc64le elfutils-libelf-0.168-5.el7.ppc64le elfutils-libs-0.168-5.el7.ppc64le flac-libs-1.3.0-5.el7_1.ppc64le glib2-2.50.3-2.el7.ppc64le glibc-2.17-189.el7.ppc64le gmp-6.0.0-15.el7.ppc64le gnutls-3.3.26-6.el7.ppc64le gperftools-libs-2.4-8.el7.ppc64le gsm-1.0.13-11.el7.ppc64le keyutils-libs-1.5.8-3.el7.ppc64le krb5-libs-1.15.1-5.el7.ppc64le libICE-1.0.9-5.el7.ppc64le libSM-1.2.2-2.el7.ppc64le libX11-1.6.4-4.el7.ppc64le libXau-1.0.8-2.1.el7.ppc64le libXext-1.3.3-3.el7.ppc64le libXi-1.7.9-1.el7.ppc64le libXtst-1.2.3-1.el7.ppc64le libaio-0.3.109-13.el7.ppc64le libasyncns-0.8-7.el7.ppc64le libattr-2.4.46-12.el7.ppc64le libcap-2.22-9.el7.ppc64le libcom_err-1.42.9-9.el7.ppc64le libcurl-7.29.0-39.el7.ppc64le libdb-5.3.21-20.el7.ppc64le libfdt-1.4.3-1.el7.ppc64le libffi-3.0.13-18.el7.ppc64le libgcc-4.8.5-14.el7.ppc64le libgcrypt-1.5.3-14.el7.ppc64le libgpg-error-1.12-3.el7.ppc64le libibverbs-13-1.el7.ppc64le libidn-1.28-4.el7.ppc64le libiscsi-1.9.0-7.el7.ppc64le libnl3-3.2.28-3.el7_3.ppc64le libogg-1.3.0-7.el7.ppc64le libpng-1.5.13-7.el7_2.ppc64le librdmacm-13-1.el7.ppc64le libseccomp-2.3.1-3.el7.ppc64le libselinux-2.5-11.el7.ppc64le libsndfile-1.0.25-10.el7.ppc64le libssh2-1.4.3-10.el7_2.1.ppc64le libstdc++-4.8.5-14.el7.ppc64le libtasn1-4.10-1.el7.ppc64le libusbx-1.0.20-1.el7.ppc64le libuuid-2.23.2-36.el7.ppc64le libvorbis-1.3.3-8.el7.ppc64le libxcb-1.12-1.el7.ppc64le lzo-2.06-8.el7.ppc64le nettle-2.7.1-8.el7.ppc64le nspr-4.13.1-1.0.el7.ppc64le nss-3.28.3-4.el7.ppc64le nss-softokn-freebl-3.28.3-2.el7.ppc64le nss-util-3.28.3-3.el7.ppc64le numactl-libs-2.0.9-6.el7_2.ppc64le openldap-2.4.44-3.el7.ppc64le openssl-libs-1.0.2k-4.el7.ppc64le p11-kit-0.23.5-1.el7.ppc64le pcre-8.32-17.el7.ppc64le pixman-0.34.0-1.el7.ppc64le pulseaudio-libs-10.0-3.el7.ppc64le snappy-1.1.0-3.el7.ppc64le systemd-libs-219-32.el7.ppc64le tcp_wrappers-libs-7.6-77.el7.ppc64le xz-libs-5.2.2-1.el7.ppc64le zlib-1.2.7-17.el7.ppc64le
(gdb) bt
#0 0x00003fffb6f2edc8 in raise () from /lib64/libc.so.6
#1 0x00003fffb6f30f4c in abort () from /lib64/libc.so.6
#2 0x00003fffb6f24b44 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00003fffb6f24c34 in __assert_fail () from /lib64/libc.so.6
#4 0x000000005b51dc38 in numa_get_node_for_cpu (idx=<optimized out>) at /usr/src/debug/qemu-2.9.0/numa.c:580
#5 0x000000005b5a8e68 in spapr_cpu_core_realize (dev=<optimized out>, errp=0x3fffffffc9e0) at /usr/src/debug/qemu-2.9.0/hw/ppc/spapr_cpu_core.c:183
#6 0x000000005b6ede90 in device_set_realized (obj=<optimized out>, value=<optimized out>, errp=0x3fffffffcc00) at hw/core/qdev.c:939
#7 0x000000005b7b6f00 in property_set_bool (obj=0x5c82c580, v=<optimized out>, name=<optimized out>, opaque=0x5ddd0b50, errp=0x3fffffffcc00) at qom/object.c:1860
#8 0x000000005b7b9888 in object_property_set (obj=0x5c82c580, v=0x5c9d09c0, name=0x5b8f3710 "realized", errp=0x3fffffffcc00) at qom/object.c:1094
#9 0x000000005b7bcb0c in object_property_set_qobject (obj=0x5c82c580, value=<optimized out>, name=<optimized out>, errp=<optimized out>) at qom/qom-qobject.c:27
#10 0x000000005b7b9b4c in object_property_set_bool (obj=0x5c82c580, value=<optimized out>, name=<optimized out>, errp=<optimized out>) at qom/object.c:1163
#11 0x000000005b69eacc in qdev_device_add (opts=0x5c7e21c0, errp=0x3fffffffcd40) at qdev-monitor.c:623
#12 0x000000005b69f550 in qmp_device_add (qdict=<optimized out>, ret_data=<optimized out>, errp=0x3fffffffcdd8) at qdev-monitor.c:800
#13 0x000000005b8ab9d4 in do_qmp_dispatch (errp=0x3fffffffcdd0, request=<optimized out>, cmds=0x5bb1d910 <qmp_commands>) at qapi/qmp-dispatch.c:104
#14 qmp_dispatch (cmds=0x5bb1d910 <qmp_commands>, request=<optimized out>) at qapi/qmp-dispatch.c:131
#15 0x000000005b50f934 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.9.0/monitor.c:3729
#16 0x000000005b8b3be0 in json_message_process_token (lexer=0x5c843888, input=0x5c8c0320, type=<optimized out>, x=<optimized out>, y=<optimized out>) at qobject/json-streamer.c:105
#17 0x000000005b8dc8f8 in json_lexer_feed_char (lexer=0x5c843888, ch=<optimized out>, flush=false) at qobject/json-lexer.c:319
#18 0x000000005b8dca34 in json_lexer_feed (lexer=0x5c843888, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369
#19 0x000000005b8b3d3c in json_message_parser_feed (parser=<error reading variable: value has been optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124
#20 0x000000005b50d9f4 in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-2.9.0/monitor.c:3772
#21 0x000000005b83de1c in qemu_chr_be_write_impl (len=<optimized out>, buf=<optimized out>, s=<optimized out>) at chardev/char.c:284
#22 qemu_chr_be_write (s=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/char.c:296
#23 0x000000005b847868 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at chardev/char-socket.c:411
#24 0x000000005b85be44 in qio_channel_fd_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at io/channel-watch.c:84
#25 0x00003fffb7473ab0 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#26 0x000000005b8bc224 in glib_pollfds_poll () at util/main-loop.c:213
#27 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:258
#28 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:506
#29 0x000000005b4abee8 in main_loop () at vl.c:1898
#30 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4720
(gdb)
Expected results:
Since nr_threads does not equal the value "2" set on the cli, the hotplug should fail. However, it should not have any negative side effect, such as a coredump.
Additional info:
The guest was also attached with "-numa node"
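One plausible reading of the assertion `idx < max_cpus' in the backtrace can be sketched in Python. The sketch assumes that each thread of a hotplugged core gets the CPU index core-id + thread offset, which this report does not state, and that max_cpus is the maxcpus=4 value from the cli. The function names are hypothetical and this is a model of the arithmetic only, not QEMU code.

```python
def thread_indexes(core_id, nr_threads):
    """CPU indexes a core would occupy, ASSUMING index = core-id + thread offset."""
    return [core_id + i for i in range(nr_threads)]


def first_out_of_range(core_id, nr_threads, max_cpus):
    """Return the first index that violates idx < max_cpus, or None if all fit."""
    for idx in thread_indexes(core_id, nr_threads):
        if not idx < max_cpus:
            return idx
    return None


# Values from the reproducer: core-id 2, maxcpus=4, and nr-threads 3, 2 or 1.
for nr_threads in (3, 2, 1):
    print(nr_threads, thread_indexes(2, nr_threads), first_out_of_range(2, nr_threads, 4))
```

Under that assumption, nr-threads = 3 on core-id 2 reaches index 4, which is not below maxcpus=4, while nr-threads = 1 or 2 stays in range. That would be consistent with the abort for 3 and the successful hotplug for 1 reported in this bug.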