Red Hat Bugzilla – Bug 1528173
Hot-unplug memory during booting early stage induced qemu-kvm coredump
Last modified: 2018-04-10 20:57:52 EDT
Description of problem:
Hot-unplugging memory from the guest during the early boot stage induced a qemu-kvm core dump.

Version-Release number of selected component (if applicable):
kernel-3.10.0-823.el7.ppc64le (host and guest)
qemu-kvm-rhev-2.10.0-13.el7.ppc64le

How reproducible:
1/3

Steps to Reproduce:
1. Boot a guest with the following command line:

/usr/libexec/qemu-kvm -name guest=nrs,debug-threads=on \
 -machine pseries,accel=kvm,usb=off,dump-guest-core=off \
 -m size=8388608k,slots=256,maxmem=419430400k \
 -realtime mlock=off \
 -smp 4,sockets=4,cores=1,threads=1 \
 -numa node,nodeid=0,cpus=0-1,mem=4096 \
 -numa node,nodeid=1,cpus=2-3,mem=4096 \
 -uuid d7987973-2467-43ff-b8d2-acefc6ac59e5 \
 -display none -no-user-config -nodefaults \
 -chardev socket,id=charmonitor,path=/tmp/qmp,server,nowait \
 -mon chardev=charmonitor,id=monitor,mode=control \
 -rtc base=utc -boot strict=on \
 -device qemu-xhci,id=usb,bus=pci.0,addr=0x3 \
 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x2 \
 -drive file=rhel75-ppc64le-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 \
 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,id=hostnet0,vhost=on \
 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:8a:8b,bus=pci.0,addr=0x1 \
 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
 -msg timestamp=on -monitor stdio \
 -chardev socket,id=serial_id_serial0,path=/tmp/S,server,nowait \
 -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 \
 -monitor unix:/tmp/monitor3,server,nowait

2. During the early boot stage, hot-plug a DIMM and repeatedly hot-unplug it from the monitor - *it wasn't always reproducible, about 30%*:

(qemu) object_add memory-backend-ram,id=mem1,size=10G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
(qemu) device_del dimm1
(qemu) ...
(qemu) device_del dimm1

Actual results:

(qemu) device_del dimm1
** ERROR:/builddir/build/BUILD/qemu-2.10.0/hw/ppc/spapr_drc.c:417:spapr_drc_detach: assertion failed: (drc->dev)

Program received signal SIGABRT, Aborted.
0x00003fffb700fa70 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00003fffb700fa70 in raise () from /lib64/libc.so.6
#1  0x00003fffb7011dec in abort () from /lib64/libc.so.6
#2  0x00003fffb758b964 in g_assertion_message () from /lib64/libglib-2.0.so.0
#3  0x00003fffb758ba6c in g_assertion_message_expr () from /lib64/libglib-2.0.so.0
#4  0x0000000020b44a7c in spapr_drc_detach (drc=0x21dd0b40) at /usr/src/debug/qemu-2.10.0/hw/ppc/spapr_drc.c:417
#5  0x0000000020b2c1d8 in spapr_memory_unplug_request (errp=0x3fffffffcfc0, dev=<optimized out>, hotplug_dev=0x21fa0000) at /usr/src/debug/qemu-2.10.0/hw/ppc/spapr.c:3115
#6  spapr_machine_device_unplug_request (hotplug_dev=0x21fa0000, dev=<optimized out>, errp=0x3fffffffcfc0) at /usr/src/debug/qemu-2.10.0/hw/ppc/spapr.c:3388
#7  0x0000000020c91770 in hotplug_handler_unplug_request (plug_handler=0x21fa0000, plugged_dev=0x21e72250, errp=0x3fffffffcfc0) at hw/core/hotplug.c:45
#8  0x0000000020c414b4 in qdev_unplug (dev=0x21e72250, errp=0x3fffffffcfc0) at qdev-monitor.c:877
#9  0x0000000020c65fec in hmp_device_del (mon=<optimized out>, qdict=<optimized out>) at hmp.c:1960
#10 0x0000000020aaf758 in handle_hmp_command (mon=0x21e92f40, cmdline=0x223a000c "dimm1") at /usr/src/debug/qemu-2.10.0/monitor.c:3151
#11 0x0000000020ab18ec in monitor_command_cb (opaque=0x21e92f40, cmdline=<optimized out>, readline_opaque=<optimized out>) at /usr/src/debug/qemu-2.10.0/monitor.c:3954
#12 0x0000000020e8ad4c in readline_handle_byte (rs=0x223a0000, ch=<optimized out>) at util/readline.c:393
#13 0x0000000020aafa28 in monitor_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-2.10.0/monitor.c:3937
#14 0x0000000020ded29c in qemu_chr_be_write_impl (len=<optimized out>, buf=<optimized out>, s=<optimized out>) at chardev/char.c:167
#15 qemu_chr_be_write (s=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/char.c:179
#16 0x0000000020def538 in fd_chr_read (chan=0x21f50600, cond=<optimized out>, opaque=0x21da6ea0) at chardev/char-fd.c:66
#17 0x0000000020e0c8b4 in qio_channel_fd_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at io/channel-watch.c:84
#18 0x00003fffb7554230 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#19 0x0000000020e70aa4 in glib_pollfds_poll () at util/main-loop.c:213
#20 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
#21 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:515
#22 0x0000000020a48868 in main_loop () at vl.c:1917
#23 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4805
(gdb)

From the console:

Red Hat Enterprise Linux Server 7.5 Beta (Maipo)
Kernel 3.10.0-823.el7.ppc64le on an ppc64le

dhcp47-214 login: [    7.294099] pseries-hotplug-mem: Attempting to hot-add 40 LMB(s) at index 80000020
[    7.294345] lpar: Attempting to resize HPT to shift 24
[    7.481301] lpar: Hash collision while resizing HPT
[    7.481423] Unable to resize hash page table to target order 24: -28
[... the same "Attempting to resize HPT" / "Hash collision" / "Unable to resize" triple repeats from t=7.49 through t=10.27 ...]
[   10.277027] pseries-hotplug-mem: Memory indexed-count-add failed, removing any added LMBs
[   10.283714] Offlined Pages 4096
[... the failing HPT resize attempts continue, interleaved with an "Offlined Pages 4096" line per removed LMB, through t=11.61 ...]
[   11.612329] Offlined Pages 4096
[   11.623614] lpar: Attempting to resize HPT to shift 24

Expected results:
There is no coredump.

Additional info:
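For repeated runs, the manual monitor steps above can be scripted against the second HMP socket opened by `-monitor unix:/tmp/monitor3,server,nowait`. A minimal, untested sketch (the socket path comes from the command line above; the fixed delay between commands is only a guess at landing inside the early boot window):

```python
import socket
import time

# HMP commands taken from the reproduction steps in this report.
HMP_COMMANDS = [
    "object_add memory-backend-ram,id=mem1,size=10G",
    "device_add pc-dimm,id=dimm1,memdev=mem1",
    "device_del dimm1",
    "device_del dimm1",  # the repeated device_del is what trips the assertion
]

def send_hmp(sock_path="/tmp/monitor3", commands=HMP_COMMANDS, delay=0.5):
    """Send HMP commands over the monitor UNIX socket, one per line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        for cmd in commands:
            s.sendall(cmd.encode() + b"\n")
            time.sleep(delay)  # give QEMU time to process each command

# send_hmp()  # run against a live guest during early boot
```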
Serhii,

Can you take a look at this one please. I think it is probably a race between multiple unplug attempts from the qemu side, if delays on the guest side mean the early attempts don't complete quickly. It might even be the qemu-side equivalent of one of the bugs you tracked down on the host kernel side.
We definitely need commit 2a129767ebb1 (hw/ppc/spapr.c: abort unplug_request if previous unplug isn't done) from master. Applied it and tested using the steps from comment 0; the issue is no longer reproducible. Unfortunately I have some issues with brew job submission: investigating.

Also, while analyzing the case, an additional question: is the following race still possible even after picking the commit from upstream?

hw/ppc/spapr.c:
---------------
spapr_memory_unplug_request()
{
    ....
    /* (1) False for the 1st attempt. True for the next ones. */
    if (spapr_pending_dimm_unplugs_find(spapr, dimm))
        return;

    /* (2) Guard against multiple unplug requests. */
    spapr_pending_dimm_unplugs_add();

    for (i = 0; i < nr_lmbs; i++) {
        /* (3) Look up the DRC. */
        drc = spapr_drc_by_id();
        spapr_drc_detach(drc);
    }
}

spapr_lmb_release()
{
    /* Wait for all LMBs */
    if (--ds->nr_lmbs)
        return;
    ....
    /* (4) This reopens (1) while @drc is still visible to
     * spapr_drc_by_id() in (3) due to (5).
     *
     * This allows re-entry into spapr_drc_detach(drc) in (3)
     * while both @dev and @drc may already have been released.
     */
    spapr_pending_dimm_unplugs_remove();
}

hw/ppc/spapr_drc.c:
-------------------
spapr_drc_lmb_class_init()
{
    ....
    drck->release = spapr_lmb_release;
}

spapr_drc_detach()
{
    ....
    drck->release(drc->dev);    /* spapr_lmb_release() */
    ....
    /* (5) This should be done before calling spapr_lmb_release()
     * to hide @drc from spapr_drc_by_id() in (3).
     */
    object_property_del(OBJECT(drc), "device", &error_abort);
    drc->dev = NULL;
}
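The window described in (1)-(5) above can be modeled outside QEMU. The following is a hypothetical Python sketch (none of these class or method names are real QEMU APIs): if the LMB release drops the pending-unplug guard before the DRC has been hidden, a racing device_del re-enters detach on the same connector; hiding the DRC first makes the retry a no-op.

```python
class DRC:
    """Toy stand-in for a sPAPR dynamic reconfiguration connector."""
    def __init__(self, dev):
        self.dev = dev  # plugged device; None once detached

class Machine:
    """Toy model of the pending-unplug guard; names are illustrative only."""
    def __init__(self, hide_first):
        self.hide_first = hide_first  # True = ordering suggested in (5)
        self.pending = set()          # spapr_pending_dimm_unplugs_*
        self.drcs = {}                # what spapr_drc_by_id() would find
        self.detach_calls = 0

    def unplug_request(self, dimm, on_release=lambda: None):
        # (1)+(2): ignore a repeated unplug request for a pending DIMM
        if dimm in self.pending:
            return "ignored"
        self.pending.add(dimm)
        drc = self.drcs.get(dimm)     # (3) spapr_drc_by_id()
        if drc is None:
            return "no-drc"
        self.detach(dimm, drc, on_release)
        return "detached"

    def detach(self, dimm, drc, on_release):
        assert drc.dev is not None    # models the spapr_drc.c:417 assertion
        self.detach_calls += 1
        if self.hide_first:
            # (5) hide the DRC *before* releasing, closing the window
            self.drcs.pop(dimm, None)
            drc.dev = None
            self.lmb_release(dimm, on_release)
        else:
            # guard drops in (4) while the DRC is still visible to (3)
            self.lmb_release(dimm, on_release)
            self.drcs.pop(dimm, None)
            drc.dev = None

    def lmb_release(self, dimm, on_release):
        self.pending.discard(dimm)    # (4) reopens the check in (1)
        on_release()                  # a racing device_del lands here

# A device_del racing with the LMB release:
buggy = Machine(hide_first=False)
buggy.drcs["dimm1"] = DRC("pc-dimm")
buggy.unplug_request("dimm1",
                     on_release=lambda: buggy.unplug_request("dimm1"))
# buggy.detach_calls == 2: re-entrant detach, the crash window

fixed = Machine(hide_first=True)
fixed.drcs["dimm1"] = DRC("pc-dimm")
fixed.unplug_request("dimm1",
                     on_release=lambda: fixed.unplug_request("dimm1"))
# fixed.detach_calls == 1: the retry finds no DRC and backs off
```

In real QEMU the second detach runs against a device that is being torn down, which is where the `(drc->dev)` assertion aborts the process.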
Also, while working on the issue, I consistently see the following traces from the guest kernel (3.10.0-826.el7.ppc64):

[ 0.040090] IOMMU table initialized, virtual merging enabled
[ 0.040141] iommu: Adding device 0000:00:00.0 to group 0
[ 0.040210] iommu: Adding device 0000:00:01.0 to group 0
[ 0.040273] iommu: Adding device 0000:00:02.0 to group 0
[ 0.040337] iommu: Adding device 0000:00:03.0 to group 0
[ 0.040571] Unable to handle kernel paging request for data at address 0x00000100
[ 0.040619] Faulting instruction address: 0xc0000000001159c8
[ 0.040660] Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.040692] SMP NR_CPUS=2048 NUMA pSeries
[ 0.040725] Modules linked in:
[ 0.040760] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-826.el7.ppc64le #1
[ 0.040807] task: c0000001fd7c0000 ti: c0000001fffe8000 task.ti: c0000001fd800000
[ 0.040855] NIP: c0000000001159c8 LR: c0000000001160e4 CTR: 0000000000000000
[ 0.040903] REGS: c0000001fffeb950 TRAP: 0300 Not tainted (3.10.0-826.el7.ppc64le)
[ 0.040950] MSR: 8000000100009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28000042 XER: 00000000
[ 0.041063] CFAR: 000000000000aad0 DAR: 0000000000000100 DSISR: 40000000 SOFTE: 0
GPR00: c0000000001160e4 c0000001fffebbd0 c000000001273f00 0000000000000800
GPR04: 0000000000000000 c0000000fca72e00 0000000000000000 0000000fffffffe1
GPR08: 0000000fffffffe0 0000000000000000 0000000fffffffe0 0000000100001001
GPR12: 0000000000000000 c000000007b80000 c00000000000cb08 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000001 0000000000000002 0000000000000015
GPR24: c0000001fd8585a0 0000000000000800 c0000001fd858400 0000000000000000
GPR28: 0000000000000800 c0000000015f0f98 0000000000000000 c0000000fca72e00
[ 0.041702] NIP [c0000000001159c8] __queue_work+0x68/0x6d0
[ 0.041736] LR [c0000000001160e4] queue_work_on+0xb4/0xc0
[ 0.041768] Call Trace:
[ 0.041786] [c0000001fffebca0] [c0000000001160e4] queue_work_on+0xb4/0xc0
[ 0.041835] [c0000001fffebcd0] [c0000000000a5e90] queue_hotplug_event+0xd0/0x150
[ 0.041891] [c0000001fffebd20] [c0000000000a3620] ras_hotplug_interrupt+0x130/0x150
[ 0.041947] [c0000001fffebdb0] [c0000000001e26d0] __handle_irq_event_percpu+0x90/0x2e0
[ 0.042003] [c0000001fffebe80] [c0000000001e2a18] handle_irq_event+0x68/0xf0
[ 0.042052] [c0000001fffebec0] [c0000000001e8040] handle_fasteoi_irq+0xd0/0x230
[ 0.042108] [c0000001fffebef0] [c0000000001e18d0] generic_handle_irq+0x50/0x80
[ 0.042164] [c0000001fffebf20] [c000000000014a88] __do_irq+0x88/0x190
[ 0.042213] [c0000001fffebf90] [c000000000028e00] call_do_irq+0x14/0x24
[ 0.042261] [c0000001fd8035a0] [c000000000014c24] do_IRQ+0x94/0x110
[ 0.042310] [c0000001fd8035f0] [c000000000002794] hardware_interrupt_common+0x114/0x180
[ 0.042367] --- Exception: 501 at arch_local_irq_restore+0xf0/0x140
[ 0.042367]     LR = arch_local_irq_restore+0xf0/0x140
[ 0.042439] [c0000001fd8038e0] [c0000000001e78f0] irq_startup+0x60/0x100 (unreliable)
[ 0.042496] [c0000001fd803900] [c000000000a0ccac] _raw_spin_unlock_irqrestore+0x3c/0x80
[ 0.042553] [c0000001fd803920] [c0000000001e5558] __setup_irq+0x488/0x6b0
[ 0.042601] [c0000001fd803ac0] [c0000000001e5950] request_threaded_irq+0xf0/0x1f0
[ 0.042657] [c0000001fd803b20] [c0000000000a33e0] request_event_sources_irqs+0x110/0x220
[ 0.042713] [c0000001fd803c30] [c000000000d3aff0] init_ras_IRQ+0x98/0xfc
[ 0.042762] [c0000001fd803c60] [c00000000000c28c] do_one_initcall+0x12c/0x2c0
[ 0.042811] [c0000001fd803cf0] [c000000000d240d4] kernel_init_freeable+0x244/0x324
[ 0.042868] [c0000001fd803db0] [c00000000000cb2c] kernel_init+0x2c/0x220
[ 0.042917] [c0000001fd803e30] [c00000000000a4b8] ret_from_kernel_thread+0x5c/0xa4
[ 0.042972] Instruction dump:
[ 0.042997] faa1ffa8 fac1ffb0 fae1ffb8 f8010010 fb01ffc0 fb41ffd0 fba1ffe8 fbc1fff0
[ 0.043078] f821ff31 892d02a2 2fa90000 40de04c0 <813b0100> 792887e3 40c205fc 7be79364
[ 0.043161] ---[ end trace ed727bb1f359e245 ]---
[ 0.044422]
[ 2.044473] Kernel panic - not syncing: Fatal exception in interrupt

If I hot-add memory before the kernel loads (e.g. while still in grub2), then the machine restarts (not the qemu process) instead of the kernel crashing. Maybe this happens because on the development P8 machine I have a ppc64 (BE) host environment.
Serhii,

1) Please backport 2a129767ebb1 for this BZ ASAP.

2) I'm still analyzing the race you suggest in comment 3 - let's handle that as a different BZ.

3) The kernel crash in comment 4 looks like a different bug, let's handle that separately.

4) "If I hot add memory before kernel loads (e.g. during the grub2 time) then machine restarts (not qemu process) instead of kernel crash."

This is expected - hotplugs before CAS (the guest/qemu negotiation phase) cause a reboot, and the guest picks up the new device on the reboot. It's not ideal, but avoiding it would introduce a whole heap of other complexities.

5) "May be this happens because on development P8 machine I have ppc64 (BE) host environment."

That sounds unlikely to me, but let's continue to investigate after we've fixed the known qemu bug.
Min, is this a regression from RHEL7.4?
(In reply to David Gibson from comment #6)
> Min, is this a regression from RHEL7.4?

David,

In my opinion it isn't a regression, since similar issues could also be found with the following builds:
kernel-3.10.0-693.el7.ppc64le
qemu-kvm-rhev-2.9.0-16.el7_4.1.ppc64le

QE got two types of error with those builds after several tries.

1. qemu-kvm quit directly.

2. (qemu) object_add memory-backend-ram,id=mem1,size=10G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
(qemu) device_del dimm1
Memory hot unplug not supported for this guest
(qemu) device_del dimm1
(qemu) device_del dimm1

Message from syslogd@localhost at Jan 11 01:32:36 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#56 stuck for 22s! [CPU 0/KVM:53310]

Message from syslogd@localhost at Jan 11 01:32:36 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#72 stuck for 22s! [CPU 1/KVM:53311]

Message from syslogd@localhost at Jan 11 01:32:36 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#160 stuck for 22s! [CPU 3/KVM:53313]
Fix included in qemu-kvm-rhev-2.10.0-17.el7
Reproduced on builds:
kernel-3.10.0-823.el7.ppc64le (host and guest)
qemu-kvm-rhev-2.10.0-13.el7.ppc64le

QE re-tested the bug on the following builds:
kernel-3.10.0-827.el7.ppc64le - host
kernel-3.10.0-828.el7.ppc64le - guest
qemu-kvm-rhev-2.10.0-17.el7.ppc64le

For steps please refer to comment 0.

Actual results:
The original issue could not be reproduced, *but* we reproduced bug 1533857 -> see comment 4, and bug 1528178 reported by QE before.

David,
What do you think - is it proper for us to mark this as verified? At the same time, QE uploaded the entire log for comparison with bug 1533857. Any issues please let me know.

Thanks a lot
Min
Created attachment 1382313 [details]
comparinglog1533857

If necessary, please compare it with bug 1533857, thanks a lot.
Reproduced on builds:
kernel-3.10.0-823.el7.ppc64le (host and guest)
qemu-kvm-rhev-2.10.0-13.el7.ppc64le

Verified the bug on the following builds:
kernel-3.10.0-842.el7.ppc64le (both host and guest)
qemu-kvm-rhev-2.10.0-18.el7.ppc64le

For steps please refer to comment 0.

The bug has been fixed. Thanks a lot.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1104