Description of problem:
After creating a VM using oVirt, 256MB DIMMs were hot-plugged 15 times (16 DIMMs in total). Memory hot-unplug does not succeed. No error is seen on the oVirt side, and the guest's CPU load moves to 100%:

  PID USER  PR  NI  VIRT  RES  SHR S  %CPU  %MEM     TIME+ COMMAND
   64 root  20   0     0    0    0 D  98.3   0.0   7106:54 kworker/u32:1+kacpi_hotplug
10847 root  20   0     0    0    0 R   1.7   0.0   0:01.13 kworker/0:1-mm_percpu_wq

Version-Release number of selected component (if applicable):
Guest:
Red Hat Enterprise Linux release 8.0 Beta (Ootpa)
qemu-guest-agent-2.12.0-41.el8+2104+3e32e6f8.x86_64
Host:
Red Hat Enterprise Linux Server release 7.6 (Maipo)
vdsm-4.30.6-1.el7ev.x86_64
libvirt-4.5.0-10.el7_6.4.x86_64
Engine:
Red Hat Enterprise Linux Server release 7.6 (Maipo)
ovirt-engine-4.3.0-0.8.rc2.el7.noarch

How reproducible:
100%; reproduced more than once.

Steps to Reproduce:
1. Create a VM (1GB memory, 16 max memory), RHEL8 image, BIOS.
2. Hot-plug 256MB of memory 15 times.
3. Start removing the DIMMs from the VM devices.

Actual results:
Not all the DIMMs were removed. In the UI, all the DIMMs remained.

# free -m
              total        used        free      shared  buff/cache   available
Mem:           4593         355        3978          14         260        4150
Swap:          1023           0        1023

Starting with 1GB of memory, plugging 3840MB (15 x 256MB) results in 4864MB. After unplugging the memory, the total stays the same.
The VM's CPU load moved to 100%, caused by the kacpi_hotplug kworker.

Expected results:
The DIMMs are removed correctly, the VM's memory is reduced, the oVirt UI reports the VM devices correctly, and the VM's CPU load does not get stuck at 100%.
Created attachment 1529010: logs
Hi Liran,

the BZ Component field is "edk2", but comment #0 says, "Create a VM(1GB memory, 16 max memory), RHEL8 image, *BIOS*." (emphasis mine).

Also, regarding "dmesg.log" from the attachment ("logs.tar.xz"), it indeed terminates with

[ 1049.684974] memory memory35: Offline failed.

however, the dmesg doesn't indicate UEFI firmware (no EFI memmap, no references to EFI etc). To me it looks like a SeaBIOS VM. Should we correct the BZ Component?

Thanks.
In addition, "mem_test.log-20190210" contains no references to "pflash" or "OVMF" -- I'd expect those on the QEMU command line, for setting Component=edk2. Thanks.
(In reply to Laszlo Ersek from comment #2)
> Hi Liran,
>
> the BZ Component field is "edk2", but comment #0 says, "Create a VM(1GB
> memory, 16 max memory), RHEL8 image, *BIOS*." (emphasis mine).
>
> Also, regarding "dmesg.log" from the attachment ("logs.tar.xz"), it indeed
> terminates with
>
> [ 1049.684974] memory memory35: Offline failed.
>
> however, the dmesg doesn't indicate UEFI firmware (no EFI memmap, no
> references to EFI etc). To me it looks like a SeaBIOS VM. Should we correct
> the BZ Component? Thanks.

Sure, moved to seabios.
Thanks.
This is the first memory hot-unplug BZ in RHEL8. As far as I'm aware, it is still impossible to unplug memory reliably upstream.
Some work was done in that area (such as removing time-outs and continuing to retry the removal, which could improve the likelihood of removal and explains the CPU load).

RHEL8 is probably the same as RHEL7 in the memory hot-unplug area (Baoquan He backported many fixes from upstream into RHEL7).

The long story is in the relevant RHEL7 BZs (which should probably be cloned to RHEL8):
Bug 1245892 - hot-unhotplug guest memory fail most of the time because it is in use
Bug 1258312 - When un-hotplug memory failed, libvirt gives user a wrong message

Reassigning the bug back to kernel and CCing the people involved in fixing them.
I agree with Igor. We also have bug 1654978 for RHEL8; this one might be a duplicate.
Hi Igor,

(In reply to Igor Mammedov from comment #7)
> It's first hot-unplug BZ in RHEL8, as far as I'm aware it is still
> impossible to unplug memory reliably upstream.
> There were some work done in that area (like removing time-outs and continue
> attempting to remove memory, which could improve likelihood of removal and
> explains CPU load).
>
> RHEL8 is probably the same as RHEL7 in memory hot-unplug area (Baoquan He
> backported many fixes from upstream into RHEL7)

RHEL8 is probably different from RHEL7 in memory hotplug; at least on x86_64, we have got good test results. In RHEL7, since the virt team didn't apply the necessary udev rule, a memory block may not be onlined as online_movable, which is necessary for memory hotplug in RHEL7 because of a memory defect. So as far as I know, RHEL8 has a different status from RHEL7.

As for the upstream kernel, it behaves very well with memory hotplug on x86_64. There seems to be a regression on the ppc platform, which is under discussion upstream.

Thanks
Baoquan
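The udev rule mentioned above is not quoted in this thread. As a rough illustration only, the following Python sketch performs the sysfs write that such a rule would trigger: it brings an offline memory block online with the kernel's `online_movable` state value. The function name and the example block path are hypothetical; on a real guest the write requires root.

```python
from pathlib import Path


def online_as_movable(block_dir: str) -> bool:
    """Online one memory block as movable if it is currently offline.

    block_dir is a directory such as /sys/devices/system/memory/memory38
    (the block number is only an example). Returns True if a write was made.
    """
    state_file = Path(block_dir) / "state"
    if state_file.read_text().strip() != "offline":
        # Already online (or going offline): leave it untouched.
        return False
    # Writing "online_movable" asks the kernel to place the block in ZONE_MOVABLE.
    state_file.write_text("online_movable")
    return True
```

Blocks that end up in the movable zone hold only migratable pages, which is what makes a later offline request likely to succeed.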
Created attachment 1534440: RHEL8 qemu guest memory hotplug steps
Hello,

Could you share the detailed steps to reproduce? I tried memory hotplug on a RHEL8 QEMU guest and it worked well. I attached the steps I ran in Comment 10.

Before memory hot-remove:
              total        used        free      shared  buff/cache   available
Mem:        9048584      248312     8445376        8768      354896     8648988
Swap:       2097148           0     2097148

After memory hot-remove:
              total        used        free      shared  buff/cache   available
Mem:        8000008      246520     7398584        8768      354904     7643260
Swap:       2097148           0     2097148

Thanks,
Masa
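As a quick check on the two `free` outputs above, this small Python sketch computes how much memory disappeared from the guest. It assumes the figures are in KiB, which is the default unit of `free`; the function name is made up for illustration.

```python
def removed_kib(total_before_kib: int, total_after_kib: int) -> int:
    """Difference between the 'total' column before and after hot-remove."""
    return total_before_kib - total_after_kib


delta_kib = removed_kib(9048584, 8000008)
print(delta_kib)           # 1048576
print(delta_kib // 1024)   # 1024 (MiB)
```

Under that assumption the drop is exactly 1 GiB, which is consistent with a whole hot-removed memory device and not with ordinary fluctuation in usage.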
Please refer to this comment:
https://bugzilla.redhat.com/show_bug.cgi?id=1654978#c26

So I think this is not a bug; I suggest closing it as NOTABUG.

Thanks
Baoquan
Per the comment below, I would like to close this bug as NOTABUG.
https://bugzilla.redhat.com/show_bug.cgi?id=1654978#c23

Please reopen it if any concern is raised.

Thanks
Baoquan
I'm re-opening this bug.

I retested using RHEL7.6 hosts and a RHEL8 guest (kernel-4.18.0-80.el8.x86_64).
I hot-plugged 5 DIMMs of 256MB each.
When I try to unplug each DIMM, I see the following on the guest VM:

With a balloon device on the VM:
"Offlined pages 32768
memory memory38: Offline failed."

Without a balloon device on the VM:
"memory memory38: Offline failed."

The memory block number differs for each DIMM.

On the RHV side the operation is reported as successful, but on the guest it clearly is not, and the DIMM also stays in the VM devices tab.

This looks like what Igor mentioned in comment #7.
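To see which memory blocks the guest kernel considers online, and in which zone, the per-block attributes under `/sys/devices/system/memory` can be listed. The Python sketch below is one way to do that; the helper names are illustrative, and it assumes the `state` and `valid_zones` attributes are present in the guest kernel (missing attributes are shown as `?`).

```python
from pathlib import Path


def read_attr(block_dir: Path, name: str) -> str:
    """Return a sysfs attribute's text, or '?' if it is missing or unreadable."""
    try:
        return (block_dir / name).read_text().strip()
    except OSError:
        return "?"


def memory_blocks(root: str = "/sys/devices/system/memory") -> list:
    """List memory blocks under root, sorted by block number."""
    blocks = []
    for block_dir in Path(root).glob("memory[0-9]*"):
        blocks.append({
            "name": block_dir.name,
            "state": read_attr(block_dir, "state"),
            "valid_zones": read_attr(block_dir, "valid_zones"),
        })
    return sorted(blocks, key=lambda b: int(b["name"][len("memory"):]))


if __name__ == "__main__":
    for b in memory_blocks():
        print(f'{b["name"]}: state={b["state"]} zones={b["valid_zones"]}')
```

A block such as memory38 that is online in a non-movable zone may contain pages the kernel cannot migrate, which would be consistent with the "Offline failed" messages above.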
(In reply to Liran Rotenberg from comment #14)
> I'm re-opening this bug.
>
> I retested again, using RHEL7.6 hosts and RHEL8 guest
> (kernel-4.18.0-80.el8.x86_64)

Hi Liran, did you add 'movable_node' to the guest kernel command line, as suggested by Baoquan in bug 1654978? It seems to work for me. Thanks.

> I hot-plugged 5 DIMMs, each of 256MB.
> When I try to unplug each DIMM I see on the guest VM:
>
> With balloon device on the VM:
> "Offlined pages 32768
> memory memory38: Offline failed."
>
> Without balloon device on the VM:
> "memory memory38: Offline failed."
>
> On each DIMM it's another memory number.
>
> On RHV side, the operation is successful, but on the guest it's clearly not,
> the DIMM also stays in the VM devices tab.
>
> This looks like what Igor mentioned in comment #7.
Hi Yumei,

I just tested with 'movable_node' added to the guest kernel command line, and that solved the issue here.
I'm moving this bug to documentation, in case any user wishes to use el8 guests.

Thanks!
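To confirm that a guest was actually booted with the parameter, the running kernel's command line can be inspected. The Python sketch below does this by reading `/proc/cmdline`; the function names are illustrative and not taken from this bug. On RHEL guests the parameter is commonly added with `grubby --update-kernel=ALL --args=movable_node` followed by a reboot, but that command is an assumption here and was not stated in the thread.

```python
def has_kernel_param(cmdline: str, name: str) -> bool:
    """True if name appears as a bare flag or as name=value on the command line."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Return the running kernel's command line, or '' if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("movable_node set:", has_kernel_param(read_cmdline(), "movable_node"))
```

Matching whole tokens avoids false positives from parameters that merely contain the same substring.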
Published as known issue in RHV 4.4.2 release notes.