Bug 1902691
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | virDomainSave a VM with an NVDIMM device is very slow | | |
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Milan Zamazal <mzamazal> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| libvirt sub component: | General | QA Contact: | liang cong <lcong> |
| Status: | CLOSED WONTFIX | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | ailan, dgilbert, eric.auger, imammedo, jinqi, jinzhao, jsuchane, juzhang, lmen, smitterl, virt-maint, xuzhang, yalzhang, yuhuang |
| Version: | 9.1 | Keywords: | Triaged |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1923905 (view as bug list) | Environment: | |
| Last Closed: | 2022-10-19 03:29:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1923905 | | |
| Bug Blocks: | 1897906 | | |
Description
Milan Zamazal, 2020-11-30 12:39:47 UTC
Is there any idea how to proceed with the bug? Is there a chance to get it fixed, or should RHV document that suspending a VM doesn't work when an NVDIMM is present?

Assigned to Amnon for initial triage per BZ process, based on the age of the bug (created or assigned to virt-maint without triage).

Suspend on bare QEMU (the "stop" HMP command) is instantaneous. In the comment 1 case, I'd guess libvirt tries to save RAM, which at the moment includes the NVDIMM, hence the long execution time. We probably should not be saving address ranges belonging to the NVDIMM. Moving the BZ to libvirt for now to discuss how to deal with it; the BZ could then be cloned to qemu to implement the QEMU part of it.

Version:
libvirt-daemon-6.6.0-11.module+el8.3.1+9196+74a80ca4.x86_64
qemu-kvm-5.1.0-17.module+el8.3.1+9213+7ace09c3.x86_64
kernel version: 4.18.0-240.10.1.el8_3.x86_64

I tried to use a machine with CPU "Intel Purley 4s (Lightning Ridge), CPU: CascadeLake B0 QS, (8) Optane 512 DIMMs" to do the test.

1. Start the domain with the configuration below:

```xml
<maxMemory slots='16' unit='KiB'>1048576000</maxMemory>
<memory unit='KiB'>1703936</memory>
<currentMemory unit='KiB'>1572864</currentMemory>
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
...
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>qemu64</model>
  <feature policy='require' name='x2apic'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='require' name='lahf_lm'/>
  <feature policy='disable' name='svm'/>
  <numa>
    <cell id='0' cpus='0-1' memory='524288' unit='KiB'/>
    <cell id='1' cpus='2-3' memory='524288' unit='KiB'/>
  </numa>
</cpu>
...
<memory model='nvdimm' access='shared'>
  <source>
    <path>/dev/pmem0</path>
    <alignsize unit='KiB'>2048</alignsize>
    <pmem/>
  </source>
  <target>
    <size unit='KiB'>131072</size>
    <node>0</node>
    <label>
      <size unit='KiB'>128</size>
    </label>
  </target>
  <alias name='ua-0b946fd6-9a90-4882-a23c-2d027965e8cd'/>
  <address type='dimm' slot='0'/>
</memory>
</devices>
```

2. Suspend the domain:

```
# virsh suspend vm1
Domain vm1 suspended
```

3. Resume the domain:

```
# virsh resume vm1
Domain vm1 resumed
```

The scenario works in a pure libvirt environment.

Jing, do you mean suspending the VM is reasonably fast in your environment? Could it be because you use hugepages?

Yes, the VM can suspend and resume quickly in my test. I did more testing without hugepages and got the same result.

Ah, I missed that your target NVDIMM device size is small, only 128 MB; this is indeed fast. Could you try it with a larger device? The problem with a hanging suspend was observed when using an NVDIMM device of 256 GB size.

Saving a VM's RAM to a file is closest to offline migration.

Adding David to CC to look at the issue from the QEMU side. Michal, can you please look into this from the libvirt side? Thanks.

(In reply to Milan Zamazal from comment #0)
>
> it can take a very long time to get it suspended via virDomainSuspend
> libvirt call. For instance, suspending a VM with the 256 GB NVDIMM above,
> backed by a hardware NVDIMM device, doesn't finish within 2 hours.
>
> Version-Release number of selected component (if applicable):
>
> libvirt-daemon-6.6.0-6.module+el8.3.0+8125+aefcf088.x86_64
> qemu-kvm-5.1.0-13.module+el8.3.0+8382+afc3bbea.x86_64
>
> How reproducible:
>
> 100%
>
> Steps to Reproduce:
> 1. Start a VM with an NVDIMM device.
> 2. Suspend the VM.

Milan, are you sure about virDomainSuspend()? Because all it does is stop the vCPUs by issuing "stop" on the monitor (plus saving new internal state, but that's not connected to NVDIMMs in any way). Could it be virDomainSave() or virDomainManagedSave() that you had in mind?
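For reference, the two APIs being conflated here behave very differently. A minimal sketch using the libvirt Python bindings, assuming a running guest on qemu:///system; the domain name "vm1" and the save path are placeholders:

```python
import libvirt

# Connect to the local QEMU/KVM driver.
conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("vm1")  # placeholder domain name

# virDomainSuspend(): only issues "stop" on the monitor to pause vCPUs.
# No guest RAM is written anywhere, so it returns almost instantly even
# with a large NVDIMM attached.
dom.suspend()
dom.resume()

# virDomainSave(): streams the entire guest RAM image (which currently
# includes NVDIMM-backed address ranges) into the given file and shuts
# the domain off, so its runtime grows with the NVDIMM size.
dom.save("/var/lib/libvirt/save/vm1.save")  # placeholder path

conn.close()
```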
(In reply to Milan Zamazal from comment #7)
> Ah, I missed that your target NVDIMM device size is small, only 128 MB;
> this is indeed fast. Could you try it with a larger device? The problem
> with a hanging suspend was observed when using an NVDIMM device of 256 GB
> size.

```xml
<memory model='nvdimm' access='shared'>
  <source>
    <path>/dev/dax0.0</path>
    <alignsize unit='KiB'>2048</alignsize>
  </source>
  <target>
    <size unit='KiB'>262144000</size>
    <node>0</node>
    <label>
      <size unit='KiB'>128</size>
    </label>
  </target>
  <address type='dimm' slot='0'/>
</memory>
```

In the guest, I occupied space in /dev/pmem0; df shows 61% used:

```
/dev/pmem0   262013956 159466148 102547808  61% /mnt
```

```
# virsh suspend rhel8
Domain rhel8 suspended

# virsh resume rhel8
Domain rhel8 resumed
```

It's still quite quick to suspend/resume it.

(In reply to Jing Qi from comment #11)
>
> It's still quite quick to suspend/resume it.

Yes, because 'virsh suspend' just stops vCPUs; there is no memory saving and thus it's almost instant. I suspect that Milan might have had a different API in mind. Milan?

(In reply to Michal Privoznik from comment #12)
> (In reply to Jing Qi from comment #11)
> >
> > It's still quite quick to suspend/resume it.
>
> Yes, because 'virsh suspend' just stops vCPUs; there is no memory saving
> and thus it's almost instant. I suspect that Milan might have had a
> different API in mind. Milan?

'Memory saving' was my assumption, and we may need to come up with some sort of new command, or an option to an existing one, to skip NVDIMM regions. Of course, if one were to move such a saved VM to another host, one would also have to move the corresponding NVDIMM content manually (just like with any other storage).

Oh yes, I indeed meant the virDomainSave() call. Sorry for the confusion.

(In reply to Igor Mammedov from comment #13)
> Of course, if one were to move such a saved VM to another host, one would
> also have to move the corresponding NVDIMM content manually (just like
> with any other storage).

Yes, we currently pin VMs with NVDIMMs to particular NVDIMM devices and the corresponding hosts, and they can't be migrated elsewhere.

I'm not really sure what the right thing to do here is. On one hand, an NVDIMM is mapped into guest memory, and thus when saving the guest memory (for later use) NVDIMMs should be saved with it. On the other hand, NVDIMMs are persistent, so we might get away with offloading this responsibility to users/mgmt apps. Of course, things will go terribly wrong when users don't save and restore the NVDIMM themselves.

In theory, one could save a VM, use the attached NVDIMM for another purpose afterwards and then restore the original VM. That would break, of course, but would anybody do the same with a normal hard drive attached to a VM? If such a scenario might make sense with an NVDIMM, would adding a corresponding virDomainSave() flag to not save NVDIMM memory be a solution?

(In reply to Milan Zamazal from comment #17)
> In theory, one could save a VM, use the attached NVDIMM for another purpose
> afterwards and then restore the original VM. That would break, of course,
> but would anybody do the same with a normal hard drive attached to a VM?
> If such a scenario might make sense with an NVDIMM, would adding a
> corresponding virDomainSave() flag to not save NVDIMM memory be a solution?

Yes, that could work. Libvirt would then set another flag for 'savevm' that would make it skip all NVDIMMs. Hopefully, no one will need a more fine-grained approach (choosing per NVDIMM whether to save it or not). Igor, would this work for qemu?

(In reply to Michal Privoznik from comment #18)
> Yes, that could work. Libvirt would then set another flag for 'savevm' that
> would make it skip all NVDIMMs. Hopefully, no one will need a more
> fine-grained approach (choosing per NVDIMM whether to save it or not).
> Igor, would this work for qemu?

It should work fine for QEMU.

I just realized that when I was mentioning 'savevm' I really meant the 'migrate' command, because virDomainSave() uses migration into an FD (which is just an opened, user-provided path).
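For illustration only, the flag discussed in comments #17 and #18 would most likely have surfaced through the existing virDomainSaveFlags() entry point. The sketch below shows that shape with the libvirt Python bindings; the VIR_DOMAIN_SAVE_SKIP_NVDIMM constant is purely hypothetical (neither libvirt nor QEMU ever gained this capability, and the bug was ultimately closed without it):

```python
import libvirt

# Hypothetical constant: libvirt does NOT define this flag; it only
# illustrates the API shape proposed in comments #17 and #18.
VIR_DOMAIN_SAVE_SKIP_NVDIMM = 1 << 4  # made-up value

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("vm1")  # placeholder domain name

# virDomainSaveFlags() is the real flag-taking variant of virDomainSave();
# the second argument is an optional replacement domain XML (None here).
# With the hypothetical flag, libvirt would tell QEMU's migration stream
# to skip NVDIMM-backed address ranges.
dom.saveFlags("/var/lib/libvirt/save/vm1.save", None,
              VIR_DOMAIN_SAVE_SKIP_NVDIMM)

conn.close()
```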
Bulk update - Move RHEL-AV bugs to RHEL.

This is now targeted to RHEL 9, so it won't fix the problem in RHV. But an eventual fix might still be useful if another product would like to support both NVDIMM and suspending a VM.

The fix for this bug depends on the QEMU bug 1923905; confirmed with Michal that this bug will be fixed once the dependent bug is fixed. Setting the stale date of this bug to 2022-06-30.

Setting the stale date to match the QEMU bug. NB: Removed RHV as a dependent product since this is a RHEL 9 issue.

Closing, as the dependent QEMU bug is closed. For RHV, this feature is not an issue any more (refer to bug 1897906#c9).
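Since the bug was closed without a fix, the only remaining option is the one discussed above: offloading the responsibility to users and management applications, i.e. detecting NVDIMM devices before attempting virDomainSave(). A hypothetical sketch in the libvirt Python bindings; the has_nvdimm helper is illustrative and not part of libvirt or any product:

```python
import xml.etree.ElementTree as ET
import libvirt

def has_nvdimm(dom: libvirt.virDomain) -> bool:
    """Return True if the domain XML defines at least one NVDIMM device.

    Hypothetical helper: it only shows how a management application could
    decide to skip virDomainSave() for NVDIMM-backed guests.
    """
    tree = ET.fromstring(dom.XMLDesc(0))
    return bool(tree.findall(".//devices/memory[@model='nvdimm']"))

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("vm1")  # placeholder domain name

if has_nvdimm(dom):
    # Saving would stream the whole NVDIMM to disk and can take hours.
    print("Domain has an NVDIMM; skipping virDomainSave().")
else:
    dom.save("/var/lib/libvirt/save/vm1.save")  # placeholder path

conn.close()
```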