Bug 2134009
| Summary: | memory device hotplug fails after umount and mount hugepage path again | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | liang cong <lcong> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| libvirt sub component: | General | QA Contact: | liang cong <lcong> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | low | | |
| Priority: | low | CC: | jdenemar, lmen, mprivozn, pkrempa, virt-maint, xuzhang, ymankad |
| Version: | 9.1 | Keywords: | Triaged |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-8.9.0-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-05-09 07:27:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | 8.9.0 |
| Embargoed: | | | |
Description (liang cong, 2022-10-12 06:35:23 UTC)
Patches posted on the list: https://listman.redhat.com/archives/libvir-list/2022-October/234822.html

Merged upstream as:

    babcbf2d5c qemu: Create base hugepages path on memory hotplug
    72adf3b717 qemu: Separate out hugepages basedir making

v8.8.0-136-gbabcbf2d5c

Pre-verified on upstream build v8.9.0-rc1-12-gde842f37a1

Verify steps:
1. Allocate two 1G hugepages
   # echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
2. Mount 1G hugepage path
   # mkdir /dev/hugepages1G
   # mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
3. Restart libvirt
   # systemctl restart virtqemud
4. Umount and mount the 1G hugepage path again
   # umount /dev/hugepages1G/
   # mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
5. Start up the guest vm
   # virsh start vm1
   Domain 'vm1' started
6. Prepare a dimm memory device with the following XML
   # cat dimm1G.xml
   <memory model='dimm'>
     <source>
       <pagesize unit='KiB'>1048576</pagesize>
       <nodemask>0-1</nodemask>
     </source>
     <target>
       <size unit='KiB'>1048576</size>
       <node>0</node>
     </target>
   </memory>
7. Hot plug the memory device with the XML from step 6
   # virsh attach-device vm1 dimm1G.xml
   Device attached successfully

Also check the below scenarios:
1. memory backing 2M guest vm start -> mount 1G path -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2. mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm

Tested with these settings: remember_owner=1 or 0, no memory backing, 1G memory backing, 1G hugepage path as /mnt/hugepages1G, memfd memory backing

Hi Michal, I found the below 2 scenarios; both involve umounting the hugepage path. I know this is an abnormal operation, but I found something interesting, please help check whether there is a bug here, thanks.

scenario 1:
1. Allocate two 1G hugepages
   # echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
2.
Mount 1G hugepage path
   # mkdir /dev/hugepages1G
   # mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
3. Define a guest with the below memoryBacking XML:
   <memoryBacking>
     <hugepages>
       <page size='1' unit='G'/>
     </hugepages>
   </memoryBacking>
4. Restart libvirt
   # systemctl restart virtqemud
5. Start the guest vm
   # virsh start vm1
   Domain 'vm1' started
6. Umount the 1G hugepage path
   # umount /dev/hugepages1G/
7. Log in to the guest vm and check that guest memory can still be allocated.
8. Prepare a memory device hotplug XML like below:
   # cat dimm1G.xml
   <memory model='dimm'>
     <source>
       <pagesize unit='KiB'>1048576</pagesize>
       <nodemask>0-1</nodemask>
     </source>
     <target>
       <size unit='KiB'>1048576</size>
       <node>0</node>
     </target>
   </memory>
9. Hot plug the 1G hugepage sourced dimm memory device with the XML from step 8; an error is reported:
   # virsh attach-device vm1 dimm1G.xml
   error: Failed to attach device from dimm1G.xml
   error: internal error: unable to execute QEMU command 'object-add': os_mem_prealloc: preallocating memory failed: Bad address
10. Log in to the guest vm; guest memory can still be allocated.

scenario 2:
1. Allocate two 1G hugepages
   # echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
2. Define a guest with the below memoryBacking XML:
   <memoryBacking>
     <hugepages>
       <page size='2' unit='M'/>
     </hugepages>
   </memoryBacking>
3. Start the guest vm
   # virsh start vm1
   Domain 'vm1' started
4. Mount 1G hugepage path
   # mkdir /dev/hugepages1G
   # mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
5. Restart libvirt
   # systemctl restart virtqemud
6. Prepare a memory device hotplug XML like below:
   # cat dimm1G.xml
   <memory model='dimm'>
     <source>
       <pagesize unit='KiB'>1048576</pagesize>
       <nodemask>0-1</nodemask>
     </source>
     <target>
       <size unit='KiB'>1048576</size>
       <node>0</node>
     </target>
   </memory>
7. Hot plug the 1G hugepage sourced dimm memory device with the XML from step 6
   # virsh attach-device vm1 dimm1G.xml
   Device attached successfully
8. Umount the 1G hugepage path
   # umount /dev/hugepages1G/
9.
Hot plug the 1G hugepage sourced dimm memory device again; an error is reported:
   # virsh attach-device vm1 dimm1G.xml
   error: Failed to attach device from dimm1G.xml
   error: Unable to umount /dev/hugepages1G/libvirt/qemu/1-vm1: Device or resource busy
10. Hot plug the 1G hugepage sourced dimm memory device a 3rd time
   # virsh attach-device vm1 dimm1G.xml
   Device attached successfully
11. Log in to the guest vm; guest memory can still be allocated.

What I found:
- If the hugepage path is mounted before the guest vm is started and then unmounted, hotplug fails with the error: unable to execute QEMU command 'object-add': os_mem_prealloc: preallocating memory failed: Bad address.
- If the hugepage path is mounted after the guest vm is started, the "Device or resource busy" error appears once at step 9 of scenario 2, and no error appears afterwards.

(In reply to liang cong from comment #4)
> Hi Michal, I found the below 2 scenarios; both involve umounting the
> hugepage path. I know this is an abnormal operation, but I found something
> interesting, please help check whether there is a bug here, thanks.

No. These are perfectly expected behaviours.

Firstly, libvirtd/virtqemud can only cope so well when the system's configuration changes after the daemon was started.

Secondly, the way that memory allocation works in QEMU with respect to hugepages is: libvirt finds the hugetlbfs mount that corresponds to the requested page size (/dev/hugepages1G for 1GiB hugepages in this case), creates a per-domain path there (/dev/hugepages1G/libvirt/qemu/...) and tells QEMU to use that path as the prefix for a tempfile (/dev/hugepages1G/libvirt/qemu/.../qemu_back_mem....), which is then used for mmap().

Now, the first scenario fails because /dev/hugepages1G no longer corresponds to a 1GiB hugetlbfs, since it was unmounted. There is nothing that either libvirt or QEMU can do about that. The second scenario fails because libvirt needs to play tricks with namespaces and mount points (exactly because of this bug).
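The lookup-and-prefix flow described above can be sketched roughly as follows. This is a toy illustration only, not libvirt's actual code: the function names, the simplified /proc/mounts parsing, and the error handling are all invented for the sketch.

```python
# Toy sketch of the mechanism described above: find the hugetlbfs mount
# matching the requested page size, then build the per-domain directory
# QEMU is told to use as a tempfile prefix. Names are hypothetical.

def find_hugetlbfs_mount(mounts_text, pagesize_kib):
    """Return the first hugetlbfs mount point whose pagesize= option
    matches the requested size in KiB, or None if none is mounted."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "hugetlbfs":
            continue
        opts = dict(
            opt.split("=", 1) if "=" in opt else (opt, "")
            for opt in fields[3].split(",")
        )
        size = opts.get("pagesize", "2M")  # common default on x86_64
        factor = {"K": 1, "M": 1024, "G": 1024 * 1024}[size[-1]]
        if int(size[:-1]) * factor == pagesize_kib:
            return fields[1]
    return None

def domain_hugepage_dir(mounts_text, pagesize_kib, domain):
    """Build a per-domain directory path such as
    /dev/hugepages1G/libvirt/qemu/1-vm1."""
    mount = find_hugetlbfs_mount(mounts_text, pagesize_kib)
    if mount is None:
        # Scenario 1 in a nutshell: the mount table changed after the
        # daemon cached it, so no backing hugetlbfs exists any more.
        raise RuntimeError("no hugetlbfs for %d KiB pages" % pagesize_kib)
    return "%s/libvirt/qemu/%s" % (mount, domain)
```

With a mount table containing `hugetlbfs /dev/hugepages1G hugetlbfs rw,relatime,pagesize=1024M 0 0`, the sketch resolves a 1048576 KiB request to /dev/hugepages1G/libvirt/qemu/1-vm1; once the entry disappears (the umount in scenario 1), the lookup fails before QEMU is ever involved.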
Long story short, don't change the mount table while libvirtd/virtqemud/QEMU is running. This is now documented here: https://libvirt.org/kbase/qemu-passthrough-security.html#private-monunt-namespace

BTW: things would also break horribly if the CGroup FS were umounted while libvirt/QEMU is running, and we don't consider that an error.

Mark this bug as tested per comment 3 and comment 5. And since this bug's scenarios are not a recommended operation, mark it qe_test_coverage-.

Forgot to mention: things are a bit different with memfd. There, no hugetlbfs mount point is needed, and thus these kinds of problems are just gone. Not to mention that memfd has more advantages than that. I wonder whether we should recommend using it instead.

Verified on:
# rpm -q qemu-kvm libvirt
qemu-kvm-7.1.0-4.el9.x86_64
libvirt-8.9.0-2.el9.x86_64

Verify steps:
1. Allocate two 1G hugepages
   # echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
2. Mount 1G hugepage path
   # mkdir /dev/hugepages1G
   # mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
3. Restart libvirt
   # systemctl restart virtqemud
4. Umount and mount the 1G hugepage path again
   # umount /dev/hugepages1G/
   # mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
5. Start up the guest vm
   # virsh start vm1
   Domain 'vm1' started
6. Prepare a dimm memory device with the following XML
   # cat dimm1G.xml
   <memory model='dimm'>
     <source>
       <pagesize unit='KiB'>1048576</pagesize>
       <nodemask>0-1</nodemask>
     </source>
     <target>
       <size unit='KiB'>1048576</size>
       <node>0</node>
     </target>
   </memory>
7. Hot plug the memory device with the XML from step 6
   # virsh attach-device vm1 dimm1G.xml
   Device attached successfully

Also check the below scenarios:
1. memory backing 2M guest vm start -> mount 1G path -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2.
mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm

Tested with these settings: remember_owner=1 or 0, no memory backing, 1G memory backing, 1G hugepage path as /mnt/hugepages1G, memfd memory backing

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171