Bug 2134009 - memory device hotplug fails after umount and mount hugepage path again
Summary: memory device hotplug fails after umount and mount hugepage path again
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Assignee: Michal Privoznik
QA Contact: liang cong
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-10-12 06:35 UTC by liang cong
Modified: 2023-05-09 08:09 UTC
CC List: 7 users

Fixed In Version: libvirt-8.9.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-09 07:27:15 UTC
Type: Bug
Target Upstream Version: 8.9.0
Embargoed:


Attachments: none


Links
- Red Hat Issue Tracker RHELPLAN-136195 (last updated 2022-10-12 06:43:16 UTC)
- Red Hat Product Errata RHBA-2023:2171 (last updated 2023-05-09 07:27:30 UTC)

Description liang cong 2022-10-12 06:35:23 UTC
Description of problem: memory device hotplug fails after umounting and mounting the hugepage path again


Version-Release number of selected component (if applicable):
# rpm -q libvirt qemu-kvm
libvirt-8.5.0-7.el9_1.x86_64
qemu-kvm-7.0.0-13.el9.x86_64


How reproducible:
100%


Steps to Reproduce:
1. Allocate two 1G hugepages
# echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

2. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G

3. Restart libvirt
# systemctl restart virtqemud

4. Umount and mount 1G hugepage path again
# umount /dev/hugepages1G/
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G

5. Start up guest vm
# virsh start vm1
Domain 'vm1' started

6. Prepare a dimm memory device with the following xml
# cat dimm1G.xml 
<memory model='dimm'>
    <source>
      <pagesize unit='KiB'>1048576</pagesize>
      <nodemask>0-1</nodemask>
    </source>
    <target>
      <size unit='KiB'>1048576</size>
      <node>0</node>
    </target>
  </memory>

7. Hot plug the memory device with the xml from step 6
# virsh attach-device vm1 dimm1G.xml 
error: Failed to attach device from dimm1G.xml
error: internal error: unable to execute QEMU command 'object-add': can't open backing store /dev/hugepages1G/libvirt/qemu/1-vm1 for guest RAM: Permission denied


Actual results:
The memory device hot plug in step 7 fails.

Expected results:
Hot plugging the memory device succeeds.

Additional info:
If virtqemud is restarted between step 4 and step 5, the issue is avoided (see the sequence spelled out below).
If not, then after step 7 we have to mount, umount, and restart virtqemud to get rid of the issue.
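
A sketch of the preventive workaround, using the same paths and domain name as the reproduction steps above; the point is restarting virtqemud after the final mount, before the guest starts, so the daemon sees the current mount table:

# umount /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
# systemctl restart virtqemud
# virsh start vm1
# virsh attach-device vm1 dimm1G.xml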

Comment 1 Michal Privoznik 2022-10-12 10:39:03 UTC
Patches posted on the list:

https://listman.redhat.com/archives/libvir-list/2022-October/234822.html

Comment 2 Michal Privoznik 2022-10-17 06:42:30 UTC
Merged upstream as:

babcbf2d5c qemu: Create base hugepages path on memory hotplug
72adf3b717 qemu: Separate out hugepages basedir making

v8.8.0-136-gbabcbf2d5c

Comment 3 liang cong 2022-10-28 08:16:12 UTC
Pre-verified on upstream build v8.9.0-rc1-12-gde842f37a1

Verify steps:
1. Allocate two 1G hugepages
# echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

2. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G

3. Restart libvirt
# systemctl restart virtqemud

4. Umount and mount 1G hugepage path again
# umount /dev/hugepages1G/
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G

5. Start up guest vm
# virsh start vm1
Domain 'vm1' started

6. Prepare a dimm memory device with the following xml
# cat dimm1G.xml 
<memory model='dimm'>
    <source>
      <pagesize unit='KiB'>1048576</pagesize>
      <nodemask>0-1</nodemask>
    </source>
    <target>
      <size unit='KiB'>1048576</size>
      <node>0</node>
    </target>
  </memory>

7. Hot plug the memory device with the xml from step 6
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully


Also checked the scenarios below:
Steps:
1. memory backing 2M guest vm start -> mount 1G path -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2. mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm

Tested with these settings: remember_owner=1 or 0; no memory backing; 1G memory backing; 1G hugepage path as /mnt/hugepages1G; memfd memory backing

Comment 4 liang cong 2022-10-28 08:57:34 UTC
Hi Michal, I found the 2 scenarios below; both involve umounting the hugepage path. I know these are abnormal operations, but I found something interesting, so please help check whether there is a bug here, thanks.
scenario 1:
1. Allocate two 1G hugepages
# echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

2. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G

3. Define a guest with the memoryBacking xml below.
  <memoryBacking>
    <hugepages>
      <page size='1' unit='G'/>
    </hugepages>
  </memoryBacking>

4. Restart libvirt
# systemctl restart virtqemud

5. Start guest vm
# virsh start vm1
Domain 'vm1' started

6. Umount 1G hugepage path
# umount /dev/hugepages1G/

7. Log in to the guest vm and check that guest memory can still be allocated.

8. Prepare a memory device hotplug xml like the one below:
# cat dimm1G.xml 
<memory model='dimm'>
    <source>
      <pagesize unit='KiB'>1048576</pagesize>
      <nodemask>0-1</nodemask>
    </source>
    <target>
      <size unit='KiB'>1048576</size>
      <node>0</node>
    </target>
  </memory>

9. Hot plug the 1G hugepage sourced dimm memory device with the xml from step 8; the following error appears:
# virsh attach-device vm1 dimm1G.xml 
error: Failed to attach device from dimm1G.xml
error: internal error: unable to execute QEMU command 'object-add': os_mem_prealloc: preallocating memory failed: Bad address

10. Log in to the guest vm and find that guest memory can still be allocated.

scenario 2:
1. Allocate two 1G hugepages
# echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

2. Define a guest with the memoryBacking xml below.
  <memoryBacking>
    <hugepages>
      <page size='2' unit='M'/>
    </hugepages>
  </memoryBacking>

3. Start guest vm
# virsh start vm1
Domain 'vm1' started

4. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G

5. Restart libvirt
# systemctl restart virtqemud


6. Prepare a memory device hotplug xml like the one below:
# cat dimm1G.xml 
<memory model='dimm'>
    <source>
      <pagesize unit='KiB'>1048576</pagesize>
      <nodemask>0-1</nodemask>
    </source>
    <target>
      <size unit='KiB'>1048576</size>
      <node>0</node>
    </target>
  </memory>

7. Hot plug the 1G hugepage sourced dimm memory device with the xml from step 6
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully

8. Umount 1G hugepage path
# umount /dev/hugepages1G/

9. Hot plug the 1G hugepage sourced dimm memory device again, and observe the error:
# virsh attach-device vm1 dimm1G.xml 
error: Failed to attach device from dimm1G.xml
error: Unable to umount /dev/hugepages1G/libvirt/qemu/1-vm1: Device or resource busy

10. Hot plug the 1G hugepage sourced dimm memory device a third time
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully

11. Log in to the guest vm and find that guest memory can still be allocated.



I found:
If the hugepage path is mounted before the guest vm is started and then umounted, hotplug fails with the error: unable to execute QEMU command 'object-add': os_mem_prealloc: preallocating memory failed: Bad address.
If the hugepage path is mounted after the guest vm is started and then umounted, the "Device or resource busy" error appears once at step 9 of scenario 2, and no error appears afterwards.

Comment 5 Michal Privoznik 2022-10-31 08:56:00 UTC
(In reply to liang cong from comment #4)
> Hi Michal, I found the 2 scenarios below; both involve umounting the
> hugepage path. I know these are abnormal operations, but I found something
> interesting, so please help check whether there is a bug here, thanks.

No. These are perfectly expected behaviours. Firstly, libvirtd/virtqemud can only cope so far with system configuration that changed after the daemon was started. Secondly, the way memory allocation works in QEMU with respect to hugepages is: libvirt finds the hugetlbfs mount that corresponds to the requested page size (/dev/hugepages1G for 1GiB hugepages in this case), creates a per-domain path there (/dev/hugepages1G/libvirt/qemu/...), and tells QEMU to use that path as the prefix for a tempfile (/dev/hugepages1G/libvirt/qemu/.../qemu_back_mem....) which is then used for mmap().
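
To illustrate that flow, a rough sketch (the mount-table lines and the backend id are examples for this setup, not output captured from this bug): the daemon locates the hugetlbfs mount by its page size, e.g.

# grep hugetlbfs /proc/mounts
hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0
hugetlbfs /dev/hugepages1G hugetlbfs rw,relatime,pagesize=1024M 0 0

and the object-add then corresponds roughly to this command-line backend:

-object memory-backend-file,id=memdimm0,mem-path=/dev/hugepages1G/libvirt/qemu/1-vm1,size=1G,prealloc=on

Once /dev/hugepages1G is unmounted, mem-path no longer lives on hugetlbfs, which is why the allocation fails.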

Now, the first scenario fails because /dev/hugepages1G no longer corresponds to a 1GiB hugetlbfs, since it was unmounted. There is nothing that either libvirt or qemu can do about that.
The second scenario fails because libvirt needs to play tricks with namespaces and mountpoints (exactly because of this bug).

Long story short, don't go and change the mount table whilst libvirtd/virtqemud/qemu is running. This is now documented here: https://libvirt.org/kbase/qemu-passthrough-security.html#private-mount-namespace

BTW: things would also break horribly if the CGroup FS were umounted while libvirt/qemu is running, and we don't consider that an error.

Comment 6 liang cong 2022-10-31 09:37:07 UTC
Marking this bug as tested per comment 3 and comment 5.
And since this bug's scenarios are not recommended operations, marking it qe_test_coverage-.

Comment 7 Michal Privoznik 2022-10-31 10:51:52 UTC
Forgot to mention, things are a bit different with memfd. There, no hugetlbfs mount point is needed, and thus this kind of problem just goes away. Not to mention that memfd has other advantages beyond that. I wonder whether we should recommend using it instead.
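
For reference, a minimal sketch of memfd backing (assuming 1GiB hugepages as elsewhere in this bug; the rest of the domain XML is unchanged):

  <memoryBacking>
    <source type='memfd'/>
    <hugepages>
      <page size='1' unit='G'/>
    </hugepages>
  </memoryBacking>

The backing then comes from the kernel's memfd_create(), so no hugetlbfs mount point has to exist or stay mounted.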

Comment 12 liang cong 2022-11-08 10:12:47 UTC
Verified on:
# rpm -q qemu-kvm libvirt
qemu-kvm-7.1.0-4.el9.x86_64
libvirt-8.9.0-2.el9.x86_64



Verify steps:
1. Allocate two 1G hugepages
# echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

2. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G

3. Restart libvirt
# systemctl restart virtqemud

4. Umount and mount 1G hugepage path again
# umount /dev/hugepages1G/
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G

5. Start up guest vm
# virsh start vm1
Domain 'vm1' started

6. Prepare a dimm memory device with the following xml
# cat dimm1G.xml 
<memory model='dimm'>
    <source>
      <pagesize unit='KiB'>1048576</pagesize>
      <nodemask>0-1</nodemask>
    </source>
    <target>
      <size unit='KiB'>1048576</size>
      <node>0</node>
    </target>
  </memory>

7. Hot plug the memory device with the xml from step 6
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully


Also checked the scenarios below:
Steps:
1. memory backing 2M guest vm start -> mount 1G path -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2. mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm

Tested with these settings: remember_owner=1 or 0; no memory backing; 1G memory backing; 1G hugepage path as /mnt/hugepages1G; memfd memory backing

Comment 14 errata-xmlrpc 2023-05-09 07:27:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171

