Bug 2132178 - libvirt kills virtual machine on restart when 2M and 1G hugepages are mounted [rhel-8.4.0.z]
Summary: libvirt kills virtual machine on restart when 2M and 1G hugepages are mounted...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: libvirt
Version: 8.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: liang cong
URL:
Whiteboard:
Depends On: 2123196 2151869
Blocks:
Reported: 2022-10-04 21:19 UTC by RHEL Program Management Team
Modified: 2022-12-08 11:53 UTC

Fixed In Version: libvirt-6.0.0-35.4.module+el8.4.0+16907+31bceb87
Doc Type: Bug Fix
Doc Text:
Cause: When libvirt is restarted after a hugetlbfs was mounted while a guest is running, libvirt tries to create a guest-specific path in the new hugetlbfs mount point. Because of a bug in the namespace code, this fails, which results in the guest being killed by libvirt.
Consequence: The guest is killed on libvirtd restart.
Fix: Twofold. First, the namespace code was fixed so that creating this guest-specific path now succeeds. Second, the creation is postponed until it is really needed (memory hotplug).
Result: Guests can now survive a libvirtd restart.
Clone Of: 2123196
Environment:
Last Closed: 2022-11-29 14:12:30 UTC
Type: ---
Target Upstream Version:
Embargoed:


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   LIBVIRTAT-13734  0        None      None    None     2022-12-07 03:22:41 UTC
Red Hat Issue Tracker   RHELPLAN-135633  0        None      None    None     2022-10-05 10:08:19 UTC
Red Hat Product Errata  RHBA-2022:8676   0        None      None    None     2022-11-29 14:12:36 UTC

Comment 2 liang cong 2022-10-08 05:34:52 UTC
Found one extra error on:
# rpm -q libvirt qemu-kvm
libvirt-6.0.0-35.3.module+el8.4.0+16832+c579b597.x86_64
qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64

1. Define a guest with the memoryBacking XML below.
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>

2. Start the VM and stop libvirt

# virsh start vm1 && systemctl stop libvirtd
Domain vm1 started

Warning: Stopping libvirtd.service, but it can still be activated by:
  libvirtd.socket
  libvirtd-ro.socket
  libvirtd-admin.socket

3. Mount 1G hugepage path
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G


4. Run virsh list; the guest is still in the running state.

# virsh -r list --all
 Id   Name   State
----------------------
 16    vm1    running

5. Prepare a memory device hotplug XML like the one below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>

6. Hotplug dimm memory device:
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully

7. Shut off the VM
# virsh destroy vm1
Domain vm1 destroyed

8. Restart libvirtd; libvirtd fails to start up:
# systemctl restart libvirtd

# systemctl status libvirtd --full --no-pager
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Fri 2022-10-07 23:13:06 EDT; 5min ago
     Docs: man:libvirtd(8)
           https://libvirt.org
  Process: 7467 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=exited, status=0/SUCCESS)
 Main PID: 7467 (code=exited, status=0/SUCCESS)
    Tasks: 2 (limit: 32768)
   Memory: 51.2M
   CGroup: /system.slice/libvirtd.service
           ├─1455 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper
           └─1456 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper

Oct 07 23:13:06 lcong-84test systemd[1]: Started Virtualization daemon.
Oct 07 23:13:06 lcong-84test dnsmasq[1455]: read /etc/hosts - 2 addresses
Oct 07 23:13:06 lcong-84test dnsmasq[1455]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Oct 07 23:13:06 lcong-84test dnsmasq-dhcp[1455]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Oct 07 23:13:06 lcong-84test libvirtd[7467]: libvirt version: 6.0.0, package: 35.3.module+el8.4.0+16832+c579b597 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2022-10-06-16:23:12, )
Oct 07 23:13:06 lcong-84test libvirtd[7467]: hostname: lcong-84test
Oct 07 23:13:06 lcong-84test libvirtd[7467]: unable to create hugepage path /run/libvirt/qemu/16-vm1.hugepages1G.libvirt.qemu.16-vm1/libvirt/qemu: No such file or directory
Oct 07 23:13:06 lcong-84test libvirtd[7467]: Initialization of QEMU state driver failed: unable to create hugepage path /run/libvirt/qemu/16-vm1.hugepages1G.libvirt.qemu.16-vm1/libvirt/qemu: No such file or directory
Oct 07 23:13:06 lcong-84test libvirtd[7467]: Driver state initialization failed
Oct 07 23:13:06 lcong-84test systemd[1]: libvirtd.service: Succeeded.

9. Check the mount info. There is one extra hugepage mount, shown below, which makes libvirtd fail to start:
hugetlbfs on /run/libvirt/qemu/16-vm1.hugepages1G.libvirt.qemu.16-vm1 type hugetlbfs (rw,relatime,seclabel,pagesize=1024M)

@mprivozn please help to check, thx.
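
A minimal sketch for spotting the leftover mount from step 9 without reading the whole mount table by eye. It assumes the /proc/mounts field layout (device, mount point, filesystem type, options) and the /run/libvirt/qemu/ prefix seen in the log above. The function name is made up for illustration and is not part of libvirt.

```shell
# Print every hugetlbfs mount point located under /run/libvirt/qemu/.
# $1: file in /proc/mounts format (defaults to /proc/mounts itself).
# Hypothetical helper; the path prefix is taken from the log in this comment.
stale_hugetlbfs_mounts() {
    awk '$3 == "hugetlbfs" && index($2, "/run/libvirt/qemu/") == 1 { print $2 }' \
        "${1:-/proc/mounts}"
}

# Usage, with no domains running:
#   stale_hugetlbfs_mounts
# Any path printed is a candidate for the leftover mount described in step 9.
```

With no guests running, an empty result suggests that nothing was left behind. While guests are running, entries under that prefix may be legitimate.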

Comment 3 liang cong 2022-10-08 09:39:21 UTC
I did not see the issue described in comment #2 on the rhel8.7 scratch build: http://brew-task-repos.usersys.redhat.com/repos/scratch/mprivozn/libvirt/8.0.0/11.el8_rc.8c593088fd/

Comment 4 liang cong 2022-10-08 09:49:21 UTC
Found another issue on:
# rpm -q libvirt qemu-kvm
libvirt-6.0.0-35.3.module+el8.4.0+16832+c579b597.x86_64
qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64

1. Define a guest with the memoryBacking XML below.
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>

2. Mount 1G hugepage path
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G

3. Start vm
# virsh start vm1
Domain vm1 started

4. Prepare a memory device hotplug XML like the one below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>

5. Attach 1G memory device described in step 4.
# virsh attach-device vm1 dimm1G.xml 
error: Failed to attach device from dimm1G.xml
error: internal error: Unable to find any usable hugetlbfs mount for 1048576 KiB

Comment 5 Michal Privoznik 2022-10-10 07:34:21 UTC
(In reply to liang cong from comment #2)
> Found one extra error on:
> # rpm -q libvirt qemu-kvm
> libvirt-6.0.0-35.3.module+el8.4.0+16832+c579b597.x86_64
> qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64


> 
> @mprivozn please help to check, thx.

Yeah, this is a genuine bug. Libvirt just did not unmount the bind-mounted path in the parent namespace. We have a fix upstream; I just need to backport it.

Comment 6 Michal Privoznik 2022-10-10 07:36:34 UTC
(In reply to liang cong from comment #4)
> Found another issue on:
> # rpm -q libvirt qemu-kvm
> libvirt-6.0.0-35.3.module+el8.4.0+16832+c579b597.x86_64
> qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64
> 

> 5. Attach 1G memory device described in step 4.
> # virsh attach-device vm1 dimm1G.xml 
> error: Failed to attach device from dimm1G.xml
> error: internal error: Unable to find any usable hugetlbfs mount for 1048576 KiB

This is expected and does not diverge from upstream at all. Libvirt looks at the mount table only when the daemon is starting up, so it does not see any mounts created after it was started. Restarting the daemon fixes the issue. In general, any significant change to the host configuration requires a daemon restart.
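
Since libvirtd reads the mount table only at startup, one way to script the workaround is to confirm that the hugetlbfs mount exists and then restart the daemon. A rough sketch, assuming the kernel reports a 1G page size as pagesize=1024M (as in the mount output in comment 2). The helper name is hypothetical and not part of libvirt.

```shell
# Succeed if the mount table contains a hugetlbfs mount with option pagesize=$1.
# $1: page size as the kernel prints it, e.g. 2M or 1024M.
# $2: file in /proc/mounts format (defaults to /proc/mounts).
has_hugetlbfs_pagesize() {
    awk -v want="pagesize=$1" '
        $3 == "hugetlbfs" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Usage (as root), mirroring the steps in comment 4:
#   has_hugetlbfs_pagesize 1024M || \
#       mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
#   systemctl restart libvirtd   # so that libvirtd re-reads the mount table
```

The restart is the important part: the check only tells you whether the mount exists on the host, not whether the running daemon has seen it.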

Comment 11 liang cong 2022-10-13 03:08:03 UTC
Preverified on scratch build:
# rpm -q libvirt qemu-kvm
libvirt-6.0.0-35.4.el8_rc.440f7c6749.x86_64
qemu-kvm-4.2.0-48.module+el8.4.0+11909+3300d70f.3.x86_64

Verify steps:
1. Define a guest with the memoryBacking XML below.
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>

2. Start the VM and stop libvirt

# virsh start vm1 && systemctl stop libvirtd
Domain vm1 started

Warning: Stopping libvirtd.service, but it can still be activated by:
  libvirtd.socket
  libvirtd-ro.socket
  libvirtd-admin.socket

3. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G


4. Run virsh list; the guest is still in the running state.

# virsh -r list --all
 Id   Name   State
----------------------
 1    vm1    running

5. Prepare a memory device hotplug XML like the one below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>


6. Hotplug dimm memory device:
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully

7. Prepare a memory device hotplug XML with a 2M hugepage source, like the one below:
# cat dimm2M.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>2048</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>

8. Hotplug dimm memory device:
# virsh attach-device vm1 dimm2M.xml 
Device attached successfully


9. Shut off the VM
# virsh destroy vm1
Domain vm1 destroyed


10. Restart libvirtd
# systemctl restart libvirtd

11. Start vm
# virsh start vm1
Domain 'vm1' started


Also checked the scenarios below:
Steps:
1. memory backing 2M guest vm start -> stop libvirt -> mount 1G path -> start libvirt -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2. mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm

Tested with these settings: remember_owner=1 or 0, default memory backing, memfd memory backing, 1G hugepage memory backing, 1G hugepage path as /mnt/hugepages1G


Additional info:
1. Restart libvirtd after mounting the hugepage path.
2. Unmounting and remounting the hugepage path may trigger another issue, bug #2134009.
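
The two hotplug files used in the verify steps differ only in the source page size, so they can be generated from one template. A small sketch, with the nodemask, target size and target node copied from the steps above. The function name is hypothetical.

```shell
# Print a DIMM hotplug definition whose source uses hugepages of $1 KiB.
# All other values are copied from dimm1G.xml / dimm2M.xml in this comment.
dimm_xml() {
    cat <<EOF
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>$1</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>
EOF
}

# Usage:
#   dimm_xml 1048576 > dimm1G.xml
#   dimm_xml 2048    > dimm2M.xml
#   virsh attach-device vm1 dimm1G.xml
```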

Comment 13 liang cong 2022-10-20 03:08:23 UTC
Verified on build:
# rpm -q libvirt qemu-kvm
libvirt-6.0.0-35.4.module+el8.4.0+16907+31bceb87.x86_64
qemu-kvm-4.2.0-48.module+el8.4.0+11909+3300d70f.3.x86_64

Verify steps:
1. Define a guest with the memoryBacking XML below.
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB'/>
    </hugepages>
  </memoryBacking>

2. Start the VM and stop libvirt

# virsh start vm1 && systemctl stop libvirtd
Domain vm1 started

Warning: Stopping libvirtd.service, but it can still be activated by:
  libvirtd.socket
  libvirtd-ro.socket
  libvirtd-admin.socket

3. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G


4. Run virsh list; the guest is still in the running state.

# virsh -r list --all
 Id   Name   State
----------------------
 2    vm1    running

5. Prepare a memory device hotplug XML like the one below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>


6. Hotplug dimm memory device:
# virsh attach-device vm1 dimm1G.xml 
Device attached successfully

7. Prepare a memory device hotplug XML with a 2M hugepage source, like the one below:
# cat dimm2M.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>2048</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>

8. Hotplug dimm memory device:
# virsh attach-device vm1 dimm2M.xml 
Device attached successfully


9. Shut off the VM
# virsh destroy vm1
Domain vm1 destroyed


10. Restart libvirtd
# systemctl restart libvirtd

11. Start vm
# virsh start vm1
Domain 'vm1' started


Also checked the scenarios below:
Steps:
1. memory backing 2M guest vm start -> stop libvirt -> mount 1G path -> start libvirt -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2. mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm

Tested with these settings: remember_owner=1 or 0, memfd memory backing, default memory backing, 1G hugepage memory backing, 1G hugepage path as /mnt/hugepages1G


Additional info:
1. Restart libvirtd after mounting the hugepage path.
2. Unmounting and remounting the hugepage path may trigger another issue, bug #2134009.

Comment 18 errata-xmlrpc 2022-11-29 14:12:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:rhel bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8676

