Bug 2123196
| Summary: | libvirt kills virtual machine on restart when 2M and 1G hugepages are mounted | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Germano Veit Michel <gveitmic> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED ERRATA | QA Contact: | liang cong <lcong> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.4 | CC: | ailan, duclee, dzheng, haizhao, jdenemar, jsuchane, lmen, mprivozn, virt-maint, yafu, yalzhang |
| Target Milestone: | rc | Keywords: | Triaged, Upstream, ZStream |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-8.0.0-11.module+el8.8.0+16835+1d966b61 | Doc Type: | Bug Fix |
| Doc Text: | Cause: When libvirt is restarted after a hugetlbfs was mounted and a guest is running, libvirt tries to create a guest-specific path in the new hugetlbfs mount point. Because of a bug in the namespace code this fails, which results in the guest being killed by libvirt. Consequence: The guest is killed on libvirtd restart. Fix: Twofold. Firstly, the namespace code was fixed so that creating this guest-specific path now succeeds. Secondly, the creation is postponed until really needed (memory hotplug). Result: Guests can now survive libvirtd restart. | Story Points: | --- |
| Clone Of: | | | |
| : | 2132176 2132177 2132178 2151869 (view as bug list) | Environment: | |
| Last Closed: | 2023-05-16 08:16:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | 8.8.0 |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2132176, 2132177, 2132178, 2151869 | | |
Description
Germano Veit Michel
2022-09-01 00:56:08 UTC
This is a tricky one. With newer libvirt it basically works due to an accident. Let me explain what is going on and why new libvirt "works".

There were plenty of problems on a layer between udev and libvirt. The former (udev) is responsible for creating /dev/* nodes on device hotplug/hotunplug/config change, and the latter (libvirt) needs to set correct seclabels (DAC + SELinux) on those nodes so that unprivileged QEMU can access them. This friction was 'producing more heat' the moment users/sysadmins started to write their own udev rules which clashed with libvirt-set seclabels, effectively cutting a running QEMU process off. Here at libvirt we decided to create a private mount namespace for each QEMU and replace the system /dev with a private, libvirt-managed one. And to enhance security, only those nodes that QEMU needs are exposed there. Moreover, they are dynamically created on device hotplug to / hotunplug from QEMU (again, by libvirt). Because of the enhanced security, this feature is automatically turned on, but it can be overridden in qemu.conf (namespaces=[]).

Now, libvirt has to be careful to replace just the devtmpfs mounted on /dev. The rest of the mount table has to be kept verbatim (e.g. because of disks which may be on NFS or basically anything else). And you may already see the problem. When the guest is started for the first time, its mount table contains just the 1GiB hugetlbfs mount point. Then they stop libvirtd and mount another hugetlbfs. When libvirtd is started again it wants to create the per-guest path in that new mount point as well, but:

1) it's doing so in the host namespace (i.e. the namespace where libvirtd is running), while it should have done so in QEMU's namespace! But even if it wanted to do that, it couldn't, because

2) the mount event of the other hugetlbfs (2MiB) is not propagated into QEMU's namespace.

The reason for 2) is that mount event propagation is done outside of libvirt's control (in the kernel), and thus a sysadmin might mangle QEMU's private /dev by remounting /dev in the top-level namespace. The 1) is clearly a bug, but trying to create the path in QEMU's namespace is equally wrong. The ideal solution here would be to have the sysadmin set up the mount table upfront and start VMs only afterwards. There's no harm in having a hugetlbfs mounted with an empty pool (= no hugepages reserved).

Meanwhile, let me investigate why we need to call qemuProcessBuildDestroyMemoryPaths() from qemuProcessReconnect() in the first place. I mean, we are just reconnecting to a previously started QEMU process, no new paths need to be created. It made sense when I introduced the per-domain location back in 2016 (https://gitlab.com/libvirt/libvirt/-/commit/f55afd83b13) so that the paths are created on libvirtd upgrade, but that's not the case anymore. This won't solve the other issue at hand - even if libvirtd restarted successfully, they still wouldn't be able to use that new 2MiB hugetlbfs, because it doesn't exist in QEMU's namespace.

And as promised, there's not a single patch that 'fixed' this behaviour upstream. It's my seclabel remembering work that has a flaw and simply ignores nonexistent paths. If you'd set remember_owner=0 in qemu.conf then you'd get the same behaviour no matter the version.

Note to QE: it doesn't really matter which hugetlbfs comes first. I can even reproduce without a single 1GiB HP allocated. Just start a guest with 2MiB HPs, then mount hugetlbfs pagesize=1GB and restart libvirtd.

Thanks for the explanation.
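For reference, the knobs mentioned above live in /etc/libvirt/qemu.conf, and a domain's private /dev can be inspected from the host. A rough sketch, assuming a single running QEMU process (the binary name varies: qemu-system-x86_64 on Fedora, qemu-kvm on RHEL):

# grep -E 'namespaces|remember_owner' /etc/libvirt/qemu.conf
# nsenter -t "$(pidof qemu-system-x86_64)" -m -- ls /dev
# nsenter -t "$(pidof qemu-system-x86_64)" -m -- findmnt -t hugetlbfs

The grep shows whether the defaults (private mount namespace enabled, seclabel remembering enabled) have been overridden; the nsenter commands show which nodes and hugetlbfs mounts actually exist in the guest's private namespace.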
That build=true when called from reconnect looks a bit suspicious to me, but this is the first time I read this code...

(In reply to Germano Veit Michel from comment #4)
> That build=true when called from reconnect looks a bit suspicious to me, but
> this is the first time I read this code...

Ignore this, it would try to find it and delete it anyway.

After thinking about this more, I think this is a misconfiguration problem. Here's why: when libvirt creates the private namespace it also marks it as 'slave' (equivalent to 'mount --make-rslave'). The reason is, we indeed need mounts/umounts to propagate into the namespace (I know I said otherwise in the previous comment). Just do a thought experiment - imagine a domain that is already running, then an NFS is mounted and a disk hotplug is attempted. We need the NFS mount to propagate, otherwise QEMU wouldn't see the disk on NFS. And maybe my memory is failing me, but IIRC the / used to be shared.

Also, there's one problem with hugetlbfs placement: if it's directly under /dev (like in the description - /dev/hugepages2M) then this won't get propagated, because the path doesn't exist at the time when the QEMU process is started, and even if it did exist, QEMU is not configured to access it, thus libvirt doesn't create the path in the namespace, and as a result the later mount is not propagated, because the mount point does not exist in the child namespace. Using any other location (e.g. /hugepages2M) works just fine.

Having said all of this, I believe that when the following steps are taken the bug goes away:
1) Run 'mount --make-rshared' before starting any guest,
2) change the location of hugepages in the systemd unit file so that it's not under /dev.

Now, there is a real bug still: libvirt tries to create a domain private path in all hugetlbfs mount points even when not needed. I wanted to suggest using the memfd backend but hit just this bug. I'll post a patch shortly. But the advantage of memfd is that it doesn't need any hugetlbfs mount points (a minimal sketch of such a configuration is shown below). I'll also post another patch that documents the aforementioned reasoning.

reproduced on:
libvirt-6.0.0-35.2.module+el8.4.0+14226+d39fa4ab.x86_64
qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64
with the same steps as the description.

test on:
libvirt-6.0.0-35.2.module+el8.4.0+14226+d39fa4ab.x86_64
qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64
before "4. Start the VM and stop libvirt", run the command "mount --make-rshared -t hugetlbfs -o pagesize=2M hugetlbfs /dev/hugepages2M"; this works around the problem.

test on:
libvirt-6.0.0-35.2.module+el8.4.0+14226+d39fa4ab.x86_64
qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64
run the command "mount -t hugetlbfs -o pagesize=2M hugetlbfs /mnt/hugepages2M" to mount at /mnt/hugepages2M, not under /dev; this also works around the problem.

Some follow up patches: https://listman.redhat.com/archives/libvir-list/2022-September/234290.html

test on:
libvirt-6.0.0-35.2.module+el8.4.0+14226+d39fa4ab.x86_64
qemu-kvm-4.2.0-49.module+el8.4.0+16539+22b18146.9.x86_64
before "4. Start the VM and stop libvirt", use the common mount command "mount -t hugetlbfs -o pagesize=2M hugetlbfs /dev/hugepages2M"; this also works around the problem.

Merged upstream:

0377177c78 qemu_process.c: Propagate hugetlbfs mounts on reconnect
5853d70718 qemu_namespace: Introduce qemuDomainNamespaceSetupPath()
46b03819ae qemu_namespace: Fix a corner case in qemuDomainGetPreservedMounts()
687374959e qemu_namespace: Tolerate missing ACLs when creating a path in namespace

v8.7.0-134-g0377177c78
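For illustration, a minimal sketch of the memfd-backed configuration suggested above (the 2M page size is taken from the reproducer; hugepages are optional with memfd, and exact support depends on the QEMU version):

<memoryBacking>
  <source type='memfd'/>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>

With memfd, QEMU allocates guest RAM via memfd_create() (hugepage-backed if requested), so no hugetlbfs mount point needs to exist or be propagated into the domain's private namespace.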
Tested on:
libvirt-8.5.0-6.el9.x86_64
qemu-kvm-7.0.0-13.el9.x86_64
Scenario 1:
1. Define a guest with below memorybacking xml.
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
2. Start the VM and stop libvirt
# virsh start vm1 && systemctl stop virtqemud
Domain 'vm1' started
Warning: Stopping virtqemud.service, but it can still be activated by:
virtqemud-admin.socket
virtqemud-ro.socket
virtqemud.socket
3. Mount 1G hugepage path
mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
4. Start guest and stop virtqemud
# virsh start vm1 && systemctl stop virtqemud
Domain 'vm1' started
Warning: Stopping virtqemud.service, but it can still be activated by:
virtqemud-admin.socket
virtqemud-ro.socket
virtqemud.socket
5. Run virsh list; the guest is still in the running state.
# virsh -r list --all
Id Name State
----------------------
8 vm1 running
# virsh -r list --all
Id Name State
----------------------
8 vm1 running
6. Check the libvirt log; the errors below are found:
40' on '/dev/hugepages1G/libvirt/qemu/8-vm1': No such file or directory
2022-09-26 01:54:08.988+0000: 56052: info : virSecuritySELinuxSetFileconImpl:1252 : Setting SELinux context on '/dev/hugepages1G/libvirt/qemu/8-vm1' to 'system_u:object_r:svirt_image_t:s0:c378,c740'
2022-09-26 01:54:08.988+0000: 56052: info : virObjectUnref:378 : OBJECT_UNREF: obj=0x7f448801d970
2022-09-26 01:54:08.992+0000: 56052: error : virProcessRunInFork:1361 : internal error: child reported (status=125): unable to stat: /dev/hugepages1G/libvirt/qemu/8-vm1: No such file or directory
2022-09-26 01:54:08.992+0000: 56052: error : virProcessRunInFork:1365 : unable to stat: /dev/hugepages1G/libvirt/qemu/8-vm1: No such file or directory
2022-09-26 01:54:08.992+0000: 56052: info : virSecurityDACSetOwnership:789 : Setting DAC user and group on '/dev/hugepages1G/libvirt/qemu/8-vm1' to '107:107'
Scenario 2:
1. Define a guest with below memorybacking xml.
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
2. Start guest
# virsh start vm1
Domain 'vm1' started
3. Prepare 1G hugepage sized dimm xml like below:
# cat dimm.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>1</node>
  </target>
</memory>
4. Mount 1G hugepage path
mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
5. Attach memory device with dimm xml from step 3
# virsh attach-device vm1 dimm.xml
error: Failed to attach device from dimm.xml
error: internal error: unable to execute QEMU command 'object-add': can't open backing store /dev/hugepages1G/libvirt/qemu/9-vm1 for guest RAM: No such file or directory
Hi Michal, I tested with the versions below:
libvirt v8.7.0-138-gfa2a7f888c
qemu-kvm-7.1.0-3.fc38.x86_64
Test steps:
1. Define a guest with below memorybacking xml.
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
2. Start the VM and stop libvirt
# virsh start vm1 && systemctl stop virtqemud
Domain 'vm1' started
Warning: Stopping virtqemud.service, but it can still be activated by:
virtqemud-admin.socket
virtqemud-ro.socket
virtqemud.socket
3. Mount 1G hugepage path
mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
4. Run virsh list; the guest is still in the running state.
# virsh -r list --all
Id Name State
----------------------
1 vm1 running
# virsh -r list --all
Id Name State
----------------------
1 vm1 running
# virsh -r list --all
Id Name State
----------------------
1 vm1 running
5. Check QEMU's mount namespace for /dev paths:
# cat /proc/`pidof qemu-system-x86_64`/mountinfo | grep ' /dev'
600 599 0:29 /root / rw,relatime master:1 - btrfs /dev/vda5 rw,seclabel,compress=zstd:1,space_cache,subvolid=256,subvol=/root
612 600 0:5 / /dev rw,nosuid master:8 - devtmpfs devtmpfs rw,seclabel,size=4096k,nr_inodes=1048576,mode=755,inode64
613 627 0:22 / /dev/shm rw,nosuid,nodev master:9 - tmpfs tmpfs rw,seclabel,inode64
614 627 0:23 / /dev/pts rw,nosuid,noexec,relatime master:10 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
615 627 0:33 / /dev/hugepages rw,relatime master:14 - hugetlbfs hugetlbfs rw,seclabel,pagesize=2M
616 627 0:18 / /dev/mqueue rw,nosuid,nodev,noexec,relatime master:15 - mqueue mqueue rw,seclabel
624 600 0:29 /home /home rw,relatime master:45 - btrfs /dev/vda5 rw,seclabel,compress=zstd:1,space_cache,subvolid=258,subvol=/home
625 600 252:2 / /boot rw,relatime master:47 - ext4 /dev/vda2 rw,seclabel
626 625 252:3 / /boot/efi rw,relatime master:49 - vfat /dev/vda3 rw,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro
627 612 0:52 / /dev rw,nosuid,relatime - tmpfs devfs rw,seclabel,size=64k,mode=755,inode64
629 612 0:54 / /dev/hugepages1G rw,relatime master:326 - hugetlbfs hugetlbfs rw,seclabel,pagesize=1024M
665 627 0:54 /libvirt/qemu/1-vm1 /dev/hugepages1G/libvirt/qemu/1-vm1 rw,relatime master:326 - hugetlbfs hugetlbfs rw,seclabel,pagesize=1024M
I find there is one extra mount point for the 1G hugepages:
665 627 0:54 /libvirt/qemu/1-vm1 /dev/hugepages1G/libvirt/qemu/1-vm1 rw,relatime master:326 - hugetlbfs hugetlbfs rw,seclabel,pagesize=1024M
Is this what the patch is intended to do to fix this issue?
And why doesn't this mount point follow the same rule as the previous ones, such as /dev/hugepages?
(In reply to liang cong from comment #21)
> Hi Michal, I tested with the versions below:
> libvirt v8.7.0-138-gfa2a7f888c
> qemu-kvm-7.1.0-3.fc38.x86_64
>
> Test steps:
> 1. Define a guest with below memorybacking xml.
> <memoryBacking>
>   <hugepages>
>     <page size='2048' unit='KiB'/>
>   </hugepages>
> </memoryBacking>
>
> 2. Start the VM and stop libvirt
>
> # virsh start vm1 && systemctl stop virtqemud
> Domain 'vm1' started
>
> Warning: Stopping virtqemud.service, but it can still be activated by:
> virtqemud-admin.socket
> virtqemud-ro.socket
> virtqemud.socket
>
> 3. Mount 1G hugepage path
> mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
>
> 4. Run virsh list; the guest is still in the running state.
>
> # virsh -r list --all
> Id Name State
> ----------------------
> 1 vm1 running
>
> # virsh -r list --all
> Id Name State
> ----------------------
> 1 vm1 running
>
> # virsh -r list --all
> Id Name State
> ----------------------
> 1 vm1 running
>
> 5. Check QEMU's mount namespace for /dev paths:
> # cat /proc/`pidof qemu-system-x86_64`/mountinfo | grep ' /dev'
> 600 599 0:29 /root / rw,relatime master:1 - btrfs /dev/vda5 rw,seclabel,compress=zstd:1,space_cache,subvolid=256,subvol=/root
> 612 600 0:5 / /dev rw,nosuid master:8 - devtmpfs devtmpfs rw,seclabel,size=4096k,nr_inodes=1048576,mode=755,inode64

You can see that the original devtmpfs is still mounted here, under /dev.

> 613 627 0:22 / /dev/shm rw,nosuid,nodev master:9 - tmpfs tmpfs rw,seclabel,inode64
> 614 627 0:23 / /dev/pts rw,nosuid,noexec,relatime master:10 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
> 615 627 0:33 / /dev/hugepages rw,relatime master:14 - hugetlbfs hugetlbfs rw,seclabel,pagesize=2M
> 616 627 0:18 / /dev/mqueue rw,nosuid,nodev,noexec,relatime master:15 - mqueue mqueue rw,seclabel
> 624 600 0:29 /home /home rw,relatime master:45 - btrfs /dev/vda5 rw,seclabel,compress=zstd:1,space_cache,subvolid=258,subvol=/home
> 625 600 252:2 / /boot rw,relatime master:47 - ext4 /dev/vda2 rw,seclabel
> 626 625 252:3 / /boot/efi rw,relatime master:49 - vfat /dev/vda3 rw,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro
> 627 612 0:52 / /dev rw,nosuid,relatime - tmpfs devfs rw,seclabel,size=64k,mode=755,inode64

It's just that libvirt mounts a tmpfs over it; therefore, the original /dev is still kind of in the namespace, except not accessible to processes because of this tmpfs.

> 629 612 0:54 / /dev/hugepages1G rw,relatime master:326 - hugetlbfs hugetlbfs rw,seclabel,pagesize=1024M

Now, when systemd mounts /dev/hugepages1G this is propagated into that lower devtmpfs, as this line shows. However, it's not accessible to QEMU or any other process running inside the namespace, because there's still a tmpfs mounted on top of the original devtmpfs.

> 665 627 0:54 /libvirt/qemu/1-vm1 /dev/hugepages1G/libvirt/qemu/1-vm1 rw,relatime master:326 - hugetlbfs hugetlbfs rw,seclabel,pagesize=1024M

And this is the result of my fix - libvirt bind mounts the hugetlbfs into the namespace. This is a submount of the tmpfs and hence accessible to QEMU. While libvirt could umount /dev/hugepages1G, it's not necessary because it's not accessible to anything, and it's also just cosmetics. What libvirt could do (but then again, just cosmetics, it has no effect on QEMU) is to bind mount /dev/hugepages1G instead of the domain's private path /dev/hugepages1G/libvirt/qemu/1-vm1. But that would not bring anything new and would require a non-trivial amount of code rewrite.

I hope this clears up your concerns.
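For reference, whether that per-domain bind mount is actually reachable from inside the guest's mount namespace can be checked directly. A rough sketch, reusing the domain path from the transcript above and assuming a single QEMU process (the binary is qemu-system-x86_64 on Fedora, qemu-kvm on RHEL):

# nsenter -t "$(pidof qemu-system-x86_64)" -m -- findmnt /dev/hugepages1G/libvirt/qemu/1-vm1

If the bind mount created by the fix is in place, findmnt prints a hugetlbfs entry for that path, matching mountinfo line 665 above; without it, findmnt prints nothing because the path is not a mount point inside the namespace.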
Hi Michal, I found this error when testing this issue:
libvirt and qemu versions:
libvirt v8.7.0-138-gfa2a7f888c
qemu-kvm-7.1.0-3.fc38.x86_64
Test steps:
1. Define a guest with below memorybacking xml.
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
2. Mount 1G hugepage path
mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
3. Start the VM
# virsh start vm1
Domain 'vm1' started
4. Prepare memory device hotplug xml like below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>1</node>
  </target>
</memory>
5. Hotplug dimm memory device:
# virsh attach-device vm1 dimm1G.xml
error: Failed to attach device from dimm1G.xml
error: internal error: unable to execute QEMU command 'object-add': can't open backing store /dev/hugepages1G/libvirt/qemu/2-vm1 for guest RAM: Permission denied
BTW, if the 1G hugepage path is mounted and then a guest with 1G hugepage memory backing is started, the same error occurs.
(In reply to liang cong from comment #23)
> Hi Michal, I found this error when testing this issue:

Hm. I can't reproduce. With these steps I get:

virsh # attach-device ble dimm.xml
error: Failed to attach device from dimm.xml
error: internal error: Unable to find any usable hugetlbfs mount for 1048576 KiB

because the hugetlbfs was mounted only after libvirtd/virtqemud was started. But after I restart the daemon I get:

virsh # attach-device ble dimm.xml
error: Reconnected to the hypervisor
Device attached successfully

Maybe I'm missing something? Also, can you please see whether there is an error message in audit.log that would correspond to this error?

BTW: I'm contemplating removing this code from the reconnect phase completely, because both the domain startup and the domain hotplug code now handle creating that domain private path (/dev/hugepages1G/...). I haven't posted a patch yet, because I want to test it first, but once I'm done I'll post it.

Alright, it passed my testing, so I've posted it onto the list: https://listman.redhat.com/archives/libvir-list/2022-September/234543.html
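In other words, when a new hugetlbfs is mounted on a host where the daemon is already running, the working sequence is roughly the following (commands taken from the earlier comments; virtqemud applies to the modular daemons, libvirtd to the monolithic one):

# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
# systemctl restart virtqemud        (or: systemctl restart libvirtd)
# virsh attach-device vm1 dimm1G.xml

Restarting the daemon makes libvirt re-read the host's hugetlbfs mounts, so the subsequent hotplug can find a usable mount for the 1048576 KiB page size.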
Found an issue on:
# rpm -q libvirt qemu-kvm
libvirt-8.0.0-11.module+el8.8.0+16835+1d966b61.x86_64
qemu-kvm-6.2.0-21.module+el8.8.0+16781+9f4724c2.x86_64
1. Define a guest with below memorybacking xml.
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
2. Mount 1G hugepage path
mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
3. Start vm
# virsh start vm1
Domain vm1 started
4. Prepare memory device hotplug xml like below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>
5. Attach the dimm memory device defined in step 4.
# virsh attach-device vm1 dimm1G.xml
Device attached successfully
6. Shutoff the vm
# virsh destroy vm1
Domain 'vm1' destroyed
7. umount and mount 1G hugepage path again
# umount /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
8. start the vm
# virsh start vm1
Domain 'vm1' started
9. Attach the dimm memory device defined in step 4 again
# virsh attach-device vm1 dimm1G.xml
error: Failed to attach device from dimm1G.xml
error: internal error: unable to execute QEMU command 'object-add': can't open backing store /dev/hugepages1G/libvirt/qemu/3-vm1 for guest RAM: Permission denied
@mprivozn could you help to check this issue?
More info about comment#34: if libvirtd is restarted between step 7 and step 8, then the issue is gone. But I found one extra issue: if libvirtd is restarted after step 8, then no matter how many times libvirtd is restarted, an error like 'can't open backing store /dev/hugepages1G/libvirt/qemu/3-vm1 for guest RAM: Permission denied' always occurs.

Verified on build:
rpm -q libvirt qemu-kvm
libvirt-8.0.0-11.module+el8.8.0+16835+1d966b61.x86_64
qemu-kvm-6.2.0-22.module+el8.8.0+16816+1d3555ec.x86_64
Verify steps:
1. Define a guest with below memorybacking xml.
<memoryBacking>
  <hugepages>
    <page size='2048' unit='KiB'/>
  </hugepages>
</memoryBacking>
2. Start the VM and stop libvirt
# virsh start vm1 && systemctl stop libvirtd
Domain vm1 started
Warning: Stopping libvirtd.service, but it can still be activated by:
libvirtd.socket
libvirtd-ro.socket
libvirtd-admin.socket
3. Mount 1G hugepage path
# mkdir /dev/hugepages1G
# mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages1G
4. Run virsh list; the guest is still in the running state.
# virsh -r list --all
Id Name State
----------------------
3 vm1 running
# virsh -r list --all
Id Name State
----------------------
3 vm1 running
5. Prepare memory device hotplug xml like below:
# cat dimm1G.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>1048576</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>
6. Hotplug dimm memory device:
# virsh attach-device vm1 dimm1G.xml
Device attached successfully
7. Prepare memory device with 2M hugepage source hotplug xml like below:
# cat dimm2M.xml
<memory model='dimm'>
  <source>
    <pagesize unit='KiB'>2048</pagesize>
    <nodemask>0-1</nodemask>
  </source>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>
8. Hotplug dimm memory device:
# virsh attach-device vm1 dimm2M.xml
Device attached successfully
9. Shutoff vm
# virsh destroy vm1
Domain vm1 destroyed
10. Restart libvirtd
# systemctl restart libvirtd
11. Start vm
# virsh start vm1
Domain 'vm1' started
Also checked the scenarios below:
Steps:
1. memory backing 2M guest vm start -> stop libvirt -> mount 1G path -> start libvirt -> hotplug 1G dimm -> restart vm -> restart libvirtd -> hotplug 1G dimm
2. mount 1G path -> memory backing 2M guest vm start -> restart libvirtd -> hotplug 1G dimm -> restart libvirtd -> restart vm -> hotplug 1G dimm
Tested with these settings: remember_owner=1 or 0, memfd memory backing, default memory backing, 1G hugepage memory backing, 1G hugepage path as /mnt/hugepages1G
Additional info:
1. Restart libvirt after mounting the hugepage path.
2. Umounting and re-mounting the hugepage path may cause another issue, bug#2134009.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2757