Bug 1839034 - Use of <memoryBacking><source type=memfd|shared/> is only honored when <numa> present
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.3
Hardware: Unspecified
OS: Unspecified
medium
unspecified
Target Milestone: rc
: 8.3
Assignee: Michal Privoznik
QA Contact: Jing Qi
URL:
Whiteboard:
Depends On: 1887368
Blocks:
Reported: 2020-05-22 11:18 UTC by Daniel Berrangé
Modified: 2021-05-25 06:42 UTC

Fixed In Version: libvirt-7.0.0-1.el8
Doc Text:
Clone Of:
: 1887368
Environment:
Last Closed: 2021-05-25 06:42:15 UTC
Type: Bug
Target Upstream Version: 7.0.0
Embargoed:


Description Daniel Berrangé 2020-05-22 11:18:01 UTC
Description of problem:

I took an existing guest with *NO* guest NUMA cells defined and added:

  <memoryBacking>
    <hugepages/>
    <source type='memfd'/>
    <allocation mode='immediate'/>
  </memoryBacking>


Upon starting the guest, I see QEMU is still launched using /dev/hugepages and not memfd, i.e. there is

   "-mem-path /dev/hugepages/libvirt/qemu/1-memtest"

in the args

If I add a further

  <cpu mode='host-model' check='partial'>
    <numa>
      <cell id='0' cpus='0-15' memory='262144' unit='KiB'/>
    </numa>
  </cpu>


then the use of memfd is honoured

-object memory-backend-memfd,id=ram-node0,hugetlb=yes,hugetlbsize=2097152,size=268435456


If we can't honour the memfd request in the non-NUMA case, we must raise an error

Version-Release number of selected component (if applicable):
libvirt-6.3.0-1.el8

How reproducible:
Always

Steps to Reproduce:
1. Set up a guest with <hugepages/> and <source type='memfd'/> in the memory backing config, but no guest NUMA cells

Actual results:
Guest is started using /dev/hugepages

Expected results:
Either guest should use memory-backend-memfd, or it should refuse to start

Additional info:

NB, memory-backend-memfd is not currently enabled in QEMU in RHEL-AV. This is requested in https://bugzilla.redhat.com/show_bug.cgi?id=1839030
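
The configuration that triggers this can be spotted ahead of time by inspecting the domain XML. Below is a minimal sketch in Python, not anything libvirt ships: the function name and the trimmed-down domain documents are made up for illustration, and the check is deliberately simplified to "memfd requested, no <cell> under <cpu>/<numa>" (it ignores <memory> devices).

```python
import xml.etree.ElementTree as ET


def requests_memfd_without_numa(domain_xml: str) -> bool:
    """True if <memoryBacking> asks for memfd but no guest NUMA cell is defined."""
    root = ET.fromstring(domain_xml.strip())
    source = root.find("./memoryBacking/source")
    wants_memfd = source is not None and source.get("type") == "memfd"
    has_numa_cell = root.find("./cpu/numa/cell") is not None
    return wants_memfd and not has_numa_cell


# Trimmed-down domain XML; only the elements relevant to this bug are kept.
NO_NUMA = """
<domain type='kvm'>
  <memoryBacking>
    <hugepages/>
    <source type='memfd'/>
    <allocation mode='immediate'/>
  </memoryBacking>
</domain>
"""

WITH_NUMA = """
<domain type='kvm'>
  <memoryBacking>
    <hugepages/>
    <source type='memfd'/>
    <allocation mode='immediate'/>
  </memoryBacking>
  <cpu mode='host-model' check='partial'>
    <numa>
      <cell id='0' cpus='0-15' memory='262144' unit='KiB'/>
    </numa>
  </cpu>
</domain>
"""

if __name__ == "__main__":
    print(requests_memfd_without_numa(NO_NUMA))
    print(requests_memfd_without_numa(WITH_NUMA))
```

A guest for which this returns True is one where, with the affected libvirt version, the memfd request is silently replaced by -mem-path.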

Comment 1 Michal Privoznik 2020-06-11 11:53:41 UTC
I believe that after my patches

https://www.redhat.com/archives/libvir-list/2020-May/msg01114.html

the issue will be resolved. But not really - it would be only papered over, because my patches enable memory-backend-* iff QEMU_CAPS_MACHINE_MEMORY_BACKEND is set (qemu-5.0 and higher) - patch 5/8. Still worth fixing for older qemus IMO.

Comment 2 Cole Robinson 2020-08-19 18:04:25 UTC
This affects source type=shared too AFAIK, and use of <hugepages> is also an option. Changing title to reflect that

Comment 3 Cole Robinson 2020-08-20 17:05:45 UTC
That <hugepages> bit was incorrect. Technically it's <numa> or <memory> device

Comment 4 Michal Privoznik 2020-09-17 10:09:40 UTC
I just realized that patches I'm mentioning in comment 1 are our only hope (I've sent a v2 of them, btw). The reason is we can't simply switch from -mem-path + -m to memory-backend-{file,memfd}, or from -m to memory-backend-ram because we did that in the past and had to revert it. For instance:

https://gitlab.com/libvirt/libvirt/-/commit/41c2aa729f0af084ede95ee9a06219a2dd5fb5df
https://gitlab.com/libvirt/libvirt/-/commit/ff3112f3dc2c276a7e387ff7bb86f4fbbdf7bf2c

The problem was that the IDs used for memory don't match between -m and memory-backend-*. Hence broken migration. What my patches implement is that libvirt looks up what the ID for -m is and sets it for memory-backend-*. I've tried migrating between -m and memory-backend-* (one libvirt had my patches applied, the other didn't) back and forth and it worked nicely. Version two of my patches can be found here:

https://www.redhat.com/archives/libvir-list/2020-September/msg00978.html
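
The idea of reusing the default RAM ID can be sketched as follows. This is an illustration in Python, not libvirt's actual code: build_memory_args is a hypothetical helper, and the machine type and the "pc.ram" ID are taken from the command lines posted later in this bug. The point is that the memory-backend object carries the same ID QEMU would have used for plain -m, so the RAM block name stays the same on both sides of a migration.

```python
from typing import List


def build_memory_args(machine: str, mem_mib: int, default_ram_id: str) -> List[str]:
    """Build the memory-related QEMU arguments, reusing the machine's default RAM ID."""
    size_bytes = mem_mib * 1024 * 1024  # -m is given in MiB, size= is in bytes
    return [
        "-machine", f"{machine},memory-backend={default_ram_id}",
        "-m", str(mem_mib),
        "-object", f"memory-backend-memfd,id={default_ram_id},size={size_bytes}",
    ]


if __name__ == "__main__":
    # "pc.ram" stands in for the ID that QEMU reports for the machine type.
    print(" ".join(build_memory_args("pc-i440fx-4.0", 4096, "pc.ram")))
```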

Comment 5 Michal Privoznik 2020-10-08 10:58:34 UTC
Patches were merged upstream as:

88957116c9 qemu: Use memory-backend-* for regular guest memory
b647654cbb qemu: Track default-ram-id machine attribute
d1ffc8cd3e qemuBuildMemoryBackendProps: Fix const correctness
a658a4bdf7 qemuBuildMemoryBackendProps: Prealloc mem for memfd backend
0217c5a6b4 qemuBuildMemoryBackendProps: Respect //memoryBacking/allocation/@mode=immediate
eda5cc7a62 qemuBuildMemoryBackendProps: Move @prealloc setting to backend agnostic part

v6.8.0-27-g88957116c9

And this commit is needed too:

0c8ab47847 qemu: Don't generate '-machine memory-backend' and '-numa memdev'

v6.8.0-232-g0c8ab47847

Comment 7 Jing Qi 2020-10-12 06:59:49 UTC
I built upstream libvirt 6.9.0 and tested it with qemu-kvm-5.1.0-12.module+el8.3.0+8338+cbcb1a4b.x86_64.

Using a domain XML with the configuration below:
 <memoryBacking>
    <hugepages/>
    <source type='memfd'/>
    <allocation mode='immediate'/>
  </memoryBacking>
  <vcpu placement='static'>3</vcpu>
 ....
  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>qemu64</model>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='lahf_lm'/>
    <feature policy='disable' name='svm'/>
  </cpu>

From the qemu cmd line, it's still using "-m 6632" & "-mem-path /dev/hugepages/libvirt/qemu/5-vm3". 

qemu       77856       1 99 02:53 ?        00:00:26 /usr/libexec/qemu-kvm -name guest=vm3,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-5-vm3/master-key.aes -machine pc-q35-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off -cpu qemu64,x2apic=on,hypervisor=on,lahf-lm=on,svm=off -m 6632 -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu/5-vm3 -overcommit mem-lock=off -smp 3,sockets=3,cores=1,threads=1 -uuid a4a4041d-4e7f-40fb-9b22-f8a42e27d7a3 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=26,server,nowait


Can you tell what your domain xml looks like when you tested migration "between -m and memory-backend-*" in comment 4?

Comment 8 Michal Privoznik 2020-10-12 09:10:09 UTC
(In reply to Jing Qi from comment #7)
> I build the upstream version - libvirt version: 6.9.0  & qemu version
> :qemu-kvm-5.1.0-12.module+el8.3.0+8338+cbcb1a4b.x86_64.
> 
> Use the domain xml with below configuration -
>  <memoryBacking>
>     <hugepages/>
>     <source type='memfd'/>
>     <allocation mode='immediate'/>
>   </memoryBacking>
>   <vcpu placement='static'>3</vcpu>
>  ....
>   <cpu mode='custom' match='exact' check='full'>
>     <model fallback='forbid'>qemu64</model>
>     <feature policy='require' name='x2apic'/>
>     <feature policy='require' name='hypervisor'/>
>     <feature policy='require' name='lahf_lm'/>
>     <feature policy='disable' name='svm'/>
>   </cpu>
> 
> From the qemu cmd line, it's still using "-m 6632" & "-mem-path
> /dev/hugepages/libvirt/qemu/5-vm3". 

Just for the record, we have to continue using -m because when I removed it qemu was unhappy.

But thanks for the report! Apparently the QEMU counterpart is missing. I mean, there is this patch in QEMU that exposes the default RAM ID:

https://git.qemu.org/?p=qemu.git;a=commit;h=c556600598afc6e90ae52a2e9ce910b8842244c5

and it missed qemu-kvm-5.1.0 and wasn't backported either. I've tested with upstream qemu and libvirt and the XML produces expected output:

-machine pc-i440fx-4.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu host,migratable=on \
-m 4096 \
-object memory-backend-memfd,id=pc.ram,hugetlb=yes,hugetlbsize=2097152,prealloc=yes,size=4294967296,host-nodes=0,policy=bind \

I will clone this bug over to QEMU so that the commit can be backported.

Comment 10 Jing Qi 2020-10-12 10:29:12 UTC
Yes. With the qemu scratch build and upstream libvirt - version 6.9.0

The relevant part of the QEMU command line is:

-machine pc-q35-rhel8.3.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu qemu64,x2apic=on,hypervisor=on,lahf-lm=on,svm=off \
-m 6632 \
-object memory-backend-memfd,id=pc.ram,hugetlb=yes,hugetlbsize=2097152,prealloc=yes,size=6954156032 \
-overcommit mem-lock=off \
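
A quick way to check such a command line is to pull out the memory-backend object and compare its size= (bytes) with -m (MiB). The following is a rough Python sketch with hypothetical helper names; it splits on whitespace only, which is enough for the excerpt above but would not cope with quoted arguments.

```python
from typing import Dict, Optional, Tuple

# The command-line excerpt from this comment, joined into one string.
CMDLINE = (
    "-machine pc-q35-rhel8.3.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram "
    "-cpu qemu64,x2apic=on,hypervisor=on,lahf-lm=on,svm=off "
    "-m 6632 "
    "-object memory-backend-memfd,id=pc.ram,hugetlb=yes,hugetlbsize=2097152,"
    "prealloc=yes,size=6954156032 "
    "-overcommit mem-lock=off"
)


def memory_backend(cmdline: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (backend type, properties) of the first -object memory-backend-*."""
    args = cmdline.split()
    for flag, value in zip(args, args[1:]):
        if flag == "-object" and value.startswith("memory-backend-"):
            kind, _, rest = value.partition(",")
            props = dict(item.split("=", 1) for item in rest.split(",") if "=" in item)
            return kind, props
    return None


def m_option_mib(cmdline: str) -> Optional[int]:
    """Return the value passed to -m, interpreted as MiB."""
    args = cmdline.split()
    for flag, value in zip(args, args[1:]):
        if flag == "-m":
            return int(value)
    return None


if __name__ == "__main__":
    kind, props = memory_backend(CMDLINE)
    print(kind, props["id"])
    # 6632 MiB * 1048576 bytes/MiB should equal the backend's size= value.
    print(int(props["size"]) == m_option_mib(CMDLINE) * 1024 * 1024)
```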

Comment 11 Michal Privoznik 2020-12-03 04:58:16 UTC
Moving to POST per comment 10 and comment 5.

Comment 12 Jing Qi 2020-12-14 09:43:30 UTC
Tested with libvirt-daemon-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64 & 
qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9.x86_64

S1: Start the VM with <access mode='shared'/> and no NUMA node configuration:

  <memoryBacking>
    <access mode='shared'/>
  </memoryBacking>

/usr/libexec/qemu-kvm \
-name guest=rhel8,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-4-rhel8/master-key.aes \
-machine pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-cpu IvyBridge-IBRS,ss=on,pcid=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaveopt=on,pdpe1gb=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,hv-time,hv-vapic,hv-spinlocks=0x1000 \
-m 2560 \
-object memory-backend-file,id=pc.ram,mem-path=/var/lib/libvirt/qemu/ram/4-rhel8/pc.ram,share=yes,size=2684354560 \
-overcommit mem-lock=off \
-smp 4,sockets=4,cores=1,threads=1 \
-uuid 60d7e03d-0d7b-42c6-ac4c-4da98dd6a5ec \

S2: Edit the VM to add <hugepages/>, keeping <access mode='shared'/> and no NUMA node configuration:

 <memoryBacking>
    <hugepages/>
    <access mode='shared'/>
  </memoryBacking>

virsh edit rhel8
error: unsupported configuration: memory access mode 'shared' not supported without guest numa node
Failed. Try again? [y,n,i,f,?]: 
error: unsupported configuration: memory access mode 'shared' not supported without guest numa node
Failed. Try again? [y,n,i,f,?]: 

S3: Change the memoryBacking part to the following:
 <memoryBacking>
    <hugepages/>
    <source type='memfd'/>
 </memoryBacking>

# virsh start rhel8
Domain rhel8 started


Michal, could you please check whether the S1/S2 results are expected?

Comment 13 Michal Privoznik 2020-12-14 10:52:59 UTC
(In reply to Jing Qi from comment #12)
> Tested with libvirt-daemon-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64 & 
> qemu-kvm-5.2.0-0.module+el8.4.0+8855+a9e237a9.x86_64
> 
> S1: Start vm with <access mode='shared'/> and no numa node configuration -
> 
>   <memoryBacking>
>     <access mode='shared'/>
>   </memoryBacking>
> 
> /usr/libexec/qemu-kvm \
> -name guest=rhel8,debug-threads=on \
> -S \
> -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-4-rhel8/
> master-key.aes \
> -machine
> pc-i440fx-rhel7.6.0,accel=kvm,usb=off,dump-guest-core=off,memory-backend=pc.
> ram \
> -cpu
> IvyBridge-IBRS,ss=on,pcid=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,md-
> clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaveopt=on,pdpe1gb=on,
> ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,hv-
> time,hv-vapic,hv-spinlocks=0x1000 \
> -m 2560 \
> -object
> memory-backend-file,id=pc.ram,mem-path=/var/lib/libvirt/qemu/ram/4-rhel8/pc.
> ram,share=yes,size=2684354560 \
> -overcommit mem-lock=off \
> -smp 4,sockets=4,cores=1,threads=1 \
> -uuid 60d7e03d-0d7b-42c6-ac4c-4da98dd6a5ec \

I am not seeing anything problematic here. Did you have something specific in mind?

> 
> s2:  Edit vm with adding <hugepages/> and keep access mode='shared' and no
> numa node configuration
> 
>  <memoryBacking>
>     <hugepages/>
>     <access mode='shared'/>
>   </memoryBacking>
> 
> virsh edit rhel8
> error: unsupported configuration: memory access mode 'shared' not supported
> without guest numa node
> Failed. Try again? [y,n,i,f,?]: 
> error: unsupported configuration: memory access mode 'shared' not supported
> without guest numa node
> Failed. Try again? [y,n,i,f,?]: 

Oh yes, this is a bug. Will post a patch shortly.

> 
> s3: change the memoryBacking part  to below part -
>  <memoryBacking>
>     <hugepages/>
>     <source type='memfd'/>
>  </memoryBacking>
> 
> # virsh start rhel8
> Domain rhel8 started

And this is also expected.

Comment 14 Michal Privoznik 2020-12-14 12:10:12 UTC
Patch posted upstream to fix scenario 2 from comment 12:

https://www.redhat.com/archives/libvir-list/2020-December/msg00661.html

Comment 15 Michal Privoznik 2020-12-14 13:03:17 UTC
Merged upstream:

bff2ad5d6b qemu: Relax validation for mem->access if guest has no NUMA

v6.10.0-206-gbff2ad5d6b

Comment 16 Jing Qi 2020-12-15 05:59:56 UTC
Tested S3 in comment 12 with upstream libvirt version: v6.10.0-206-gbff2ad5d6b

S3 test result:
the domain is started.

The memory part of the QEMU command line is:
 
-m 2560 -object memory-backend-file,id=pc.ram,mem-path=/dev/hugepages/libvirt/qemu/3-pc,share=yes,prealloc=yes,size=2684354560

Comment 20 Jing Qi 2021-01-18 09:30:07 UTC
Verified in versions:
libvirt-daemon-7.0.0-1.module+el8.4.0+9464+3e71831a.x86_64
qemu-kvm-5.2.0-2.module+el8.4.0+9186+ec44380f.x86_64

S1: Start the VM with memfd backing and shared access, and no NUMA cell configuration.

1. Start the VM with the following memoryBacking and no NUMA cell:
<memoryBacking>
    <source type='memfd'/>
    <access mode='shared'/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
...
<cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>qemu64</model>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='lahf_lm'/>
    <feature policy='disable' name='svm'/>
  </cpu>

# virsh start vm1
Domain 'vm1' started

In the QEMU command line, "memory-backend-memfd" and "share=yes" are present as expected:
 -m 2048 -object memory-backend-memfd,id=pc.ram,share=yes,size=2147483648 -overcommit

Comment 22 errata-xmlrpc 2021-05-25 06:42:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098

