Bug 2116496

Summary: Can't run when memory backing with hugepages and backend type memfd
Product: Red Hat Enterprise Linux 9
Reporter: smitterl
Component: qemu-kvm
Assignee: Thomas Huth <thuth>
qemu-kvm sub component: General
QA Contact: smitterl
Status: CLOSED ERRATA
Docs Contact: Jiri Herrmann <jherrman>
Severity: low
Priority: low
CC: dhildenb, mrezanin, thuth, virt-maint, virt-qe-z
Version: 9.1
Keywords: Automation, Triaged
Target Milestone: rc
Target Release: 9.2
Hardware: s390x
OS: Linux
Whiteboard:
Fixed In Version: qemu-kvm-7.2.0-1.el9
Doc Type: Bug Fix
Doc Text:
.VMs on IBM Z no longer fail to start when using `memfd` memory backing

Previously, on IBM Z hosts, virtual machines (VMs) failed to boot if they were configured to use the `memfd` type of hugepage memory backing, for example as follows:

----
<memoryBacking>
  <hugepages/>
  <source type='memfd'/>
</memoryBacking>
----

With this update, the underlying cause has been fixed, and the affected VMs now start correctly.
Story Points: ---
Clone Of:
: 2117149
Environment:
Last Closed: 2023-05-09 07:20:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version: 7.2
Embargoed:

Description smitterl 2022-08-08 15:59:45 UTC
Description of problem:
A machine with memory backing via hugepages using the memfd backend doesn't run.

Version-Release number of selected component (if applicable):
qemu-kvm-7.0.0-9.el9.s390x

How reproducible:
100%


Steps to Reproduce:
0. Make sure hugepages are enabled for KVM (modprobe kvm hpage=1)
1. Start a machine with -object {"qom-type":"memory-backend-memfd","id":"s390.ram","hugetlb":true,"hugetlbsize":1048576,"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":1073741824}
OR equivalently, in libvirt define
<memoryBacking>
  <hugepages/>
  <source type='memfd'/>
</memoryBacking>
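
For orientation, the numbers in the -object line above are plain byte counts. A minimal sketch of the arithmetic follows; the helper name hugepages_needed is made up for illustration and is not part of QEMU or libvirt:

```python
def hugepages_needed(size_bytes: int, hugetlbsize_bytes: int) -> int:
    """Number of huge pages required to back size_bytes, rounding up."""
    return -(-size_bytes // hugetlbsize_bytes)


size = 1073741824      # "size" from the -object line above
hugetlbsize = 1048576  # "hugetlbsize" from the -object line above

print(size // 2**30, "GiB of guest RAM")
print(hugetlbsize // 2**20, "MiB per huge page")
print(hugepages_needed(size, hugetlbsize), "huge pages needed")
```

Since the backend is created with "prealloc":true, the host presumably needs at least that many free huge pages of the requested size before the guest is started.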

Actual results:
The machine is immediately paused; there's an error: kvm run failed Bad address
PSW=mask 0000000180000000 addr 000000003fe00510
R00=0000000000000000 R01=0000000000000000 R02=0000000000000000 R03=0000000000000000
R04=0000000000000000 R05=0000000000000000 R06=0000000000000000 R07=0000000000000000
R08=0000000000000000 R09=0000000000000000 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
C00=00000000000000e0 C01=0000000000000000 C02=0000000000000000 C03=0000000000000000
C04=0000000000000000 C05=0000000000000000 C06=0000000000000000 C07=0000000000000000
C08=0000000000000000 C09=0000000000000000 C10=0000000000000000 C11=0000000000000000
C12=0000000000000000 C13=0000000000000000 C14=00000000c2000000 C15=0000000000000000


Expected results:
The machine runs.

Additional info:
a) The machine runs successfully on x86_64 (used qemu-kvm-6.2.0-11.el9_0.3.x86_64).
b) It reproduces with qemu-kvm-6.2.0-11.el9_0.2.s390x.
c) Not considering this critical at this point, as memory-backend-file can still be used successfully instead of memory-backend-memfd.
d) Using memory-backend-memfd without hugepages works, too.

Comment 1 Thomas Huth 2022-08-09 06:29:11 UTC
Just to be sure: Did you load your kvm kernel module with "hpage=1" ?

Comment 2 smitterl 2022-08-09 07:30:34 UTC
(In reply to Thomas Huth from comment #1)
> Just to be sure: Did you load your kvm kernel module with "hpage=1" ?

Yes :)

Comment 3 Thomas Huth 2022-08-09 09:56:02 UTC
FWIW, I can reproduce the issue. With memory-backend-file (which is working), the QEMU command line looks like this:

/usr/libexec/qemu-kvm \
-name guest=rhel9,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-4-rhel9/master-key.aes"}' \
-machine s390-ccw-virtio-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=s390.ram \
-accel kvm \
-cpu gen15a-base,aen=on,cmmnt=on,vxpdeh=on,aefsi=on,diag318=on,csske=on,mepoch=on,msa9=on,msa8=on,msa7=on,msa6=on,msa5=on,msa4=on,msa3=on,msa2=on,msa1=on,sthyi=on,edat=on,ri=on,deflate=on,edat2=on,etoken=on,vx=on,ipter=on,mepochptff=on,ap=on,vxeh=on,vxpd=on,esop=on,msa9_pckmo=on,vxeh2=on,esort=on,apqi=on,apft=on,els=on,iep=on,apqci=on,cte=on,ais=on,bpb=on,gs=on,ppa15=on,zpci=on,sea_esop2=on,te=on,cmm=on \
-m 4096 \
-object '{"qom-type":"memory-backend-file","id":"s390.ram","mem-path":"/dev/hugepages/libvirt/qemu/4-rhel9","x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}' \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 6ad23529-478b-44d6-9140-3d5bda239a53 \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=32,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device '{"driver":"virtio-serial-ccw","id":"virtio-serial0","devno":"fe.0.0002"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0000","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}' \
-netdev tap,fd=33,vhost=on,vhostfd=35,id=hostnet0 \
-device '{"driver":"virtio-net-ccw","netdev":"hostnet0","id":"net0","mac":"52:54:00:c9:ff:7d","devno":"fe.0.0001"}' \
-chardev socket,id=charchannel0,fd=31,server=on,wait=off \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \
-chardev pty,id=charconsole0 \
-device '{"driver":"sclpconsole","chardev":"charconsole0","id":"console0"}' \
-audiodev '{"id":"audio1","driver":"none"}' \
-device '{"driver":"virtio-balloon-ccw","id":"balloon0","devno":"fe.0.0003"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-ccw","rng":"objrng0","id":"rng0","devno":"fe.0.0004"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on

With memory-backend-memfd (which is not working), the command line looks like this:

/usr/libexec/qemu-kvm \
-name guest=rhel9,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-5-rhel9/master-key.aes"}' \
-machine s390-ccw-virtio-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=s390.ram \
-accel kvm \
-cpu gen15a-base,aen=on,cmmnt=on,vxpdeh=on,aefsi=on,diag318=on,csske=on,mepoch=on,msa9=on,msa8=on,msa7=on,msa6=on,msa5=on,msa4=on,msa3=on,msa2=on,msa1=on,sthyi=on,edat=on,ri=on,deflate=on,edat2=on,etoken=on,vx=on,ipter=on,mepochptff=on,ap=on,vxeh=on,vxpd=on,esop=on,msa9_pckmo=on,vxeh2=on,esort=on,apqi=on,apft=on,els=on,iep=on,apqci=on,cte=on,ais=on,bpb=on,gs=on,ppa15=on,zpci=on,sea_esop2=on,te=on,cmm=on \
-m 4096 \
-object '{"qom-type":"memory-backend-memfd","id":"s390.ram","hugetlb":true,"hugetlbsize":1048576,"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}' \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 6ad23529-478b-44d6-9140-3d5bda239a53 \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=32,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device '{"driver":"virtio-serial-ccw","id":"virtio-serial0","devno":"fe.0.0002"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0000","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}' \
-netdev tap,fd=33,vhost=on,vhostfd=35,id=hostnet0 \
-device '{"driver":"virtio-net-ccw","netdev":"hostnet0","id":"net0","mac":"52:54:00:c9:ff:7d","devno":"fe.0.0001"}' \
-chardev socket,id=charchannel0,fd=31,server=on,wait=off \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \
-chardev pty,id=charconsole0 \
-device '{"driver":"sclpconsole","chardev":"charconsole0","id":"console0"}' \
-audiodev '{"id":"audio1","driver":"none"}' \
-device '{"driver":"virtio-balloon-ccw","id":"balloon0","devno":"fe.0.0003"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-ccw","rng":"objrng0","id":"rng0","devno":"fe.0.0004"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on

The only difference is basically the line with "-object '{"qom-type":"memory-backend-...'.
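
To make that difference explicit, here is a small sketch that parses the two memory backend objects copied from the command lines above and lists the keys whose values differ. The helper differing_keys is a name chosen for this illustration only. (The master-key path of the "secret" object also differs between the two runs, domain-4 versus domain-5, but that is not part of the memory backend.)

```python
import json

# The two -object arguments, copied from the command lines above.
file_backend = json.loads(
    '{"qom-type":"memory-backend-file","id":"s390.ram",'
    '"mem-path":"/dev/hugepages/libvirt/qemu/4-rhel9",'
    '"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,'
    '"size":4294967296}'
)
memfd_backend = json.loads(
    '{"qom-type":"memory-backend-memfd","id":"s390.ram",'
    '"hugetlb":true,"hugetlbsize":1048576,'
    '"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,'
    '"size":4294967296}'
)

_MISSING = object()  # marks a key that is absent from one of the objects


def differing_keys(a: dict, b: dict) -> list:
    """Return the sorted keys that are missing on one side or differ in value."""
    return sorted(
        k for k in a.keys() | b.keys()
        if a.get(k, _MISSING) != b.get(k, _MISSING)
    )


for key in differing_keys(file_backend, memfd_backend):
    print(key, "| file:", file_backend.get(key), "| memfd:", memfd_backend.get(key))
```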

Comment 4 Thomas Huth 2022-08-09 10:14:02 UTC
For reproducing without libvirt, these QEMU command lines can be used, too:

- memory-backend-file (working):

  qemu-system-s390x --accel kvm -m 4G -nographic -hda rhel8.qcow2 \
   -M s390-ccw-virtio,memory-backend=s390.ram \
   -object '{"qom-type":"memory-backend-file","id":"s390.ram","mem-path":"/dev/hugepages/","x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}'

- memory-backend-memfd (non-working):

  qemu-system-s390x --accel kvm -m 4G -nographic -hda rhel8.qcow2 \
   -M s390-ccw-virtio,memory-backend=s390.ram \
   -object '{"qom-type":"memory-backend-memfd","id":"s390.ram","hugetlb":true,"hugetlbsize":1048576,"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}'

Comment 5 Thomas Huth 2022-08-09 10:21:59 UTC
Seems like s390_set_max_pagesize() in the QEMU sources does not activate the huge pages since qemu_maxrampagesize() only returns 4096 on s390x ... and looking a little bit closer, this happens because host_memory_backend_pagesize() is only able to deal with objects of type memory-backend-file...
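
The behaviour described above can be illustrated with a toy model. This is not QEMU code and not the upstream patch: the function names, the fake path lookup, and the 1 MiB default are assumptions made only for this sketch. The first function mimics a page-size lookup that only understands memory-backend-file; the second is a hypothetical variant that also looks at a hugetlb memfd backend.

```python
HOST_PAGE_SIZE = 4096     # base page size, the value reported in the comment above
ASSUMED_HUGE_PAGE = 1048576  # assumption: 1 MiB huge pages, as in the -object lines


def fake_path_pagesize(path: str) -> int:
    # Stand-in for a real filesystem query; assumes hugetlbfs at /dev/hugepages.
    return ASSUMED_HUGE_PAGE if path.startswith("/dev/hugepages") else HOST_PAGE_SIZE


def pagesize_file_only(backend: dict) -> int:
    """Toy model of the reported behaviour: only file backends are recognised."""
    if backend["qom-type"] == "memory-backend-file":
        return fake_path_pagesize(backend["mem-path"])
    return HOST_PAGE_SIZE  # everything else falls back to the base page size


def pagesize_with_memfd(backend: dict) -> int:
    """Hypothetical variant that also honours hugetlb memfd backends."""
    if backend["qom-type"] == "memory-backend-memfd" and backend.get("hugetlb"):
        return backend.get("hugetlbsize", ASSUMED_HUGE_PAGE)
    return pagesize_file_only(backend)


file_backend = {"qom-type": "memory-backend-file", "mem-path": "/dev/hugepages/"}
memfd_backend = {"qom-type": "memory-backend-memfd", "hugetlb": True,
                 "hugetlbsize": 1048576}

print("file-only lookup, file backend: ", pagesize_file_only(file_backend))
print("file-only lookup, memfd backend:", pagesize_file_only(memfd_backend))
print("memfd-aware lookup, memfd backend:", pagesize_with_memfd(memfd_backend))
```

In this model the file-only lookup reports 4096 for the memfd backend even though it is backed by 1 MiB pages, which matches the symptom that the huge pages never get activated for the guest.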

Comment 6 Thomas Huth 2022-08-10 06:47:41 UTC
I suggested a patch upstream here: https://lore.kernel.org/qemu-devel/20220810063204.3589543-1-thuth@redhat.com/

Comment 7 Thomas Huth 2022-09-22 08:47:36 UTC
We'll get this fixed with the rebase to QEMU 7.2.

Comment 9 smitterl 2022-12-19 10:26:34 UTC
Pre-verified with:
qemu-kvm-7.2.0-1.el9.s390x

Automated test updated to pass on s390x:
https://github.com/autotest/tp-libvirt/pull/4683

Comment 17 errata-xmlrpc 2023-05-09 07:20:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2162