Bug 2116496
| Summary: | Can't run when memory backing with hugepages and backend type memfd | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | smitterl | |
| Component: | qemu-kvm | Assignee: | Thomas Huth <thuth> | |
| qemu-kvm sub component: | General | QA Contact: | smitterl | |
| Status: | CLOSED ERRATA | Docs Contact: | Jiri Herrmann <jherrman> | |
| Severity: | low | |||
| Priority: | low | CC: | dhildenb, mrezanin, thuth, virt-maint, virt-qe-z | |
| Version: | 9.1 | Keywords: | Automation, Triaged | |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ | |
| Target Release: | 9.2 | |||
| Hardware: | s390x | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | qemu-kvm-7.2.0-1.el9 | Doc Type: | Bug Fix | |
| Doc Text: |
.VMs on IBM Z no longer fail to start when using `memfd` memory backing
Previously, on IBM Z hosts, virtual machines (VMs) failed to boot if they were configured to use the `memfd` type of hugepage memory backing, for example as follows:
----
<memoryBacking>
<hugepages/>
<source type='memfd'/>
</memoryBacking>
----
With this update, the underlying cause has been fixed, and the affected VMs now start correctly.
|
Story Points: | --- | |
| Clone Of: | ||||
| : | 2117149 (view as bug list) | Environment: | ||
| Last Closed: | 2023-05-09 07:20:04 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | 7.2 | |
| Embargoed: | ||||
Just to be sure: Did you load your kvm kernel module with "hpage=1" ?

(In reply to Thomas Huth from comment #1)
> Just to be sure: Did you load your kvm kernel module with "hpage=1" ?

Yes :)

FWIW, I can reproduce the issue. With memory-backend-file (which is working), the QEMU command line looks like this:
/usr/libexec/qemu-kvm \
-name guest=rhel9,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-4-rhel9/master-key.aes"}' \
-machine s390-ccw-virtio-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=s390.ram \
-accel kvm \
-cpu gen15a-base,aen=on,cmmnt=on,vxpdeh=on,aefsi=on,diag318=on,csske=on,mepoch=on,msa9=on,msa8=on,msa7=on,msa6=on,msa5=on,msa4=on,msa3=on,msa2=on,msa1=on,sthyi=on,edat=on,ri=on,deflate=on,edat2=on,etoken=on,vx=on,ipter=on,mepochptff=on,ap=on,vxeh=on,vxpd=on,esop=on,msa9_pckmo=on,vxeh2=on,esort=on,apqi=on,apft=on,els=on,iep=on,apqci=on,cte=on,ais=on,bpb=on,gs=on,ppa15=on,zpci=on,sea_esop2=on,te=on,cmm=on \
-m 4096 \
-object '{"qom-type":"memory-backend-file","id":"s390.ram","mem-path":"/dev/hugepages/libvirt/qemu/4-rhel9","x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}' \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 6ad23529-478b-44d6-9140-3d5bda239a53 \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=32,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device '{"driver":"virtio-serial-ccw","id":"virtio-serial0","devno":"fe.0.0002"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0000","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}' \
-netdev tap,fd=33,vhost=on,vhostfd=35,id=hostnet0 \
-device '{"driver":"virtio-net-ccw","netdev":"hostnet0","id":"net0","mac":"52:54:00:c9:ff:7d","devno":"fe.0.0001"}' \
-chardev socket,id=charchannel0,fd=31,server=on,wait=off \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \
-chardev pty,id=charconsole0 \
-device '{"driver":"sclpconsole","chardev":"charconsole0","id":"console0"}' \
-audiodev '{"id":"audio1","driver":"none"}' \
-device '{"driver":"virtio-balloon-ccw","id":"balloon0","devno":"fe.0.0003"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-ccw","rng":"objrng0","id":"rng0","devno":"fe.0.0004"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
With memory-backend-memfd (which is not working), the command line looks like this:
/usr/libexec/qemu-kvm \
-name guest=rhel9,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-5-rhel9/master-key.aes"}' \
-machine s390-ccw-virtio-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=s390.ram \
-accel kvm \
-cpu gen15a-base,aen=on,cmmnt=on,vxpdeh=on,aefsi=on,diag318=on,csske=on,mepoch=on,msa9=on,msa8=on,msa7=on,msa6=on,msa5=on,msa4=on,msa3=on,msa2=on,msa1=on,sthyi=on,edat=on,ri=on,deflate=on,edat2=on,etoken=on,vx=on,ipter=on,mepochptff=on,ap=on,vxeh=on,vxpd=on,esop=on,msa9_pckmo=on,vxeh2=on,esort=on,apqi=on,apft=on,els=on,iep=on,apqci=on,cte=on,ais=on,bpb=on,gs=on,ppa15=on,zpci=on,sea_esop2=on,te=on,cmm=on \
-m 4096 \
-object '{"qom-type":"memory-backend-memfd","id":"s390.ram","hugetlb":true,"hugetlbsize":1048576,"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}' \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid 6ad23529-478b-44d6-9140-3d5bda239a53 \
-display none \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=32,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot strict=on \
-device '{"driver":"virtio-serial-ccw","id":"virtio-serial0","devno":"fe.0.0002"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"virtio-blk-ccw","devno":"fe.0.0000","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}' \
-netdev tap,fd=33,vhost=on,vhostfd=35,id=hostnet0 \
-device '{"driver":"virtio-net-ccw","netdev":"hostnet0","id":"net0","mac":"52:54:00:c9:ff:7d","devno":"fe.0.0001"}' \
-chardev socket,id=charchannel0,fd=31,server=on,wait=off \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \
-chardev pty,id=charconsole0 \
-device '{"driver":"sclpconsole","chardev":"charconsole0","id":"console0"}' \
-audiodev '{"id":"audio1","driver":"none"}' \
-device '{"driver":"virtio-balloon-ccw","id":"balloon0","devno":"fe.0.0003"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-ccw","rng":"objrng0","id":"rng0","devno":"fe.0.0004"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
The only difference is basically the line with -object '{"qom-type":"memory-backend-...'.
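To make the difference between the two -object arguments explicit, here is a minimal Python sketch that parses both JSON strings (copied verbatim from the libvirt-generated command lines above) and prints the keys whose values differ. The helper name `diff_objects` is made up for this illustration and is not part of QEMU or libvirt.

```python
import json

# The two -object arguments, copied from the command lines above.
FILE_BACKEND = (
    '{"qom-type":"memory-backend-file","id":"s390.ram",'
    '"mem-path":"/dev/hugepages/libvirt/qemu/4-rhel9",'
    '"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}'
)
MEMFD_BACKEND = (
    '{"qom-type":"memory-backend-memfd","id":"s390.ram",'
    '"hugetlb":true,"hugetlbsize":1048576,'
    '"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}'
)


def diff_objects(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ.

    A key that is missing on one side is reported with None on that side.
    (Hypothetical helper, for illustration only.)
    """
    da, db = json.loads(a), json.loads(b)
    return {
        key: (da.get(key), db.get(key))
        for key in sorted(set(da) | set(db))
        if da.get(key) != db.get(key)
    }


if __name__ == "__main__":
    for key, (file_val, memfd_val) in diff_objects(FILE_BACKEND, MEMFD_BACKEND).items():
        print(f"{key}: file={file_val!r} memfd={memfd_val!r}")
```

Only the backend type and the way the huge page source is specified (mem-path versus hugetlb/hugetlbsize) differ; id, prealloc and size are identical.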
For reproducing without libvirt, these QEMU command lines can be used, too:
- memory-backend-file (working):
qemu-system-s390x --accel kvm -m 4G -nographic -hda rhel8.qcow2 \
-M s390-ccw-virtio,memory-backend=s390.ram \
-object '{"qom-type":"memory-backend-file","id":"s390.ram","mem-path":"/dev/hugepages/","x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}'
- memory-backend-memfd (non-working):
qemu-system-s390x --accel kvm -m 4G -nographic -hda rhel8.qcow2 \
-M s390-ccw-virtio,memory-backend=s390.ram \
-object '{"qom-type":"memory-backend-memfd","id":"s390.ram","hugetlb":true,"hugetlbsize":1048576,"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":4294967296}'
Seems like s390_set_max_pagesize() in the QEMU sources does not activate the huge pages since qemu_maxrampagesize() only returns 4096 on s390x ... and looking a little bit closer, this happens because host_memory_backend_pagesize() is only able to deal with objects of type memory-backend-file...

I suggested a patch upstream here:
https://lore.kernel.org/qemu-devel/20220810063204.3589543-1-thuth@redhat.com/

We'll get this fixed with the rebase to QEMU 7.2.

Pre-verified with: qemu-kvm-7.2.0-1.el9.s390x

Automated test updated to pass on s390x: https://github.com/autotest/tp-libvirt/pull/4683

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2162
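The root-cause analysis in the comment above (qemu_maxrampagesize() returning only 4096 because host_memory_backend_pagesize() handles only memory-backend-file) can be illustrated with a small Python model. This is a simplified sketch and not QEMU's C code: the `Backend` class and every function name below are hypothetical, and the "corrected" lookup only shows the general idea of taking memfd hugetlb backing into account, not what the upstream patch actually does.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

BASE_PAGE_SIZE = 4096  # the value the comment reports for qemu_maxrampagesize()


@dataclass
class Backend:
    """Toy stand-in for a memory backend object (hypothetical)."""
    qom_type: str
    mem_path_pagesize: Optional[int] = None  # page size behind mem-path (file backend)
    hugetlbsize: Optional[int] = None        # hugetlbsize property (memfd backend)


def pagesize_file_only(backend: Backend) -> int:
    """Model of the reported behaviour: only memory-backend-file is understood."""
    if backend.qom_type == "memory-backend-file" and backend.mem_path_pagesize:
        return backend.mem_path_pagesize
    return BASE_PAGE_SIZE


def pagesize_any_backend(backend: Backend) -> int:
    """Hypothetical corrected lookup that also considers memfd hugetlb backing."""
    if backend.qom_type == "memory-backend-memfd" and backend.hugetlbsize:
        return backend.hugetlbsize
    return pagesize_file_only(backend)


def max_ram_pagesize(backends: Iterable[Backend],
                     lookup: Callable[[Backend], int]) -> int:
    """Largest page size over all backends, never below the base page size."""
    return max((lookup(b) for b in backends), default=BASE_PAGE_SIZE)


if __name__ == "__main__":
    memfd = Backend("memory-backend-memfd", hugetlbsize=1048576)
    print(max_ram_pagesize([memfd], pagesize_file_only))    # model of the bug
    print(max_ram_pagesize([memfd], pagesize_any_backend))  # model of a fix
```

In this model, the file-only lookup reports the base page size for a memfd backend with 1 MiB huge pages, which corresponds to the situation where huge page support is never activated for the guest.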
Description of problem:
A machine with memory backing via hugepages using the memfd backend doesn't run.

Version-Release number of selected component (if applicable):
qemu-kvm-7.0.0-9.el9.s390x

How reproducible:
100%

Steps to Reproduce:
0. Make sure hugepages are enabled for KVM (modprobe kvm hpage=1)
1. Start a machine with
-object {"qom-type":"memory-backend-memfd","id":"s390.ram","hugetlb":true,"hugetlbsize":1048576,"x-use-canonical-path-for-ramblock-id":false,"prealloc":true,"size":1073741824}
OR equivalently, in libvirt define
<memoryBacking>
<hugepages/>
<source type='memfd'/>
</memoryBacking>

Actual results:
The machine is immediately paused; there's an error:
kvm run failed Bad address
PSW=mask 0000000180000000 addr 000000003fe00510
R00=0000000000000000 R01=0000000000000000 R02=0000000000000000 R03=0000000000000000
R04=0000000000000000 R05=0000000000000000 R06=0000000000000000 R07=0000000000000000
R08=0000000000000000 R09=0000000000000000 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
C00=00000000000000e0 C01=0000000000000000 C02=0000000000000000 C03=0000000000000000
C04=0000000000000000 C05=0000000000000000 C06=0000000000000000 C07=0000000000000000
C08=0000000000000000 C09=0000000000000000 C10=0000000000000000 C11=0000000000000000
C12=0000000000000000 C13=0000000000000000 C14=00000000c2000000 C15=0000000000000000

Expected results:
The machine runs.

Additional info:
a) The machine runs successfully on x86_64 (used qemu-kvm-6.2.0-11.el9_0.3.x86_64).
b) It reproduces with qemu-kvm-6.2.0-11.el9_0.2.s390x.
c) Not considering critical at this point as we can still use the memory-backend-file instead of memory-backend-memfd successfully.
d) Using memory-backend-memfd without hugepages works, too.
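As a pre-flight aid for steps 0 and 1 of the reproduction steps, the following Python sketch checks the KVM hpage setting and works out how many huge pages the backend size implies. It rests on two assumptions: that the module parameter is exposed under /sys/module/kvm/parameters/hpage on the host, and that the host needs at least size / hugetlbsize free huge pages when "prealloc" is true. Both function names are made up for this example.

```python
from pathlib import Path
from typing import Optional


def hugepages_needed(size_bytes: int, hugetlbsize_bytes: int) -> int:
    """Number of huge pages needed to back size_bytes (rounded up)."""
    return -(-size_bytes // hugetlbsize_bytes)


def kvm_hpage_enabled(param_file: str = "/sys/module/kvm/parameters/hpage") -> Optional[bool]:
    """Return True if the parameter file contains "1", False otherwise.

    Returns None when the file cannot be read, for example when the kvm
    module is not loaded. The default path is an assumption about where
    the hpage module parameter is exposed.
    """
    try:
        return Path(param_file).read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    # Values from the -object argument in step 1: 1 GiB of RAM, 1 MiB huge pages.
    print("huge pages needed:", hugepages_needed(1073741824, 1048576))
    print("kvm hpage enabled:", kvm_hpage_enabled())
```

With the 4294967296-byte size used in the libvirt-generated command lines, the same calculation gives 4096 pages of 1 MiB each.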