Description of problem:

Version-Release number of selected component (if applicable):
qemu-kvm-4.0.0-4.module+el8.1.0+3356+cda7f1ee.ppc64le
kernel-4.18.0-107.el8.ppc64le (guest and host)
HPT guest

How reproducible:

Steps to Reproduce:
1. Following the comment below, QE is opening a new bug for this new issue. Thanks.
https://bugzilla.redhat.com/show_bug.cgi?id=1723297#c27

   There's an additional problem here. qemu is advertising too large an RMA in this case: the full memory size instead of clamping it to 1TiB. In fact the RMA size advertisement is buggy in other ways too, because it works out differently for KVM and non-KVM paths.

Actual results:

Expected results:

Additional info:
More details: When POWER guests have their MMU turned "off" (not actually off, but the guest-controlled parts are disabled) they can only access memory within a limited Real Mode Area (RMA). How big that can be is subject to several overlapping limits from different sources. There are some cases where we don't calculate all these limits correctly in qemu and hence advertise an RMA size to the guest which is too large - that can cause the guest to allocate space at an address it can't actually safely access in all the cases it needs to (bug 1723297 is a case somewhat like this, although complicated by the fact that there is a guest side bug as well). Worse, we don't always advertise the same RMA size in KVM-HV, KVM-PR and TCG cases, which is a guest-visible inconsistency that shouldn't be there.
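As a rough illustration of the intended behaviour (not qemu's actual code), the advertised RMA can be thought of as the minimum of every applicable limit, computed the same way regardless of accelerator. The function name, the `extra_limits` parameter and the `hpt` flag below are hypothetical; the only values taken from this report are the 1TiB clamp for the HPT guest and the 1030G guest memory size used in the QE test.

```python
GiB = 1 << 30
TiB = 1 << 40

# From this report: an HPT guest's RMA should be clamped to 1TiB.
HPT_RMA_CAP = 1 * TiB


def advertised_rma(ram_size, extra_limits=(), hpt=True):
    """Return the RMA size to advertise: the smallest of all limits.

    Hypothetical helper. ram_size is the guest memory size in bytes;
    extra_limits stands in for any other limits (their sources are not
    enumerated in this report). The accelerator (KVM-HV, KVM-PR, TCG)
    is deliberately not an input, so the result cannot differ by path.
    """
    limits = [ram_size, *extra_limits]
    if hpt:
        # Assumption: the 1TiB clamp is specific to HPT guests.
        limits.append(HPT_RMA_CAP)
    return min(limits)


# A 1030G HPT guest must not be told its RMA is the full memory size.
print(advertised_rma(1030 * GiB) // GiB)  # 1024
```

Keeping the accelerator out of the calculation is the point of the sketch: a guest started with the same machine options would then see the same RMA under KVM-HV, KVM-PR and TCG.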
Although this is theoretically nasty, we mostly get away with it. In fixing the guest kernel bug for 1723297, we'll also work around this qemu bug for now. Therefore, I'm only aiming to fix this in qemu-4.2 upstream and RHEL-AV-8.2 downstream.
I've made an upstream series addressing this, and posted it as a draft.
Upstream fix has missed qemu-4.2, and since we've worked around it in the kernel, there's no compelling reason to backport. Therefore moving this to backlog.
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks
Fix is now staged in my tree, hope to PR tomorrow.
Now merged upstream.
QE tested it on the build
kernel-4.18.0-216.el8.ppc64le
qemu-kvm-5.0.0-0.scrmod+el8.3.0+7066+6dd3ecaa.wrb200617.ppc64le

cli,
/usr/libexec/qemu-kvm \
    -name avocado-vt-vm1 \
    -machine pseries,max-cpu-compat=power8 \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -chardev socket,id=serierial0,path=/tmp/t11,server,nowait \
    -device spapr-vty,reg=0x30000000,chardev=serierial0 \
    -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
    -drive id=drive_image1,if=none,snapshot=off,aio=threads,cache=none,format=qcow2,file=rhel830-ppc64le-virtio-scsi.qcow2.backup.p8.june \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -m 1030G,slots=256,maxmem=2T \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -vnc :1 \
    -rtc base=utc,clock=host \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
    -monitor stdio

Expected results,
The guest could boot up successfully.

Actual results,
The guest could boot up successfully.
QE has just obtained a large-memory machine and will verify this soon.
Verified the bug with build
kernel-4.18.0-221.el8.ppc64le
qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.ppc64le

Steps,
Please refer to comment8

Actual results,
The HPT guest could boot up successfully on Power 9 and works well.

Expected results,
The HPT guest could boot up successfully on Power 9 and works well.
Based on comment12, QE moved this bug to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5137