Bug 1499647
Summary: | qemu miscalculates guest RAM size during HPT resizing | |
---|---|---|---
Product: | Red Hat Enterprise Linux 7 | Reporter: | David Gibson <dgibson>
Component: | qemu-kvm-rhev | Assignee: | Serhii Popovych <spopovyc>
Status: | CLOSED ERRATA | QA Contact: | Min Deng <mdeng>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 7.5 | CC: | bugproxy, dgibson, dzheng, hannsj_uhl, knoel, lmiksik, lvivier, mdeng, michen, mrezanin, qzhang, spopovyc, virt-maint
Target Milestone: | rc | Keywords: | Patch
Target Release: | 7.5 | |
Hardware: | ppc64le | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | qemu-kvm-rhev-2.10.0-9.el7 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-04-11 00:38:42 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1510771 | |
Bug Blocks: | 1399177 | |
Attachments: | | |
Description
David Gibson
2017-10-09 08:06:38 UTC
I've written a fix and staged it upstream; I'm waiting for a couple of unrelated things to slot into place before sending a pull request.

This is now in upstream qemu as db50f280cf5f714e64ff2b134aae138908f07502. Serhii, can you make a downstream backport of this? I think this will be your first qemu backport.

David, just to clarify: do I need to port HPT as well as the requested commit id to the downstream pegas/master-2.9.0 branch?

No, just the specific commit id. qemu-2.10, on which the next qemu-kvm release will be based, already has the base HPT resizing code included.

Got it. Ported to rhv7/master-2.9.0 (with some minor changes to avoid unnecessary backport). Prepared kernel. I have problems with testing on a pseries machine: I don't have one and have not found one through beaker (I looked for pseries and got no results). David, can you point me to where I can get a pseries machine to test the conditions from the comment 0 testcase? My machine is PowerNV (which is virtualized by Power KVM, I guess), so the "hpt_order" debugfs entry is missing and I can't trigger a resize via debugfs. Will submit brew build and patch for review shortly.

Serhii, pseries is the guest machine type, and the hpt_order file is thus in the guest filesystem. The commit must be ported to rhv7/master-2.10.0 as we have the rhel-7.5.0? flag. Note that PowerNV stands for "Power Not Virtualized", so it is the host machine type.

Thanks for the explanation, guys. Unfortunately I'm not able to validate the test case described in comment 0. The kernel is built with HPT, qemu is from rhv7/master-2.10.0 with the requested change applied, and the guest is either a 3.10 or a 4.14 downstream kernel.

The host kernel crashed during HPT resize under the following conditions:

1. Guest kernel is either 3.10.0-709.el7.ppc64le or 3.10.0-749.el7.ppc64le.
2. Non-debug kernel build.
3. The following commands are issued to trigger the crash:

[root@localhost ~]# cat /sys/kernel/debug/powerpc/hpt_order
26
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 43.049746] lpar: Attempting to resize HPT to shift 27
[ 43.262022] lpar: HPT resize to shift 27 complete (101 ms / 110 ms)
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# cat /sys/kernel/debug/powerpc/hpt_order
27
[root@localhost ~]# echo '28' >/sys/kernel/debug/powerpc/hpt_order
[ 61.378081] lpar: Attempting to resize HPT to shift 28
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '28' >/sys/kernel/debug/powerpc/hpt_order
[ 67.778024] lpar: Attempting to resize HPT to shift 28
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '29' >/sys/kernel/debug/powerpc/hpt_order
[ 81.690078] lpar: Attempting to resize HPT to shift 29
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '26' >/sys/kernel/debug/powerpc/hpt_order
[ 85.330009] lpar: Attempting to resize HPT to shift 26
kdump: dump target is /dev/mapper/rhel_ibm--p8--virt--03-root
kdump: saving to /sysroot//var/crash/127.0.0.1-2017-10-30-15:36:29/
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore

The sequence isn't always exact: the crash can be triggered after the second command (i.e. echo '27' >...). While kdump saves the crashed kernel along with other information, most of it looks useless to me for now, so I have turned on DEBUG_RESIZE_HPT to get a bit more information about the call path.

I have rebuilt the host kernel from *-debug.config in the hope of getting more information about the problem, but unfortunately it does not give me many details. The dmesg in the attachment is from the debug kernel and contains more traces besides the main one causing the fault. The second trace indicates possibly incorrect locking usage in the case of a debugfs-forced HPT resize, but this is only an assumption.

Also, the *-debug.config kernel does not cause the host to reboot immediately; we get more traces (e.g. the one described above), but the system becomes unusable.

Created attachment 1345548
dmesg from debug kernel on ppc64le
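
The "Operation not permitted" results in the transcript above are consistent with the size limit that qemu applies in h_resize_hpt_prepare(), which is quoted later in this report: a request is rejected with H_RESOURCE when it is more than one order above the default for the guest's RAM size. The following is only a small Python model of that rule, not qemu code. It assumes the default HPT is sized at about 1/128 of RAM, rounded up to a power of two and clamped to orders 18..46; the function names are hypothetical and chosen for illustration.

```python
GiB = 1 << 30


def pow2ceil(n: int) -> int:
    """Return the smallest power of two that is >= n (for n > 0)."""
    return 1 << (n - 1).bit_length()


def hpt_shift_for_ramsize(ram_bytes: int) -> int:
    """Default HPT order for a guest with ram_bytes of RAM.

    Assumption: the table is 1/128 of RAM (hence the "- 7"), rounded up
    to a power of two, and clamped to the range 18..46.
    """
    log2_ram = pow2ceil(ram_bytes).bit_length() - 1
    return max(18, min(log2_ram - 7, 46))


def resize_allowed(requested_shift: int, ram_bytes: int) -> bool:
    """Model of the check: at most one order above the default is allowed."""
    return requested_shift <= hpt_shift_for_ramsize(ram_bytes) + 1


if __name__ == "__main__":
    for shift in (26, 27, 28, 29):
        verdict = "allowed" if resize_allowed(shift, 8 * GiB) else "H_RESOURCE"
        print(f"8 GiB guest, shift {shift}: {verdict}")
```

Under these assumptions an 8 GiB guest (the -m 8G setting mentioned later in the thread) starts at order 26, may grow to 27, and is refused at 28 and above, which matches the sequence seen with the patched qemu. If the RAM size fed into the check is wrong, the limit moves and legitimate requests can be refused as well.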
Actually, I get the same with kernel-3.10.0-757.el7.ppc64le from nightly builds (HPT resize was merged in 755). Tested with both host and guest running kernel-3.10.0-757.el7.ppc64le.

Serhii,

Ah, that's probably the HPT resizing crash bug Paul Mackerras at IBM has been talking to me about. It's unrelated to this bug, but obviously will block testing. I've filed bug 1510771 to track that problem, and marked this one as dependent on it.

Serhii, would you mind updating bug 1510771 with your reproducer steps for the benefit of QE? Otherwise I'll handle 1510771 until the upstream fix is merged (since I'm in the same timezone as Paul), then hand it over to you for the downstream port.

I checked qemu-kvm-rhev with and without the patch and got the following results:

qemu-kvm-rhev without patch:
----------------------------
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 100.864883] lpar: Attempting to resize HPT to shift 27
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 102.174066] lpar: Attempting to resize HPT to shift 27
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 102.981877] lpar: Attempting to resize HPT to shift 27
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 103.542115] lpar: Attempting to resize HPT to shift 27
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 104.061859] lpar: Attempting to resize HPT to shift 27
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 104.621986] lpar: Attempting to resize HPT to shift 27
-bash: echo: write error: Operation not permitted

qemu-kvm-rhev with patch:
-------------------------
[root@localhost ~]# cat /sys/kernel/debug/powerpc/hpt_order
26
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 298.352509] lpar: Attempting to resize HPT to shift 27
[ 298.774738] lpar: HPT resize to shift 27 complete (104 ms / 317 ms)
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# echo '28' >/sys/kernel/debug/powerpc/hpt_order
[ 307.648547] lpar: Attempting to resize HPT to shift 28
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '28' >/sys/kernel/debug/powerpc/hpt_order
[ 308.888659] lpar: Attempting to resize HPT to shift 28
-bash: echo: write error: Operation not permitted
[root@localhost ~]# echo '27' >/sys/kernel/debug/powerpc/hpt_order
[ 315.860630] lpar: Attempting to resize HPT to shift 27
*** host kernel crashes ***

I didn't understand why we get EPERM from the kernel in both cases: looking at the code, the only place where EPERM could be returned for a resize is when the hypercall returns H_RESOURCE.

The qemu instance is running with -m 8G,maxmem=200G,slots=256

-> pseries_lpar_resize_hpt()
  -> plpar_resize_hpt_prepare()
    -> plpar_hcall_norets(H_RESIZE_HPT_PREPARE, flags, shift)

H_RESOURCE looks like a hypervisor return code from my point of view (could be wrong :-). On the other hand, when the patch attached to this bug is applied I get a kernel crash.

Components in use:
==================
Host: kernel-3.10.0-757.el7.ppc64le
Hypervisor: rhv7/master-2.10.0 branch, commit 196c322c2420b91f962fa036fd979e27e12761be (Update to qemu-kvm-ma-2.10.0-4.el7 / qemu-kvm-rhev-2.10.0-4.el7)
Guest: kernel-3.10.0-768.el7.ppc64le

So the reason the fix for this bug appears to cause the crash is that this bug was preventing the triggering conditions for bug 1510771. They're otherwise unrelated.

H_RESOURCE is indeed a hypervisor return code.
In this case it is coming from qemu, in this bit of the function h_resize_hpt_prepare():

    /* We only allow the guest to allocate an HPT one order above what
     * we'd normally give them (to stop a small guest claiming a huge
     * chunk of resources in the HPT */
    if (shift > (spapr_hpt_shift_for_ramsize(current_ram_size) + 1)) {
        return H_RESOURCE;
    }

spapr_hpt_shift_for_ramsize() is where the bug is, causing this check to trigger when it shouldn't.

Looks like the issue described in comment 9 and comment 13 isn't related to bug 1510771. I narrowed it down to this upstream commit:

commit c35786cf7ef71473ed8c8aebba985ef895209b7f
Author: Paul Mackerras <paulus>
Date: Fri Jul 21 15:41:49 2017 +1000

    KVM: PPC: Book3S HV: Fix host crash on changing HPT size

After this is applied I can easily replay the test case from this bug's description.

Now I have two patches:
-----------------------
1. for qemu-kvm, addressing this bug
2. for the kernel, addressing the issue in comment 9 and comment 13

David, how should I deal with these changes? Open another bug for the issue in comment 9 or comment 13, or proceed with this bug id for both?

Ah, ok. Looks like you were hitting an older and easier-to-trigger host crash than the one I was thinking of. I think it would still be possible to trigger the 1510771 crash in a similar way, just more rarely (it involves a race between the vcpu invoking the resize commit and the other vcpus).

We can't combine that crash with this bug, because they're in different components. Usually it's best to file a separate BZ for each separate issue. However, in this case the crash you hit and bug 1510771 are sufficiently similar (both are a host crash during resize caused by rmap corruption) that we can handle both of the crashes in bug 1510771.

So, I'm putting 1510771 back on this bug's dep list, and I'll update it to cover both patches.

Btw, Serhii, I have no idea where you got that commit id from in comment 16.
That patch is commit ef42719814db06fdfa26cd7566de0b64de173320 in my upstream tree. I can't find the SHA you referenced at all.

QE tested a similar scenario for the bug. It seems that the failure occurs not only when writing the sys file directly but also on memory hotplug or unplug events.

Build info
Guest: kernel-3.10.0-776.el7.ppc64le
Host: kernel-3.10.0-781.el7.ppc64le
qemu-kvm-rhev: qemu-kvm-rhev-2.10.0-5.el7.ppc64le

CLI,
/usr/libexec/qemu-kvm -name avocado-vt-vm1 -sandbox off -machine pseries -nodefaults -vga std -chardev socket,id=serial_id_serial0,path=/tmp/S,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=rhel75-ppc64le-virtio.qcow2 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=0x4 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:43:17:1a,bus=pci.0,addr=0x1e -m 4096 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :1 -rtc base=utc,clock=host -enable-kvm -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -monitor stdio -smp 16,sockets=8,threads=1,cores=2 -m 16G,slots=256,maxmem=40G -numa node -qmp tcp:0:4444,server,nowait

1. Boot up the guest with the above CLI.
2. Hotplug memory with the following commands:
#telnet 127.0.0.1 4444
{"execute":"qmp_capabilities"}
{"return": {}}
{'execute': 'object-add', 'arguments': {'id': 'mem1', 'qom-type': 'memory-backend-ram', 'props': {'policy': 'default', 'size': 1073741824}}, 'id': 'wJpPI7AQ'}
{"return": {}, "id": "wJpPI7AQ"}
{'execute': 'device_add', 'arguments':{'id': 'dimm1','driver': 'pc-dimm', 'memdev': 'mem1'}}
{"return": {}}
{"timestamp": {"seconds": 1510554195, "microseconds": 496286}, "event": "RTC_CHANGE", "data": {"offset": 1}}
3. Check dmesg:
[ 135.817587] pseries-hotplug-mem: Attempting to hot-add 4 LMB(s) at index 80000040
[ 135.817776] lpar: Attempting to resize HPT to shift 25
[ 135.818150] Unable to resize hash page table to target order 25: -1
[ 135.827845] lpar: Attempting to resize HPT to shift 25
[ 135.828316] Unable to resize hash page table to target order 25: -1
[ 135.833181] lpar: Attempting to resize HPT to shift 25
[ 135.833393] Unable to resize hash page table to target order 25: -1
[ 135.838243] lpar: Attempting to resize HPT to shift 25
[ 135.838452] Unable to resize hash page table to target order 25: -1
[ 135.843268] pseries-hotplug-mem: Memory at 400000000 (drc index 80000040) was hot-added
[ 135.843270] pseries-hotplug-mem: Memory at 410000000 (drc index 80000041) was hot-added
[ 135.843271] pseries-hotplug-mem: Memory at 420000000 (drc index 80000042) was hot-added
[ 135.843272] pseries-hotplug-mem: Memory at 430000000 (drc index 80000043) was hot-added

Any issues, please let me know. Thanks a lot.
Min

> CLI,
> /usr/libexec/qemu-kvm -name avocado-vt-vm1 -sandbox off -machine pseries
> -nodefaults -vga std -chardev
> socket,id=serial_id_serial0,path=/tmp/S,server,nowait -device
> spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -device
> nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 -drive
> id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,
> file=rhel75-ppc64le-virtio.qcow2 -device
> virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=0x4
> -netdev
> tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on
> -device
> virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:43:17:1a,bus=pci.0,addr=0x1e
> -m 4096 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :1 -rtc
> base=utc,clock=host -enable-kvm -device usb-kbd,id=input0 -device
> usb-mouse,id=input1 -device usb-tablet,id=input2 -monitor stdio -smp
> 16,sockets=8,threads=1,cores=2 -m 16G,slots=256,maxmem=40G -numa node -qmp
> tcp:0:4444,server,nowait
Correct CLI, thanks.
/usr/libexec/qemu-kvm -name avocado-vt-vm1 -sandbox off -machine pseries -nodefaults -vga std -chardev socket,id=serial_id_serial0,path=/tmp/S,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -device nec-usb-xhci,id=usb1,bus=pci.0,addr=0x3 -drive id=drive_image1,if=none,snapshot=off,aio=native,cache=none,format=qcow2,file=rhel75-ppc64le-virtio.qcow2 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=1,bus=pci.0,addr=0x4 -netdev tap,id=net0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,vhost=on -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:43:17:1a,bus=pci.0,addr=0x1e -m 4096 -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :1 -rtc base=utc,clock=host -enable-kvm -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -monitor stdio -smp 16,sockets=8,threads=1,cores=2 -m 16G,slots=256,maxmem=40G -numa node -qmp tcp:0:4444,server,nowait
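
The memory hotplug step from the reproducer above (qmp_capabilities, then object-add, then device_add over the -qmp tcp:0:4444 socket) can also be scripted instead of typed into telnet. The sketch below is a minimal illustration in Python: the helper names hotplug_commands and run_qmp are hypothetical, the argument layout (including the nested "props") simply mirrors the commands shown in this report, and other QEMU versions may expect a different layout.

```python
import json
import socket


def hotplug_commands(dimm_id, memdev_id, size_bytes):
    """Build the QMP command sequence used in the reproducer."""
    return [
        {"execute": "qmp_capabilities"},
        {
            "execute": "object-add",
            "arguments": {
                "id": memdev_id,
                "qom-type": "memory-backend-ram",
                # "props" nesting copied from the commands in this report
                "props": {"policy": "default", "size": size_bytes},
            },
        },
        {
            "execute": "device_add",
            "arguments": {"id": dimm_id, "driver": "pc-dimm", "memdev": memdev_id},
        },
    ]


def _read_reply(stream):
    """Read one non-event JSON message from the QMP stream."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        message = json.loads(line)
        if "event" not in message:  # skip asynchronous events such as RTC_CHANGE
            return message


def run_qmp(commands, host="127.0.0.1", port=4444):
    """Send commands one by one and return the reply to each of them."""
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        _read_reply(stream)  # consume the QMP greeting banner
        replies = []
        for command in commands:
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            replies.append(_read_reply(stream))
        return replies
```

With a guest started from the command line above, run_qmp(hotplug_commands("dimm1", "mem1", 1 << 30)) would be expected to add one 1 GiB DIMM; the guest's dmesg then shows whether the HPT resize attempts succeeded. Keeping the command construction separate from the socket code makes it easy to inspect the JSON without a running guest.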
Hi David and Serhii,

What do you think about QE using the steps from comment 19 to verify the bug in the future? I guess QE has to wait until 1510771 is fixed, and then we can verify this one. Do you agree with me? Thanks.

Thanks
Min

Min, I think the steps from comment 0 would be preferable. By using the debug file we're testing more specifically just the resize path, rather than the hotplug path as well. And, yes, we'll need to wait for the fix for bug 1510771 before being able to test this well.

Fix included in qemu-kvm-rhev-2.10.0-9.el7

According to comment 0, comment 9, comment 13 and comment 24, QE tried to reproduce the bug on the following builds:
qemu-kvm-rhev-2.9.0-16.el7_4.6.ppc64le
kernel-3.10.0-820.el7.ppc64le - host
kernel-3.10.0-749.el7.ppc64le - guest

Steps:
1. Boot up a guest without dimm
/usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries -nodefaults -vga std -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=off -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=rhel75-ppc64le-virtio-scsi.qcow2 -device scsi-hd,id=image1,drive=drive_image1 -numa node -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -m 4G,maxmem=1024G,slots=32 -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 -machine accel=kvm -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 -object memory-backend-ram,id=mem1,size=1G -object memory-backend-ram,id=mem2,size=1G -object memory-backend-ram,id=mem3,size=1G
2. Log in to the guest and try to resize
[root@dhcp47-185 home]# cat /sys/kernel/debug/powerpc/hpt_order
33
[root@dhcp47-185 home]# echo '34' >/sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: No such device
[root@dhcp47-185 home]# echo '35' >/sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: No such device
...

QE verified the bug on the following builds:
kernel-3.10.0-820.el7.ppc64le
qemu-kvm-rhev-2.10.0-12.el7.ppc64le
SLOF-20170724-5.git89f519f.el8.ppc64le

Steps: please refer to comment 27.

Actual results:
1. cat /sys/kernel/debug/powerpc/hpt_order
cat /sys/kernel/debug/powerpc/hpt_order
25
2. echo 26 > /sys/kernel/debug/powerpc/hpt_order - looks like the resize succeeded here.
echo 26 > /sys/kernel/debug/powerpc/hpt_order
3. echo 27 > /sys/kernel/debug/powerpc/hpt_order - cannot continue to resize
echo 27 > /sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: Operation not permitted
4. echo 28 > /sys/kernel/debug/powerpc/hpt_order - cannot continue to resize
echo 28 > /sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: Operation not permitted

Expected results:
According to comment 0, it can resize.

Hi David and Serhii,
Could you help to have a look at comment 27 and comment 28? Is that enough for QE to verify the bug or not? Any issues, please let me know. Thanks a lot.
Min

Comment 28 looks good, however in comment 27 it looks like things are failing for a different reason than this bug. If this bug were in play, I would expect the error "Operation not permitted" instead of "No such device".

The initial hpt_order is also much larger in comment 27, which suggests one of the components doesn't support HPT resizing at all.

I suspect I may have given the wrong qemu version in comment 0. Can you try reproducing this with the version immediately before the fix went in, i.e.:

qemu-kvm-rhev-2.10.0-8.el7

(In reply to David Gibson from comment #30)
> Comment 28 looks good, however in comment 27 it looks like things are
> failing for a different reason than this bug. If this bug were in play, I
> would expect the error "Operation not permitted" instead of "No such device".
>
> The initial hpt_order is also much larger in comment 27 which suggests one
> of the components doesn't support HPT resizing at all.
>
> I suspect I may have given the wrong qemu version in comment 0. Can you try
> reproducing this with the version immediately before the fix went in, i.e.:
>
> qemu-kvm-rhev-2.10.0-8.el7

Thanks for your reply. QE used it to reproduce the bug again and got the following output.

Build info:
qemu-kvm-rhev-2.10.0-8.el7
kernel-3.10.0-820.el7.ppc64le

[root@dhcp47-185 home]# cat /sys/kernel/debug/powerpc/hpt_order
cat /sys/kernel/debug/powerpc/hpt_order
25
[root@dhcp47-185 home]# echo 26 > /sys/kernel/debug/powerpc/hpt_order
echo 26 > /sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: Operation not permitted
[root@dhcp47-185 home]# echo 27 > /sys/kernel/debug/powerpc/hpt_order
echo 27 > /sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: Operation not permitted
[root@dhcp47-185 home]# echo 28 > /sys/kernel/debug/powerpc/hpt_order
echo 28 > /sys/kernel/debug/powerpc/hpt_order
-bash: echo: write error: Operation not permitted

Hi David,
In my opinion, based on comments 28, 30 and 31, I think it can be moved to verified status. Do you agree with me?
Thanks a lot.
Min

As discussed on IRC, I got a bit confused by the various attempts here. But, yes, comparing the reproducing run in comment 31 to the fixed run in comment 28 I think this can be marked verified.

QE reproduced the bug with qemu-kvm-rhev-2.10.0-8.el7 from comment 30 - see comment 31.
QE verified the bug - see comment 28.
Based on comment 28, comment 31 and comment 33, QE moved it to verified.
Thanks for the developers' efforts.
Min

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104