Bug 1305398

Summary: [RFE] PAPR Hash Page Table (HPT) resizing (qemu-kvm-rhev)
Product: Red Hat Enterprise Linux 7 Reporter: David Gibson <dgibson>
Component: qemu-kvm-rhev Assignee: David Gibson <dgibson>
Status: CLOSED ERRATA QA Contact: Min Deng <mdeng>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.3 CC: bugproxy, dgibson, dzheng, hannsj_uhl, knoel, mdeng, michen, mrezanin, mtessun, qzhang, virt-maint, xuhan, xuma
Target Milestone: rc Keywords: FutureFeature, Patch
Target Release: 7.5   
Hardware: ppc64le   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.10.0-1.el7 Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
: 1308743 Environment:
Last Closed: 2018-04-11 00:09:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1305399, 1305400    
Bug Blocks: 1248279, 1284775, 1305498, 1308743, 1308744, 1308746, 1444027, 1469590, 1473046    

Description David Gibson 2016-02-08 00:50:23 UTC
Description of problem:

Allow the hash page table (HPT) of PAPR guests to be resized at runtime.

This is important for practical memory hotplug.  Without this the HPT needs to be sized for the guest's maximum possible memory - since RHEV wants to set that to 4T, this can result in a much bigger than necessary HPT which wastes host resources and can cause allocation failures.  With HV KVM the HPT is unswappable, contiguous host memory.

This BZ covers the qemu parts of this, including the TCG and PR KVM implementation of the necessary hypercalls, feature negotiation with the guest, and enabling the necessary KVM host pieces.
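
To give a feel for the numbers, here is a minimal sketch of how an HPT might be sized from guest memory. It assumes a rule of thumb of roughly 1/128 of RAM, rounded up to a power of two and clamped between shift 18 and shift 46. That rule and the function name are assumptions for illustration and are not stated in this bug.

```python
GIB = 1 << 30
TIB = 1 << 40

def hpt_shift_for_ramsize(ram_bytes: int) -> int:
    """Return log2 of the HPT size for a guest with ram_bytes of memory.

    Assumed rule (not taken from this bug): HPT = RAM / 128, rounded up
    to a power of two, clamped to the range 2^18 .. 2^46 bytes.
    """
    # log2 of RAM after rounding up to the next power of two
    ram_shift = (ram_bytes - 1).bit_length()
    # Dividing by 128 is the same as subtracting 7 from the shift
    shift = ram_shift - 7
    return max(18, min(shift, 46))

for label, ram in (("4 GiB", 4 * GIB), ("4 TiB", 4 * TIB)):
    shift = hpt_shift_for_ramsize(ram)
    print(f"{label}: shift {shift}, HPT size {(1 << shift) >> 20} MiB")
```

Under that assumption, a guest sized for its current 4 GiB would need a far smaller table than one sized up front for a 4 TiB maximum, which is the gap runtime resizing is meant to close.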

Comment 1 David Gibson 2016-02-08 00:52:00 UTC
An RFC has been posted upstream:

https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg05852.html

Comment 2 Karen Noel 2016-03-28 00:56:49 UTC
Cannot commit to RHEL 7.3 until patches are posted and accepted upstream. Stay tuned.

Comment 3 David Gibson 2016-06-08 05:27:32 UTC
At this stage we're blocked on acceptance of the PAPR spec change, and I'm having some trouble getting details of progress on this from IBM.

It's possible this could happen in time to make qemu changes for RHEL 7.3, but it's not plausible it will happen in time for the earlier kernel deadline (which affects bug 1305399 and bug 1305400).  Without the kernel parts, the qemu changes aren't much use.

Therefore, deferring.

Comment 4 David Gibson 2017-02-09 00:33:35 UTC
Update: patches have been posted upstream and have drawn no objections, but a proper upstream merge of the qemu side is blocked until the kernel components are merged upstream.

Comment 5 David Gibson 2017-03-09 03:41:15 UTC
Unfortunately we didn't get an upstream kernel merge in time to do the backport for the RHEL 7.4 downstream kernel deadline (see bug 1305400).  Without that, there's not much value to the qemu part.  So, deferring to 7.5.

Comment 6 David Gibson 2017-07-19 01:59:48 UTC
This is now upstream ready for qemu-2.10.  I expect we'll get it downstream with the 7.5 rebase.

Comment 8 Hanns-Joachim Uhl 2017-10-09 10:18:10 UTC
(In reply to David Gibson from comment #3)
> At this stage we're blocked on acceptance of the PAPR spec change, I'm
> having some trouble getting details of progress on this from IBM.
> 
> It's possible this could happen in time to make qemu changes for RHEL 7.3,
> but it's not plausible it will happen in time for the earlier kernel
> deadline (which affects bug 1305399 and bug 1305400).  Without the kernel
> parts, the qemu changes aren't much use.
> 
.
Hello Red Hat / David,
unfortunately we do not have access to Red Hat bug 1305399 and bug 1305400 ...
... therefore we would like to ask you whether the related kernel patches
are already integrated in an early RHEL7.5 kernel
and, if yes, in which RHEL7.5 kernel level these are integrated ...?
Please advise ...
Thanks in advance for your support.

Comment 9 Karen Noel 2017-10-09 12:38:41 UTC
I opened up the 2 BZs to IBM. Sorry about that.

Comment 10 David Gibson 2017-10-10 00:25:42 UTC
In response to comment 8.

The guest side kernel patches are already merged - they were in RHEL7.4, and so they are in all RHEL7.5 kernels.

The host side kernel patches are not yet merged, but I'm working on them right now.

Comment 14 Min Deng 2017-12-11 07:30:46 UTC
Following comment 13, QE ran the following tests.
Build information:
kernel-3.10.0-820.el7.ppc64le (host and guest)
qemu-kvm-rhev-2.10.0-11.el7.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch
Test case 1:
CLI:
/usr/libexec/qemu-kvm -name virt-tests-vm1 -sandbox off -machine pseries -nodefaults -vga std -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=off -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=rhel75-ppc64le-virtio-scsi.qcow2 -device scsi-hd,id=image1,drive=drive_image1 -numa node -qmp tcp:0:4444,server,nowait -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -device nec-usb-xhci,id=usb1 -device usb-kbd,id=input0 -device usb-mouse,id=input1 -device usb-tablet,id=input2 -netdev tap,script=/etc/qemu-ifup,downscript=/etc/qemu-down,id=hostnet1,vhost=on -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:11:36:3f:01 -m 4G,maxmem=1024G,slots=32 -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 -machine accel=tcg -chardev socket,id=serial_id_serial0,path=/tmp/min,server,nowait -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0
Steps:
1. cat /sys/kernel/debug/powerpc/hpt_order
25
2. echo 26 > /sys/kernel/debug/powerpc/hpt_order
3. cat /sys/kernel/debug/powerpc/hpt_order
26
4. dmesg
[  543.629264] lpar: Attempting to resize HPT to shift 26
[  543.893952] lpar: HPT resize to shift 26 complete (112 ms / 150 ms)
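
When repeating this check, the completed resizes can be pulled out of dmesg output with a small script. The following is only a sketch: the regular expression is written against the two message formats shown in this comment, and the function name is made up for illustration.

```python
import re

# Matches lines such as:
# [  543.893952] lpar: HPT resize to shift 26 complete (112 ms / 150 ms)
RESIZE_DONE = re.compile(
    r"\[\s*(\d+\.\d+)\] lpar: HPT resize to shift (\d+) complete"
)

def completed_resizes(dmesg_text):
    """Return (timestamp, shift) pairs for each completed HPT resize."""
    return [
        (float(m.group(1)), int(m.group(2)))
        for m in RESIZE_DONE.finditer(dmesg_text)
    ]

sample = """\
[  543.629264] lpar: Attempting to resize HPT to shift 26
[  543.893952] lpar: HPT resize to shift 26 complete (112 ms / 150 ms)
"""
print(completed_resizes(sample))
```

Only the "complete" lines are matched, so an attempt that never finishes would show up as a missing entry rather than a false positive.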

Test case 2:
1. Read /sys/kernel/debug/powerpc/hpt_order
cat /sys/kernel/debug/powerpc/hpt_order
25
2. Hotplug 8G of memory (4G -> 12G)
(qemu) object_add memory-backend-ram,id=mem1,size=8G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1
3. cat /sys/kernel/debug/powerpc/hpt_order
24
4. dmesg
[  610.302749] pseries-hotplug-mem: Attempting to hot-add 32 LMB(s) at index 80000010
[  610.312528] lpar: Attempting to resize HPT to shift 23
[  610.570498] lpar: HPT resize to shift 23 complete (107 ms / 149 ms)
[  610.909832] lpar: Attempting to resize HPT to shift 24
[  611.060338] lpar: HPT resize to shift 24 complete (111 ms / 38 ms)
[  611.836404] pseries-hotplug-mem: Memory at 100000000 (drc index 80000010) was hot-added
[  611.836521] pseries-hotplug-mem: Memory at 110000000 (drc index 80000011) was hot-added
[  611.836546] pseries-hotplug-mem: Memory at 120000000 (drc index 80000012) was hot-added
[  611.836569] pseries-hotplug-mem: Memory at 130000000 (drc index 80000013) was hot-added
[  611.836589] pseries-hotplug-mem: Memory at 140000000 (drc index 80000014) was hot-added
[  611.836609] pseries-hotplug-mem: Memory at 150000000 (drc index 80000015) was hot-added
[  611.836628] pseries-hotplug-mem: Memory at 160000000 (drc index 80000016) was hot-added
[  611.836648] pseries-hotplug-mem: Memory at 170000000 (drc index 80000017) was hot-added
[  611.836668] pseries-hotplug-mem: Memory at 180000000 (drc index 80000018) was hot-added
[  611.836687] pseries-hotplug-mem: Memory at 190000000 (drc index 80000019) was hot-added
[  611.836707] pseries-hotplug-mem: Memory at 1a0000000 (drc index 8000001a) was hot-added
[  611.836727] pseries-hotplug-mem: Memory at 1b0000000 (drc index 8000001b) was hot-added
[  611.836746] pseries-hotplug-mem: Memory at 1c0000000 (drc index 8000001c) was hot-added
[  611.836766] pseries-hotplug-mem: Memory at 1d0000000 (drc index 8000001d) was hot-added
[  611.836785] pseries-hotplug-mem: Memory at 1e0000000 (drc index 8000001e) was hot-added
[  611.836805] pseries-hotplug-mem: Memory at 1f0000000 (drc index 8000001f) was hot-added
[  611.836825] pseries-hotplug-mem: Memory at 200000000 (drc index 80000020) was hot-added
[  611.836844] pseries-hotplug-mem: Memory at 210000000 (drc index 80000021) was hot-added
[  611.836864] pseries-hotplug-mem: Memory at 220000000 (drc index 80000022) was hot-added
[  611.836883] pseries-hotplug-mem: Memory at 230000000 (drc index 80000023) was hot-added
[  611.836903] pseries-hotplug-mem: Memory at 240000000 (drc index 80000024) was hot-added
[  611.836922] pseries-hotplug-mem: Memory at 250000000 (drc index 80000025) was hot-added
[  611.836942] pseries-hotplug-mem: Memory at 260000000 (drc index 80000026) was hot-added
[  611.836962] pseries-hotplug-mem: Memory at 270000000 (drc index 80000027) was hot-added
[  611.836981] pseries-hotplug-mem: Memory at 280000000 (drc index 80000028) was hot-added
[  611.837001] pseries-hotplug-mem: Memory at 290000000 (drc index 80000029) was hot-added
[  611.837020] pseries-hotplug-mem: Memory at 2a0000000 (drc index 8000002a) was hot-added
[  611.837040] pseries-hotplug-mem: Memory at 2b0000000 (drc index 8000002b) was hot-added
[  611.837060] pseries-hotplug-mem: Memory at 2c0000000 (drc index 8000002c) was hot-added
[  611.837079] pseries-hotplug-mem: Memory at 2d0000000 (drc index 8000002d) was hot-added
[  611.837099] pseries-hotplug-mem: Memory at 2e0000000 (drc index 8000002e) was hot-added
[  611.837118] pseries-hotplug-mem: Memory at 2f0000000 (drc index 8000002f) was hot-added

5. Hot-unplug memory
(qemu) device_del dimm1
(qemu) object_del mem1
object 'mem1' is in use, can not be deleted
(qemu) object_del mem1

6. cat /sys/kernel/debug/powerpc/hpt_order
22
7. dmesg
[  746.533546] pseries-hotplug-mem: Attempting to hot-remove 32 LMB(s) at 80000010
[  746.591646] Offlined Pages 4096
[  746.635921] Offlined Pages 4096
[  746.718275] Offlined Pages 4096
[  746.825370] Offlined Pages 4096
[  746.932840] Offlined Pages 4096
[  747.120290] Offlined Pages 4096
[  747.358892] Offlined Pages 4096
[  747.878879] Offlined Pages 4096
[  748.909317] Offlined Pages 4096
[  749.600845] Offlined Pages 4096
[  750.062616] Offlined Pages 4096
[  750.529426] Offlined Pages 4096
[  751.161344] Offlined Pages 4096
[  751.952617] Offlined Pages 4096
[  752.412406] Offlined Pages 4096
[  753.009526] Offlined Pages 4096
[  753.952017] Offlined Pages 4096
[  754.797190] Offlined Pages 4096
[  755.685075] Offlined Pages 4096
[  756.582665] Offlined Pages 4096
[  757.608650] Offlined Pages 4096
[  758.356289] Offlined Pages 4096
[  759.080502] Offlined Pages 4096
[  759.441964] Offlined Pages 4096
[  759.634949] Offlined Pages 4096
[  759.804548] Offlined Pages 4096
[  759.999824] Offlined Pages 4096
[  760.072227] Offlined Pages 4096
[  760.113684] Offlined Pages 4096
[  760.213926] Offlined Pages 4096
[  760.244205] Offlined Pages 4096
[  760.382442] Offlined Pages 4096
[  760.406183] lpar: Attempting to resize HPT to shift 22
[  760.600749] lpar: HPT resize to shift 22 complete (117 ms / 76 ms)
[  760.602529] pseries-hotplug-mem: Memory at 100000000 (drc index 80000010) was hot-removed
[  760.602718] pseries-hotplug-mem: Memory at 110000000 (drc index 80000011) was hot-removed
[  760.602792] pseries-hotplug-mem: Memory at 120000000 (drc index 80000012) was hot-removed
[  760.602908] pseries-hotplug-mem: Memory at 130000000 (drc index 80000013) was hot-removed
[  760.602971] pseries-hotplug-mem: Memory at 140000000 (drc index 80000014) was hot-removed
[  760.603031] pseries-hotplug-mem: Memory at 150000000 (drc index 80000015) was hot-removed
[  760.603092] pseries-hotplug-mem: Memory at 160000000 (drc index 80000016) was hot-removed
[  760.603152] pseries-hotplug-mem: Memory at 170000000 (drc index 80000017) was hot-removed
[  760.603238] pseries-hotplug-mem: Memory at 180000000 (drc index 80000018) was hot-removed
[  760.603299] pseries-hotplug-mem: Memory at 190000000 (drc index 80000019) was hot-removed
[  760.603360] pseries-hotplug-mem: Memory at 1a0000000 (drc index 8000001a) was hot-removed
[  760.603437] pseries-hotplug-mem: Memory at 1b0000000 (drc index 8000001b) was hot-removed
[  760.603499] pseries-hotplug-mem: Memory at 1c0000000 (drc index 8000001c) was hot-removed
[  760.603559] pseries-hotplug-mem: Memory at 1d0000000 (drc index 8000001d) was hot-removed
[  760.603618] pseries-hotplug-mem: Memory at 1e0000000 (drc index 8000001e) was hot-removed
[  760.603688] pseries-hotplug-mem: Memory at 1f0000000 (drc index 8000001f) was hot-removed
[  760.603749] pseries-hotplug-mem: Memory at 200000000 (drc index 80000020) was hot-removed
[  760.603809] pseries-hotplug-mem: Memory at 210000000 (drc index 80000021) was hot-removed
[  760.603869] pseries-hotplug-mem: Memory at 220000000 (drc index 80000022) was hot-removed
[  760.603933] pseries-hotplug-mem: Memory at 230000000 (drc index 80000023) was hot-removed
[  760.603994] pseries-hotplug-mem: Memory at 240000000 (drc index 80000024) was hot-removed
[  760.604054] pseries-hotplug-mem: Memory at 250000000 (drc index 80000025) was hot-removed
[  760.604115] pseries-hotplug-mem: Memory at 260000000 (drc index 80000026) was hot-removed
[  760.604174] pseries-hotplug-mem: Memory at 270000000 (drc index 80000027) was hot-removed
[  760.604241] pseries-hotplug-mem: Memory at 280000000 (drc index 80000028) was hot-removed
[  760.604302] pseries-hotplug-mem: Memory at 290000000 (drc index 80000029) was hot-removed
[  760.604362] pseries-hotplug-mem: Memory at 2a0000000 (drc index 8000002a) was hot-removed
[  760.604422] pseries-hotplug-mem: Memory at 2b0000000 (drc index 8000002b) was hot-removed
[  760.604482] pseries-hotplug-mem: Memory at 2c0000000 (drc index 8000002c) was hot-removed
[  760.604553] pseries-hotplug-mem: Memory at 2d0000000 (drc index 8000002d) was hot-removed
[  760.604614] pseries-hotplug-mem: Memory at 2e0000000 (drc index 8000002e) was hot-removed
[  760.605014] pseries-hotplug-mem: Memory at 2f0000000 (drc index 8000002f) was hot-removed
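
The resize sequence in test case 2 can be reasoned about with a small model. This is a sketch under stated assumptions, none of which are stated in this bug: the guest uses 64 KiB base pages, LMBs are 256 MiB (suggested by the address stride in the log), the guest targets about two pages per 128-byte PTE group, and it grows the HPT immediately but shrinks only when the target is at least 2 below the current shift. The function names are illustrative.

```python
GIB = 1 << 30
LMB = 256 << 20  # assumed LMB size, inferred from the 0x10000000 address stride

def htab_shift_for_mem_size(mem_bytes, page_shift=16):
    """Assumed guest heuristic: about two pages per 128-byte PTE group."""
    # log2 of memory after rounding up to the next power of two
    mem_shift = (mem_bytes - 1).bit_length()
    pteg_shift = mem_shift - (page_shift + 1)
    # 2^18 bytes is taken as the minimum table size
    return max(pteg_shift + 7, 18)

def simulate(current_shift, mem_sizes):
    """Return the resize targets triggered as memory size changes."""
    resizes = []
    for mem in mem_sizes:
        target = htab_shift_for_mem_size(mem)
        # Assumed hysteresis: grow at once, shrink only when at least 2 below
        if target > current_shift or target < current_shift - 1:
            resizes.append(target)
            current_shift = target
    return resizes

# Memory size after each of the 32 LMBs is added, then removed again
hot_add = [4 * GIB + k * LMB for k in range(1, 33)]
hot_remove = [4 * GIB + k * LMB for k in range(31, -1, -1)]

print(simulate(25, hot_add))     # resize targets during the 8G hot-add
print(simulate(24, hot_remove))  # resize targets during the hot-remove
```

Under these assumptions the model gives shifts 23 and then 24 for the hot-add and a single resize to 22 for the hot-remove, which lines up with the dmesg output above. It is a plausibility check only, not a description of the actual kernel code.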
Test case 3:
Boot up the guest with test case 1's CLI.
* The original order is 25
(qemu) object_add memory-backend-ram,id=mem1,size=8G   - After checking -  24
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

(qemu) object_add memory-backend-ram,id=mem2,size=8G   - After checking  - 25 
(qemu) device_add pc-dimm,id=dimm2,memdev=mem2

(qemu) object_add memory-backend-ram,id=mem3,size=16G  - After checking  - 26
(qemu) device_add pc-dimm,id=dimm3,memdev=mem3

(qemu) object_add memory-backend-ram,id=mem4,size=32G  - After checking  - 27 
(qemu) device_add pc-dimm,id=dimm4,memdev=mem4

(qemu) object_add memory-backend-ram,id=mem5,size=32G  - After checking  - 27
(qemu) device_add pc-dimm,id=dimm5,memdev=mem5

(qemu) object_add memory-backend-ram,id=mem6,size=32G  - After checking  - 28
(qemu) device_add pc-dimm,id=dimm6,memdev=mem6

(qemu) object_add memory-backend-ram,id=mem7,size=64G  - After checking  - 28
(qemu) device_add pc-dimm,id=dimm7,memdev=mem7

(qemu) object_add memory-backend-ram,id=mem8,size=64G  - After checking  - 28
(qemu) device_add pc-dimm,id=dimm8,memdev=mem8


In QE's opinion, these results are almost the same as those for bug 1305399 using the same procedure, so this should already be fixed. David, do you agree? Thanks in advance for your support.

Min

Comment 15 David Gibson 2017-12-11 07:40:17 UTC
Yes, looks good.  I think we can mark this verified.

Comment 16 Min Deng 2017-12-11 07:47:00 UTC
Per comment 14 and comment 15, marking this bug as verified. Thanks for the developer's support.

Comment 18 errata-xmlrpc 2018-04-11 00:09:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104