Description of problem:
Attach-device does not fail when the memory device size is not 256 MiB aligned.

Version-Release number of selected component (if applicable):
libvirt-5.9.0-4.module+el8.2.0+4836+a8e32ad7.ppc64le
qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb.ppc64le
kernel-4.18.0-151.el8.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. Memory device mem.xml:

<?xml version='1.0' encoding='UTF-8'?>
<memory model="dimm">
  <target>
    <size unit="KiB">524289</size>   <=== 2 * 256 MiB + 1 KiB
    <node>0</node>
  </target>
</memory>

2. virsh command:

# virsh attach-device avocado-vt-vm1 mem.xml
Device attached successfully

3. Compare with x86, where the device must be 2 MiB aligned. Memory device XML:

<?xml version='1.0' encoding='UTF-8'?>
<memory model="dimm">
  <target>
    <size unit="KiB">512001</size>   <=== 250 * 2 MiB + 1 KiB
    <node>0</node>
  </target>
</memory>

# virsh attach-device avocado-vt-vm1 mem.xml
error: internal error: unable to execute QEMU command 'device_add': backend memory size must be multiple of 0x200000

Actual results:
The ppc64 attach succeeds even though the size is not 256 MiB aligned (see above).

Expected results:
An error message similar to the one on x86.

Additional info:
The problem also exists in RHEL 7.8:
libvirt-4.5.0-29.virtcov.el7.ppc64le
kernel-3.10.0-1111.el7.ppc64le
qemu-kvm-rhev-2.12.0-38.el7.ppc64le
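For comparison, a request that is an exact multiple of 256 MiB is expected to attach cleanly, with no rounding involved. A minimal sketch of such a mem.xml (524288 KiB = 2 * 256 MiB exactly):

<?xml version='1.0' encoding='UTF-8'?>
<memory model="dimm">
  <target>
    <size unit="KiB">524288</size>   <=== exactly 2 * 256 MiB
    <node>0</node>
  </target>
</memory>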
David, is this actually supposed to fail? Does ppc64 have this size requirement? If so, then it looks like the validation should happen in QEMU, as is the case for x86. I see you've assigned the bug to yourself, but the component is still libvirt...
QE gave the bug a try with plain QEMU; it does not look reproducible with the command lines below (QEMU reports the error itself), but it would be better to also have the QEMU command line generated by libvirt. Thanks.

Build information:
kernel-4.18.0-160.el8.ppc64le
qemu-kvm-4.2.0-1.module+el8.2.0+4793+b09dd2fb.ppc64le

1. Boot up a guest with the DIMM on the command line:

-object memory-backend-ram,id=mem-1,size=513M,prealloc=yes
-device pc-dimm,id=dimm1,memdev=mem-1

QEMU 4.2.0 monitor - type 'help' for more information
(qemu) qemu-kvm: -device pc-dimm,id=dimm1,memdev=mem-1: Hotplugged memory size must be a multiple of 256 MB

2. Boot up a guest with only the backend, then hotplug the DIMM from the monitor:

-object memory-backend-ram,host-nodes=0,policy=bind,id=mem-1,size=513M,prealloc=yes

QEMU 4.2.0 monitor - type 'help' for more information
(qemu) VNC server running on ::1:5901
(qemu) device_add pc-dimm,id=dimm1,memdev=mem-1
Error: Hotplugged memory size must be a multiple of 256 MB
The 256 MiB alignment constraint on POWER exists for a different reason than the 2 MiB constraint on x86: on POWER it is part of the guest PAPR platform definition, while on x86 it comes from the need for hugepage/THP alignment of the backing memory. For that reason, we don't expect exactly the same error message on POWER.

We do, however, still expect an error - the one shown in comment 3: "Error: Hotplugged memory size must be a multiple of 256 MB". It is generated in the memory pre_plug callback. Since device_add does appear to report the error in comment 3, I'm a bit baffled as to why libvirt is not reporting it. So, I think this needs some analysis from the libvirt side first.
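For reference, this is roughly what the size check in QEMU's sPAPR memory pre-plug path looks like (a paraphrased sketch of spapr_memory_pre_plug() in hw/ppc/spapr.c as of QEMU 4.2, not a verbatim copy; SPAPR_MEMORY_BLOCK_SIZE is the PAPR-mandated 256 MiB):

    /* Paraphrased sketch; error path only. */
    uint64_t size = memory_device_get_region_size(MEMORY_DEVICE(dev),
                                                  &error_abort);

    if (size % SPAPR_MEMORY_BLOCK_SIZE) {
        error_setg(errp, "Hotplugged memory size must be a multiple of "
                   "%" PRIu64 " MB", SPAPR_MEMORY_BLOCK_SIZE / MiB);
        return;
    }

The pre_plug handler runs before the device is realized, which is why the bare device_add in comment 3 fails cleanly.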
Libvirt aligns DIMMs up before hotplug and unplug on both x86 and ppc64. In the ppc64 example of comment #1, the memory is being aligned up, meaning that 2 * 256 MiB + 1 KiB is rounded up to 3 * 256 MiB. QEMU doesn't throw any error because it receives a valid DIMM size, although it is *way* bigger than the user intended, which is misleading. I believe auto-alignment for ppc64 DIMM hotplug should go away.

Comment #1 also compared the result with an x86 example. The problem is that the x86 DIMM is also being aligned up - QEMU throws an error because libvirt aligns the x86 DIMM to 1 MiB (the total memory size alignment) instead of the memory module alignment, which is 2 MiB. The desirable x86 behavior is the result of a bug: libvirt fails to align the x86 DIMM correctly, and QEMU throws an error.

My proposal is to remove auto-alignment of DIMMs in memory hotplug/unplug for all archs. x86 keeps the same behavior of erroring out on a misaligned DIMM (this time intentionally), and ppc64 hotplug behavior becomes clearer to the user. We also avoid yet another x86 vs ppc64 behavior difference in libvirt.

Patches were sent to the upstream libvirt ML here:
https://www.redhat.com/archives/libvir-list/2020-February/msg01156.html
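To make the rounding concrete, here is a small standalone C sketch of the align-up arithmetic (the macro mirrors libvirt's VIR_ROUND_UP; the program itself is mine, for illustration only):

    #include <stdio.h>
    #include <stdint.h>

    /* Same round-up arithmetic as libvirt's VIR_ROUND_UP macro. */
    #define ROUND_UP(value, align) (((value) + (align) - 1) / (align) * (align))

    int main(void)
    {
        uint64_t align_kib = 256 * 1024;  /* ppc64 DIMM alignment: 256 MiB in KiB */
        uint64_t req_kib = 524289;        /* comment #1 request: 2 * 256 MiB + 1 KiB */

        /* Prints "524289 KiB -> 786432 KiB": the guest silently gets
         * 3 * 256 MiB, i.e. 256 MiB more than the XML asked for. */
        printf("%llu KiB -> %llu KiB\n",
               (unsigned long long)req_kib,
               (unsigned long long)ROUND_UP(req_kib, align_kib));
        return 0;
    }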
After talking with PowerPC devs and users about the issue presented here and my proposed fix, I changed my mind. Here's my reasoning:

- DIMM align-up for ppc64 guests can lead to funny situations, like a 100 MiB hotplug being rounded up to 256 MiB, but in reality PowerPC users hotplug several gigabytes of RAM per DIMM, and an align-up of at most 256 MiB is of no consequence. Removing the align-up and erroring out, as I proposed, would annoy existing users with technical details they are not used to and, arguably, shouldn't have to care about.

- Comment #1 compared an unaligned PowerPC hotplug (512 MiB + 1 KiB, successfully hotplugged, although as 3 * 256 MiB) with an x86 scenario (500 MiB + 1 KiB, erroring out in QEMU). The x86 error happened because QEMU expected a 2 MiB alignment **in that scenario**. I got educated about x86 not having a standard memory block size; different operating systems use different block sizes. There is no universal answer for DIMM alignment like the one PowerPC has.

That all said, I believe we have no intention of degrading the PowerPC user experience with something that would do more harm than good for their actual usage. At the same time, although I boldly claimed that libvirt was using a wrong x86 alignment value, I am not so sure anymore after learning that x86 works with several memory block sizes.

I asked the libvirt ML to drop the patch series mentioned in comment #5. My suggestion is to close this bug as NOTABUG. The tester is welcome to re-open it (or create a new one) asking someone on the x86 side to look into why libvirt fails to align up the DIMM as shown in comment #1.
Daniel,

That reasoning makes sense to me.

Dan, are you ok to close this and/or file a bug about the x86 behaviour?
OK, I agree to close this bug.