Summary: | Guest fails to start when the guest's memory isn't aligned to 256 MB | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | zhenfeng wang <zhwang> |
Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.2 | CC: | abologna, dgibson, dyuan, dzheng, gsun, hannsj_uhl, huding, juzhang, knoel, mzhan, pkrempa, rbalakri, tlavigne, virt-maint, xfu |
Target Milestone: | rc | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | ppc64le | ||
OS: | Linux | ||
Whiteboard: | | |
Fixed In Version: | libvirt-1.2.17-10.el7 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2015-11-19 06:49:24 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | | |
Bug Blocks: | 1201513, 1277183, 1277184 |
Description (zhenfeng wang, 2015-07-31 10:10:44 UTC):
So I validated that a migration stream fails to restore on the destination once you change the memory size, so with the patch you are no longer able to migrate from previous versions, including save/restore and similar operations. The memory size alignment should not be enforced unless memory hotplug is requested.

I think the better option is to fix this restriction inside qemu, rather than working around it in libvirt. I have some ideas on how to do this.

On further investigation, I take comment 5 back. Removing this restriction in qemu is moderately tricky; I think aligning up in libvirt is the correct approach, although discoverability of that alignment requirement is kind of nasty. Migration isn't an issue, because RHEL 7.2 will be the first supported KVM on Power.

On yet further investigation, we might be able to at least partially fix this in qemu without too much trouble.

Specifically, we can remove the alignment restriction for base memory; however, any hotplugged memory will still need to be in 256MB chunks - effectively (max_ram_size - ram_size) must be a multiple of 256MB.

Will that be enough to avoid the problems in libvirt?

(In reply to David Gibson from comment #6)
...
> Migration isn't an issue, because RHEL7.2 will be the first supported KVM on
> Power.

While this is true downstream, I wouldn't be able to justify it upstream.

(In reply to David Gibson from comment #7)
> On yet further investigation we might be able to at least partially fix this
> in qemu without too much trouble.
>
> Specifically we can remove the alignment restriction for base memory,
> however any hotplugged memory will still need to be in 256MB chunks -
> effectively (max_ram_size - ram_size) must be a multiple of 256MB.

Aligning the size of the added memory chunks, or even the difference between max_ram and base memory size (if that's a hard requirement, not just an implication of aligning the added memory), is fine in libvirt.

> Will that be enough to avoid the problems in libvirt?

Definitely yes. It will allow us to keep legacy configurations working and will also be justifiable upstream, since memory hotplug is a new feature. Thanks!

Sorry, I realised moments after posting that migration would still be an issue for upstream.

Although, note that the alignment is only enforced when memory hotplug is enabled, which is only true for the most recent machine type unless explicitly overridden. (Note that "enabled" in this sense can be true even if maxram == ram.)

Nonetheless it looks like we have a better fix here - Bharata is working on an upstream patch to remove the alignment restriction for non-hotplugged memory. I hope to merge that into my spapr-next feeder tree ASAP, and we should be able to pull downstream from there. It will probably take a bit longer to get into the upstream maintainer's tree, because agraf isn't usually very prompt about pulling things.

Should I make a new BZ for the qemu fix, or just move this BZ over to qemu?

David,

What is the upstream status for this fix? Thanks.

Hi Karen,

I discussed this further with Peter Krempa and Andrea during KVM Forum. We decided that in fact fixing this in libvirt is the correct approach after all.

The issue with migration can be addressed because the alignment is only enforced when memory hotplug (or "LMB Dynamic Reconfiguration" in IBM terminology) is enabled, either via the machine type or explicitly with machine options. So libvirt needs to make the rounding up of memory sizes conditional on dr_lmb being enabled.
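To make the constraint concrete, here is a minimal standalone sketch of the round-up arithmetic discussed above. It is illustrative only; the helper name is invented and this is not libvirt's or qemu's actual code. Once both the base and maximum sizes are rounded up to 256 MiB, the difference between them is automatically a multiple of 256 MiB:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: round a size in KiB up to the next multiple of
 * 256 MiB (262144 KiB), the sPAPR DR LMB increment discussed above. */
static uint64_t round_up_256mib(uint64_t size_kib)
{
    const uint64_t align = 256ULL * 1024;  /* 256 MiB expressed in KiB */
    return (size_kib + align - 1) / align * align;
}

int main(void)
{
    uint64_t ram    = round_up_256mib(1000000);   /* -> 1048576 KiB  */
    uint64_t maxram = round_up_256mib(10000000);  /* -> 10223616 KiB */

    /* David's constraint: (max_ram_size - ram_size) must be a multiple
     * of 256 MiB; it holds once both sizes are aligned. */
    assert((maxram - ram) % (256ULL * 1024) == 0);
    printf("ram=%" PRIu64 "k maxram=%" PRIu64 "k\n", ram, maxram);
    return 0;
}
```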
From additional discussions it looks like "dr_lmb" is enabled solely on the basis of the selected machine type and can't be controlled in any other way. That means libvirt can't base the decision on how to align the memory on whether memory hotplug was enabled; instead we'll need to hack around this issue with machine type checks, which is rather unfortunate, since we'll need to hardcode the support for this into libvirt.

That said, it won't be impossible to do, but machine-type-based decisions tend to end up rather fragile when something changes in the future that is beyond libvirt's control.

Peter,

Ok, so the question at this point is which is less bad: 1) do the icky machine type check in libvirt, or 2) rush in a last-minute qemu fix to add a machine option which explicitly enables/disables DR memory.

From an upstream POV, (2) should be fine - memory hotplug has only just been pushed to mainline and wasn't in qemu-2.4. Downstream it's trickier, but should be possible since we already have blocker+ on this bug.

The qemu-side fix (well, a fix to assist the libvirt fix) should be pretty straightforward. If you think that's the way to go, I'm happy to implement it tomorrow.

Upstream fixes this with:

commit bd874b6c422283ff9c07ee28b042b424e85a2398
Author: Peter Krempa <pkrempa>
Date:   Mon Sep 21 18:10:55 2015 +0200

    qemu: ppc64: Align memory sizes to 256MiB blocks

    For some machine types ppc64 machines now require that memory sizes
    are aligned to 256MiB increments (due to the dynamically
    reconfigurable memory). As now we treat existing configs reasonably
    in regards to migration, we can round all the sizes unconditionally.
    The only drawback will be that the memory size of a VM can
    potentially increase by (256MiB - 1byte) * number_of_NUMA_nodes.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1249006

But the commit requires the migration size fixes that were done for https://bugzilla.redhat.com/show_bug.cgi?id=1252685.
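As a rough illustration of the drawback noted in the commit message: each NUMA cell is rounded up independently, so the total can grow by just under 256 MiB per node. A small sketch under that assumption (the function name is made up; this is not the patch's code):

```c
#include <stdint.h>
#include <stdio.h>

#define ALIGN_KIB (256ULL * 1024)  /* 256 MiB expressed in KiB */

/* Round one NUMA cell's size (in KiB) up to the next 256 MiB boundary. */
static uint64_t cell_round_up(uint64_t kib)
{
    return (kib + ALIGN_KIB - 1) / ALIGN_KIB * ALIGN_KIB;
}

int main(void)
{
    /* Two 500000 KiB cells, matching the QA configuration further below. */
    uint64_t cells[] = { 500000, 500000 };
    uint64_t before = 0, after = 0;

    for (int i = 0; i < 2; i++) {
        before += cells[i];
        after  += cell_round_up(cells[i]);
    }
    /* The total growth is bounded by (256 MiB - 1 byte) per NUMA node. */
    printf("before=%llu KiB, after=%llu KiB, growth=%llu KiB\n",
           (unsigned long long)before, (unsigned long long)after,
           (unsigned long long)(after - before));
    return 0;
}
```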
As bugzilla was down when I tried to reply, I'm attaching the e-mail reply I sent to David:

(In reply to David Gibson from comment #13)
> Peter,
>
> Ok, so the question at this point is which is less bad: 1) do the icky machine
> type check in libvirt, or 2) rush in a last-minute qemu fix to add a machine
> option which explicitly enables/disables DR memory.

Actually, I probably also have a third option now: I'm preparing a patchset for another issue related to memory alignment. It fixes various aspects of memory alignment, especially one where we'd re-align the memory even if the guest was alive (migration), which we actually shouldn't do. This patchset will allow us to force the 256MiB alignment without breaking migration or existing guests from libvirt's view (a guest will break only if qemu does not allow such a config). The only downside is that switching to the new libvirt will possibly increase the memory sizes of VMs that are started fresh from the point we add the alignment.

> From an upstream POV, (2) should be fine - memory hotplug has only just been
> pushed to mainline and wasn't in qemu-2.4. Downstream it's trickier, but
> should be possible since we already have blocker+ on this bug.
>
> The qemu-side fix (well, a fix to assist the libvirt fix) should be pretty
> straightforward. If you think that's the way to go, I'm happy to implement
> it tomorrow.

Doing option 2 would have a few tricks to it:

1) We'd need a way to introspect the possibility to pass the new flag to disable DR mem.

2) The new flag to disable DR memory would need to work with older machine types too (otherwise we wouldn't avoid the

3) (I forgot this one, but I'm certain I thought of 3 points ...)

Peter,

Does that mean we can consider the problem basically solved now?

Should I still work on adding a machine option upstream?

Regarding your detailed points:

(1) I believe machine options are introspectable - they appear with "qemu-kvm -machine pseries,?" at least.

(2) Yes, my intention is that the option would work with all machine type revisions.

(3) um...

Tested with the packages below:

libvirt-1.2.17-11.el7.ppc64le
qemu-kvm-rhev-2.3.0-26.el7.ppc64le
kernel-3.10.0-319.el7.ppc64le

The following tests were done:

Test 1. maxMemory and memory not aligned to 256M

Configure the guest with:

<maxMemory slots='16' unit='KiB'>10000000</maxMemory>
<memory unit='KiB'>1000000</memory>
<cpu mode='host-model'>
  <model fallback='forbid'/>
  <numa>
    <cell id='0' cpus='0-1' memory='500000' unit='KiB'/>
    <cell id='1' cpus='2-3' memory='500000' unit='KiB'/>
  </numa>
</cpu>

The guest starts successfully.

# virsh dumpxml guest
...
<maxMemory slots='16' unit='KiB'>10223616</maxMemory>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<cpu mode='host-model'>
  <model fallback='forbid'/>
  <numa>
    <cell id='0' cpus='0-1' memory='524288' unit='KiB'/>
    <cell id='1' cpus='2-3' memory='524288' unit='KiB'/>
  </numa>
</cpu>

Check the qemu command line:

qemu 142155 1 ... -cpu host -m size=1048576k,slots=16,maxmem=10223616k

Check within the guest:

# cat /proc/meminfo
MemTotal: 979520 kB

Test 2. Migration

On the source, dumpxml of the running guest is the same as in Test 1.

# virsh migrate guest --live --copy-storage-all --unsafe qemu+ssh://10.19.112.39/system

On the target, the guest is running; its dumpxml is:
...
<maxMemory slots='16' unit='KiB'>10223616</maxMemory>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1000000</currentMemory>
<cpu mode='host-model'>
  <model fallback='forbid'/>
  <numa>
    <cell id='0' cpus='0-1' memory='524288' unit='KiB'/>
    <cell id='1' cpus='2-3' memory='524288' unit='KiB'/>
  </numa>
</cpu>

Check within the guest:

# cat /proc/meminfo
MemTotal: 979520 kB

Test 3. Memory hotplug not aligned to 256M

memoryplug.xml:

<memory model="dimm">
  <target>
    <size unit="KiB">500000</size>
    <node>0</node>
  </target>
  <source>
    <pagesize unit="KiB">64</pagesize>
    <nodemask>0</nodemask>
  </source>
</memory>

# virsh attach-device guest memoryplug.xml
Device attached successfully

# virsh dumpxml guest

<maxMemory slots='16' unit='KiB'>10223616</maxMemory>
<memory unit='KiB'>1572864</memory>
<currentMemory unit='KiB'>1000000</currentMemory>   ====> 1048576 ???

<devices>
  <memory model='dimm'>
    <source>
      <nodemask>0</nodemask>
      <pagesize unit='KiB'>64</pagesize>
    </source>
    <target>
      <size unit='KiB'>524288</size>
      <node>0</node>
    </target>
    <alias name='dimm0'/>
    <address type='dimm' slot='0' base='0x40000000'/>
  </memory>
</devices>

# cat /proc/meminfo
MemTotal: 1503808 kB (1503808 - 979520 = 524288)

Your comments on more test scenarios are appreciated.
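As a quick sanity check, the sizes reported by virsh dumpxml in the tests above are exactly the configured values rounded up to 256 MiB. A tiny standalone sketch, with a ROUND_UP macro invented for illustration:

```c
#include <assert.h>

#define ALIGN_KIB (256ULL * 1024)                        /* 256 MiB in KiB */
#define ROUND_UP(x) (((x) + ALIGN_KIB - 1) / ALIGN_KIB * ALIGN_KIB)

int main(void)
{
    assert(ROUND_UP(10000000) == 10223616);  /* <maxMemory>, as in dumpxml */
    assert(ROUND_UP(500000)   == 524288);    /* hotplugged DIMM in Test 3  */

    /* Guest MemTotal grew by exactly the rounded DIMM size. */
    assert(1503808 - 979520 == 524288);
    return 0;
}
```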
(In reply to David Gibson from comment #18)
> Peter,
>
> Does that mean we can consider the problem basically solved now?

Erm, I hope so :)

> Should I still work on adding a machine option upstream?

I don't think it will be necessary, well, unless somebody decides that rounding the memory doesn't suit them for some reason. Otherwise we'll just always do it from now on.

(In reply to Dan Zheng from comment #19)
> ...
> # virsh attach-device guest memoryplug.xml
> Device attached successfully
> # virsh dumpxml guest
>
> <maxMemory slots='16' unit='KiB'>10223616</maxMemory>
> <memory unit='KiB'>1572864</memory>
> <currentMemory unit='KiB'>1000000</currentMemory>   ====> 1048576 ???

currentMemory is handled via the balloon driver, which does not have the alignment issue.

> <devices>
>   <memory model='dimm'>
>     <source>
>       <nodemask>0</nodemask>
>       <pagesize unit='KiB'>64</pagesize>
>     </source>
>     <target>
>       <size unit='KiB'>524288</size>
>       <node>0</node>
>     </target>
>     <alias name='dimm0'/>
>     <address type='dimm' slot='0' base='0x40000000'/>
>   </memory>
> </devices>
>
> # cat /proc/meminfo
> MemTotal: 1503808 kB (1503808 - 979520 = 524288)

(In reply to Dan Zheng from comment #20)
> Your comments on more test scenarios are appreciated.

I currently can't think of anything else to test in this regard.

Based on comment 22 and comment 19, I mark this bug as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html