| Summary: | Xen Guest Memory Capacity Wrong in 6.2 kernels |
|---|---|
| Product: | Red Hat Enterprise Linux 6 |
| Component: | kernel |
| Version: | 6.2 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Status: | CLOSED NOTABUG |
| Severity: | unspecified |
| Priority: | unspecified |
| Reporter: | Kevin Stange <kevin> |
| Assignee: | Xen Maintainance List <xen-maint> |
| QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| CC: | drjones, imammedo, lersek, pbonzini |
| Target Milestone: | rc |
| Target Release: | --- |
| Whiteboard: | xen |
| Doc Type: | Bug Fix |
| Last Closed: | 2012-01-02 11:54:54 UTC |
| Bug Blocks: | 523117 |
| Attachments: | 550085: 2.6.32-131.21.1.el6.x86_64 dmesg output; 550086: 2.6.32-220.2.1.el6.x86_64 dmesg output |
Description (Kevin Stange, 2011-12-29 18:57:28 UTC)
I should note as well that a 5.7 kernel shows this same guest with the actual correct amount of RAM. All 6.x kernels report less. Under 2.6.18-274.7.1.el5xen:

```
MemTotal:        262144 kB
MemFree:          51016 kB
Buffers:           5912 kB
Cached:           50892 kB
SwapCached:           0 kB
Active:           42968 kB
Inactive:         40788 kB
HighTotal:            0 kB
HighFree:             0 kB
LowTotal:        262144 kB
LowFree:          51016 kB
SwapTotal:      1048568 kB
SwapFree:       1048568 kB
Dirty:                0 kB
Writeback:            0 kB
AnonPages:        27036 kB
Mapped:            9080 kB
Slab:             13148 kB
PageTables:        2540 kB
NFS_Unstable:         0 kB
Bounce:               0 kB
CommitLimit:    1179640 kB
Committed_AS:    124912 kB
VmallocTotal: 34359738367 kB
VmallocUsed:       2640 kB
VmallocChunk: 34359735727 kB
```

---

Hello Kevin,

could you provide dmesg output from the guest in both cases?

Assuming these are PV guests, we can at least partially blame a
configuration patch we put in for 6.2 (it went into the -162 kernel):
[virt] xen: bump memory limit for x86_64 domU PV guest to 128Gb
This patch allows PV guests to have up to 128 GB of memory in their config, but
also uses an extra 384 kB for the memory map. Assuming it did indeed only need
an extra 384 kB, then we're still missing almost 36 MB. So we should double
check it.
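As a quick sanity check on that figure, using the MemTotal values Kevin reports later in this thread for the 256 MB guest (a back-of-the-envelope sketch, not part of the original comment):

```sh
# The 6.1 kernel (-131.21.x) reported 242460 kB; the 6.2 kernel (-220.2.1)
# reported 205384 kB. The shortfall dwarfs the ~384 kB expected for the
# larger memory map:
echo $(( 242460 - 205384 ))           # 37076 kB missing
echo $(( (242460 - 205384) / 1024 ))  # ~36 MB, the figure quoted above
```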
Please attach both guest xen config files. Also add ignore_loglevel to the
guests' kernel command lines and capture full dmesg outputs from both the 6.1
guest and the 6.2 guest immediately after booting them up.
Just to note, though: while missing 14% of the memory is indeed a bug that we'd
like to figure out, RHEL 6.x guests (6.1, 6.2, etc.) aren't supported with less
than 512 MB of RAM.
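A minimal sketch of one way to carry out the capture requested above (the grub.conf path assumes a RHEL 6 PV guest booted via pygrub, as in this bug; everything here is illustrative):

```sh
# In the guest: append ignore_loglevel to each kernel line in the guest's own
# grub.conf (pygrub reads this file at domain start):
sed -i '/^[[:space:]]*kernel/ s/$/ ignore_loglevel/' /boot/grub/grub.conf

# After rebooting the domain, save the full boot log while it is still in the
# kernel ring buffer:
dmesg > /root/dmesg-$(uname -r).txt
```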
---

Here's my Xen configuration file. This file doesn't change at all when I upgrade the kernel and reboot. I have been testing all of these changes within the same guest configuration to validate that it's not related to minor configuration differences.

```
bootloader = "/usr/bin/pygrub"
maxmem = "4096"
vcpu_avail = "3"
vcpus = "16"
memory = "256"
name = "[hidden]"
vif = [ "mac=[hidden], bridge=[hidden], ip=[hidden], vifname=[hidden]" ]
disk = [ "phy:[hidden],sda1,w", "phy:[hidden],sda2,w" ]
vfb = [ "type=vnc,vncpasswd=[hidden]" ]
```

I'll attach the dmesg output as files due to the length.

---

As a matter of interest, I wanted to see how much RAM was lost on guests larger than 512 MB as well, to make this a "supported" problem. Here's how it ended up:

| Expected | rev-131.21.2 | rev-220.2.1 | Loss |
|---|---|---|---|
| 256 MB | 242460 kB | 205384 kB | 15% |
| 512 MB | 499980 kB | 310300 kB | 38% |
| 1024 MB | 1015024 kB | 669016 kB | 34% |
| 2048 MB | 2045136 kB | 1550352 kB | 24% |

There doesn't seem to be any scaling factor for this behavior. The numbers are the same on every reboot. In this case, the config file changed as follows:

```
512 MB:  maxmem = "8192"   memory = "512"
1024 MB: maxmem = "16384"  memory = "1024"
2048 MB: maxmem = "32768"  memory = "2048"
```

Let me know if I can provide any more information to help find the problem.

---

Created attachment 550085 [details]
2.6.32-131.21.1.el6.x86_64 dmesg output
---

Created attachment 550086 [details]
2.6.32-220.2.1.el6.x86_64 dmesg output
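For anyone reproducing numbers like those in the table above, the guest-visible total can be checked against what dom0 thinks it handed out. A sketch (the domain name "myguest" is a placeholder, not from this bug):

```sh
# In dom0: the Mem(MiB) column shows what is currently allocated to the domain.
xm list myguest

# In the guest: MemTotal is what remains after the kernel's own reservations;
# this is the number quoted in the table above.
grep MemTotal /proc/meminfo
```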
---

Hello Kevin, can you please try lowering the "maxmem" setting in your VM config? Bug 523122 comes to mind. RHEL-6.2 should be more cautious/conservative than RHEL-6.1: it should prepare for ballooning up later on from the initial memory setting, and that may require more reserved memory up front. Since you don't mention encountering the ballooning problem (bug 523122) under 6.1, I believe you may not have actually used more than the initial "memory" setting in your guests. If so, please consider lowering "maxmem" and reporting back about the results. Thanks.

---

Upon lowering maxmem to 4096 on my 2048 MB guest, I have 1880092 kB available; lowering it to 2048 on the same guest, I have 2045020 kB available, which is about the same as the 6.1 kernel's memory allocation. The scaling of this number is set automatically by the management software I'm using, so it's not easy to get around. Setting maxmem = memory seems to mitigate the problem, but it prevents ballooning the guest's RAM entirely; it's the only way to ensure that the full value of "memory" is initially available.

Does this qualify as a bug, or is this somehow working as intended? What is the kernel reserving such a huge percentage of the existing "memory" value for, given that ballooning provides additional memory from domain-0 on demand, up to "maxmem"?

---

It is working as intended. The kernel has to reserve memory for page structures up to the maxmem that you specify.

---

If this is working as intended, why does the 5.7 kernel reserve no memory in advance (at least that I can tell) and still manage to balloon to a larger memory space when needed? Put another way, what was broken that this new behavior of reserving a lot of RAM fixes, and why can't the kernel allocate those page structures on demand? Also, why does my physical server with 24 GB of RAM running the same kernel apparently not need to reserve any memory for paging structures, showing the full 24 GB available to applications?

---

(In reply to comment #13)
> If this is working as intended, why is it that the 5.7 kernel reserves no
> memory in advance (at least that I can tell) and still manages to be able to
> balloon to a larger memory space when needed? Put another way, what was broken
> that this new behavior of reserving a lot of RAM fixes, and why can't the
> kernel allocate those page structures on demand?

If you look at the "diff of dmesgs" attachment, you'll notice that it's not only page tables consuming more memory; other subsystems reserve additional memory as well. And the memory subsystem/ballooning implementations of the xenified 2.6.18 and pvops 2.6.32 kernels are quite different, so they can't be compared as-is. The ballooning feature introduced in 6.2 is nothing new; it's the way the current pvops upstream kernel behaves as a guest.

A possible solution to this issue might be to use memory hot-plug; see bug 523122 comment 10. You can try out whether that approach works better for your case.

---

Is there a particular patch we could find and revert, building a custom kernel RPM, which would return the behavior to how it worked in 6.1? Would this be a terrible idea? What does this adjustment in 6.2 fix, exactly?

---

Disregard; reading the other bug more completely, I realize that 6.2 is the first release in the 6.x series that can handle ballooning properly. I attempted to find a way to make use of memory hotplugging, but I can't find user-space tools which can actually invoke hotplug rather than ballooning. Are you aware of how this is supposed to be invoked?
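For context on "invoked in the same way in the host" in the reply below: ballooning is driven from dom0 with the xm memory commands, and the workaround discussed above is a config change. A sketch (domain name and sizes are placeholders, not from this bug):

```sh
# Guest config workaround: maxmem equal to memory avoids the up-front
# reservation, at the cost of never being able to balloon above boot size:
#   memory = "2048"
#   maxmem = "2048"

# With maxmem left higher, dom0 balloons the running guest like this:
xm mem-set myguest 2048   # adjust the current allocation (MiB)
xm mem-max myguest 4096   # adjust the ceiling the guest may balloon up to
```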
---

Hotplugging is simply a completely different implementation of ballooning. It is invoked in the same way in the host.

---

In that case, how are you suggesting that I try using hotplugging instead of ballooning in the guest? Do I need to modify the kernel itself, or change something via sysctl or otherwise?

---

You will certainly have to rebuild the kernel. If you are not sure what to do, it's better to ask the author of the hot-plug ballooning patches; he might give you the latest patches and could probably help with getting them running and tested on your system.

PS: If you manage to run/test memory hot-plug ballooning, it would be nice to have feedback on it. At the least, it might help the author push it into the upstream kernel if it behaves better than the current implementation.

PS2: As for reverting to the no-ballooning behaviour, you can just set maxmem == memory, or revert the patches from bug 523122.
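If the hot-plug ballooning patches were built into a guest kernel, the memory sections they add would still typically need to be onlined through the standard Linux memory-hotplug sysfs interface. A sketch of that guest-side step (section layout varies per system, and this assumes the patched kernel exposes hot-added memory this way):

```sh
# In the guest: bring any hot-added (still offline) memory sections online.
for section in /sys/devices/system/memory/memory*/state; do
    grep -q offline "$section" && echo online > "$section"
done
```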