Bug 132947
| Field | Value |
|---|---|
| Summary | [PATCH] kernel memory leak on x86_64 in 32/64 mixed mode |
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | 3 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED RAWHIDE |
| Severity | high |
| Priority | medium |
| Reporter | Axel Thimm <axel.thimm> |
| Assignee | Jim Paradis <jparadis> |
| CC | barryn, benny+bugzilla, bryans, davej, dgunchev, oliva, peterm, stesmi, wtogami |
| Doc Type | Bug Fix |
| Last Closed | 2004-11-03 19:49:39 UTC |
| Bug Blocks | 130887, 135876 |

Description
Axel Thimm
2004-09-20 10:01:21 UTC
http://people.redhat.com/arjanv/2.6/ Please see if it is still an issue with newer kernels from here.

Are there any hints these packages should have this leak fixed? Diffing 541 to it, and checking the Changelogs of the base kernels, I couldn't find anything addressing this. I'd love to test those kernels but they don't even boot on both my Tyan systems. The Oopsing I see seems to be an unrelated bug, so I'll file it separately.

Could someone@redhat verify this memory leak on x86_64 running 32 bit apps? Any longer configure script will eat up all your memory. Or you can try to rebuild the kernel, works also :(

In reply to comment #2:
> I'd love to test those kernels but they don't even boot on both my
> Tyan systems.

That was true for 582. 584 boots fine, but shows the same memory leak behaviour.

Does slabtop show any obvious signs of leakage?

When I checked it with 541, slabtop did not show any numbers summing up to anything near the 1GB that had leaked. Would that be the indication, or are there other scales relevant here?

Migrating to FC3t2 (problem still persists in FC2, but I hope FC3t2 gets more focus, especially through bug #130887).

It looks like it's a process creation/destruction issue, e.g. the memory leak seems to go with the number of generated processes and not process life time. That's why it is only visible in certain process generating scenarios like configure scripts or email scanners. With openoffice it would take you days to create as many 32 bit processes to detect the memory leak.

I definitely agree with that. I am currently running a 64bit FC2 521 kernel on my testserver with a 32bit mail antivirus scanner installed and every 8 or so days it OOMs. I have not started digging yet but the bug definitely exists. Turning off the antivirus scanner does not help at all.

This looks like it might be a page table leak of some kind. I brought up a system to text mode login (i.e. relatively quiescent) and logged in on two VTs.
On one of them I mounted a 32-bit FC root, chroot'ed to it, and mounted /proc under that. Then on each I just did several iterations of "cat /proc/meminfo". Every time the 32-bit "cat" ran, the PageTables entry increased by exactly 24 kB, whereas multiple iterations of the 64-bit "cat" didn't change the number at all. Continuing to investigate.

This is still a problem in 1.624 :-(

I've looked into it a little bit, adding debug statements before every increment and decrement of nr_page_table_pages. I found out that, when I run /lib/ld-linux.so.2 from an init=/bin/bash session, it leaks exactly one PTE page, allocated by install_arg_page, called from ia32_setup_arg_pages, called by load_elf32_binary (that's load_elf_binary, #define-renamed in arch/x86_64/ia32/ia32_binfmt.c). When I run /lib/libc.so.6, it leaks two PTE pages, one exactly as above, and one allocated by handle_mm_fault, called by inode_has_perm, called by do_page_fault, called by vma_prio_tree_insert, called by error_exit, called by __clear_user, called by __clear_user (presumably the stack dump just got confused because of the MMU exception), called by load_elf_interp, called by load_elf32_binary. AFAICT, the latter is zeroing out the BSS for libc.so, which had been previously allocated with a MAP_ANON mmap.

Unfortunately, I can't see anything particularly wrong with the way these PTEs are allocated, so presumably the problem is on the other end: whatever should be deallocating them isn't. I haven't investigated this possibility yet.

vsyscall32, which was my first suspicion, doesn't make any difference as far as the leak is concerned. It seems to be happening on static as well as dynamic executables, so it's nothing that ld-linux.so or libc is doing (not that I thought so).

Diff'ing 2.6.7 and fc3 doesn't show anything obvious to me, but I'll keep looking.

If I had to hazard a guess, I'd say it's something from the 64-bit executable that fails to be deallocated at exec() time.
The new definition of TASK_SIZE limits the maps considered valid for the executable, and I don't see as many ptes being freed with 2.6.9 as I do with 2.6.7 at exec time. The comment above the first SET_PERSONALITY in fs/binfmt_elf.c is particularly enlightening, and pretty much proves my theory is correct. Now on to figure out how to fix it.

Created attachment 105676
Patch that fixes the Fedora-local patch that introduces an ia32-compat memory leak in x86_64
As it turns out, the culprit is a Fedora-local patch:
linux-2.6.8-flexmmap-x86-64.patch, that modifies the way TASK_SIZE is defined.
I'm not entirely sure about why it breaks, since SET_PERSONALITY appears to be
defined in such a way that the TIF_IA32 flag is only set at the point it should
be, but it somehow still causes memory to leak. Since the modified TASK_SIZE
setting is not upstream, and it apparently is only necessary for exec-shield
randomized mmap (?), I came up with this patch for the patch we install at
build time. I've rebuilt vmlinuz on a tree on which I'd previously built 1.640
plus a few unrelated patches, booted into it, and am now half-way through a
32-bit-only GCC+binutils+GDB bootstrap, without apparent leaks. Yay!

flexmmap has nothing to do with execshield; it is about getting the maximum use out of the virtual address space, and in this case also compatibility with our 32 bit distro.

Right, but I was concerned that the upstream definition of TASK_SIZE might break exec-shield randomization, since, without the 32-bit-limiting TASK_SIZE, exec-shield might choose memory addresses for the executable or the dynamic loader that were not within the 32-bit address space. As it turns out, it apparently doesn't, but it now occurs to me that this box is already fully prelinked, so maybe that would hide any problem in randomizing the dynamic loader load location. But then, TASK_SIZE *is* overridden to 0xffffffff in ia32_binfmt.c, so perhaps that would be enough to avoid trouble.

I'm also experiencing this same problem on 2.6.8-1.521smp x86_64 on FC2. Is there a planned update from Red Hat that will resolve this problem? Also, does the 2.6.9-1.640smp kernel in the development tree have this fixed?

Alexandre, thanks for spotting and fixing this! I have rebuilt FC2 kernels with your fix and voila, x86_64 can run i386 binaries w/o leaking again! This is just a verification on the FC2 platform. I hope this fix makes it not only to rawhide/fc3, but also to the next FC2 kernel errata. Thanks again, it was quite a painful bug ... :)

Bryan, just install the 2.6.8-1.521 src.rpm, go to where the sources/patches were extracted (usually /usr/src/redhat/SOURCES), apply Alexandre's patch in attachment (id=105676), and rebuild the kernel rpm.

*** Bug 137518 has been marked as a duplicate of this bug. ***

Will be fixed in next build.

2.6.9-1.667 does indeed fix the problem for me. As far as I'm concerned this bug can be closed.
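
For anyone who wants to re-check a kernel for this leak, the "cat /proc/meminfo" experiment described earlier in this thread could be automated roughly as follows. This is only a sketch, not something anyone in the thread ran: it assumes a Python 3 interpreter is available, the function names (`pagetables_kb`, `read_pagetables_kb`, `leak_per_run_kb`) are made up for illustration, and because PageTables is a system-wide counter the result is only meaningful on an otherwise quiescent machine.

```python
import re
import subprocess


def pagetables_kb(meminfo_text):
    """Return the PageTables value (in kB) from /proc/meminfo-style text."""
    # Anchor at line start so similarly named fields are not matched.
    match = re.search(r"^PageTables:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no PageTables line found")
    return int(match.group(1))


def read_pagetables_kb(path="/proc/meminfo"):
    """Read the current system-wide PageTables value in kB."""
    with open(path) as f:
        return pagetables_kb(f.read())


def leak_per_run_kb(command, runs=100):
    """Run `command` `runs` times and return the average PageTables growth per run, in kB.

    Assumption: nothing else on the system is creating or destroying
    processes while this runs; otherwise the delta is just noise.
    """
    before = read_pagetables_kb()
    for _ in range(runs):
        subprocess.run(command, stdout=subprocess.DEVNULL, check=False)
    after = read_pagetables_kb()
    return (after - before) / runs
```

Calling `leak_per_run_kb` with the path of a 32-bit binary (for example a `cat` taken from a 32-bit root, as in the experiment above) and then with its 64-bit counterpart gives two numbers to compare. On an affected kernel one would expect the 32-bit figure to stay clearly positive while the 64-bit one hovers around zero.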