Bug 231312
Summary: reproducible stack overflow with trivial test program
Product: Red Hat Enterprise Linux 5
Component: kernel
Status: CLOSED ERRATA
Severity: high
Priority: medium
Version: 5.0
Hardware: ppc64
OS: Linux
Reporter: Paul Clements <paul.clements>
Assignee: Dave Anderson <anderson>
QA Contact: Martin Jenner <mjenner>
CC: dzickus, james.bottomley, smoser
Fixed In Version: RHBA-2007-0959
Doc Type: Bug Fix
Last Closed: 2007-11-07 19:43:03 UTC

Description
Paul Clements
2007-03-07 17:05:21 UTC
Created attachment 149467 [details]
test harness (shell script)
Created attachment 149469 [details]
test program (C source)
compile with: gcc a.c -o c -lstdc++
The cause of this is the RHEL ppc64 kernel having 64k pages on by default. The problem is in fs/binfmt_elf.c:randomize_stack_top(), which has this code:

    #ifndef STACK_RND_MASK
    #define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12)) /* 8MB of VA */
    #endif

    static unsigned long randomize_stack_top(unsigned long stack_top)
    {
        unsigned int random_variable = 0;

        if ((current->flags & PF_RANDOMIZE) &&
            !(current->personality & ADDR_NO_RANDOMIZE)) {
            random_variable = get_random_int() & STACK_RND_MASK;
            random_variable <<= PAGE_SHIFT;
        }
    #ifdef CONFIG_STACK_GROWSUP
        return PAGE_ALIGN(stack_top) + random_variable;
    #else
        return PAGE_ALIGN(stack_top) - random_variable;
    #endif
    }

If you have 64k pages, this makes your randomization 128MB. Coincidentally, in the new binary format, only 128MB is left between the top of process memory and the first mapping, so for a stack rlimit of < 128MB you stand a non-zero chance of randomizing your stack base away entirely and thus producing random crashes.

Sorry, that's code from the proposed fix on lkml. The true define is

    #define STACK_RND_MASK 0x7ff /* with 4K pages 8MB of VA */

The fix is now committed to mainline as

    commit d1cabd63262707ad5d6bb730f25b7a2852734595
    Author: James Bottomley <James.Bottomley>
    Date:   Fri Mar 16 13:38:35 2007 -0800

        [PATCH] fix process crash caused by randomisation and 64k pages

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
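As a rough check of the 8MB and 128MB figures quoted for STACK_RND_MASK above, the following sketch computes the size of the randomization window for a given mask and page shift. The function names are hypothetical and exist only for this illustration; the only values taken from the report are the mask 0x7ff, the scaled mask expression from the fix, and the page shifts 12 (4K pages) and 16 (64K pages).

```c
#include <stdint.h>

/*
 * Size in bytes of the window by which the stack top can be moved.
 * The largest offset is (mask << page_shift), so the window covers
 * (mask + 1) pages. Hypothetical helper, not kernel code.
 */
uint64_t stack_rnd_span(uint64_t mask, unsigned int page_shift)
{
    return (mask + 1) << page_shift;
}

/* Mask before the fix: a constant, whatever the page size is. */
uint64_t mask_before_fix(unsigned int page_shift)
{
    (void)page_shift;
    return 0x7ff;
}

/* Mask after the fix: shrinks as the page size grows (page_shift >= 12). */
uint64_t mask_after_fix(unsigned int page_shift)
{
    return 0x7ff >> (page_shift - 12);
}
```

With the old mask, stack_rnd_span(mask_before_fix(16), 16) comes out at 128MB, which matches the gap described above; with the scaled mask the window is 8MB for both 4K and 64K pages.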
Patch as put into 2.6.21-rc4-git2:

    diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
    index 51db118..a2fceba 100644
    --- a/fs/binfmt_elf.c
    +++ b/fs/binfmt_elf.c
    @@ -507,7 +507,7 @@ out:
     #define INTERPRETER_ELF 2
     
     #ifndef STACK_RND_MASK
    -#define STACK_RND_MASK 0x7ff /* with 4K pages 8MB of VA */
    +#define STACK_RND_MASK (0x7ff >> (PAGE_SHIFT - 12)) /* 8MB of VA */
     #endif
     
     static unsigned long randomize_stack_top(unsigned long stack_top)

I agree that the patch should be applied, but I cannot reproduce this. If compiled natively on the RHEL5 machine with gcc 4.1.1, it runs with no problem. But the test directions indicate that the test program is to be compiled on a RHEL3 machine with gcc 3.2.3-42. However, the closest I can come to that is a RHEL3 machine with gcc 3.2.3-46:

    # cat /etc/redhat-release
    Red Hat Enterprise Linux AS release 3 (Taroon)
    [root@p630 root]# gcc --version
    gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-46)
    Copyright (C) 2002 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    # gcc a.c -o a -lstdc++
    # scp a [ to RHEL5 machine ]

But it does not run on the RHEL5 machine:

    # cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 5 (Tikanga)
    # ./a
    ./a: error while loading shared libraries: libstdc++.so.5: cannot open shared object file: No such file or directory
    #

So I compiled it without libstdc++ on the RHEL3 machine, but then it runs OK on the RHEL5 machine.
Perhaps without libstdc++, the libraries are not even close to the stack, since they seem to be moved to 64-bit space:

    10000 pid: 14058
    00100000-00120000 r-xp 00100000 00:00 0 [vdso]
    10000000-10010000 r-xp 00000000 fd:00 6520866 /root/testdir-3.2.3/a
    10010000-10020000 rw-p 00000000 fd:00 6520866 /root/testdir-3.2.3/a
    80eb5a0000-80eb5d0000 r-xp 00000000 fd:00 2031917 /lib64/ld-2.5.so
    80eb5d0000-80eb5e0000 r--p 00020000 fd:00 2031917 /lib64/ld-2.5.so
    80eb5e0000-80eb5f0000 rw-p 00030000 fd:00 2031917 /lib64/ld-2.5.so
    80eb5f0000-80eb770000 r-xp 00000000 fd:00 2031918 /lib64/libc-2.5.so
    80eb770000-80eb780000 r--p 00180000 fd:00 2031918 /lib64/libc-2.5.so
    80eb780000-80eb790000 rw-p 00190000 fd:00 2031918 /lib64/libc-2.5.so
    80eb790000-80eb7a0000 rw-p 80eb790000 00:00 0
    ffffc000000-ffffc150000 rw-p ffffc000000 00:00 0 [stack]

Maybe you could attach your "a" binary, in case it's different from the one I'm creating? Although I don't see how it will get around the "libstdc++.so.5" error. I tried creating a symbolic link from the current libstdc++ version to libstdc++.so.5 like so:

    # cd /usr/lib
    # ls -l libstdc*
    lrwxrwxrwx 1 root root 18 May 3 11:59 libstdc++.so.5 -> libstdc++.so.6.0.8
    lrwxrwxrwx 1 root root 18 May 3 08:42 libstdc++.so.6 -> libstdc++.so.6.0.8
    -rwxr-xr-x 1 root root 1187328 Jan 17 20:24 libstdc++.so.6.0.8
    #

But I still get the same error. (???) How did you get it all to work in your environment?

You need to install the compat-libstdc++ package on your RHEL5 machine: compat-libstdc++-33-3.2.3-61

Ok, I first installed compat-libstdc++-33-3.2.3-61.ppc.rpm, but "a" still fails with the "error while loading shared libraries: libstdc++.so.5".
So I installed compat-libstdc++-33-3.2.3-61.ppc64.rpm as well, and now "a" loads and runs fine, but presumably only because it's in 64-bit space:

    10000 pid: 14841
    00100000-00120000 r-xp 00100000 00:00 0 [vdso]
    10000000-10010000 r-xp 00000000 fd:00 6520867 /root/testdir-3.2.3/a
    10010000-10020000 rw-p 00000000 fd:00 6520867 /root/testdir-3.2.3/a
    80eb5a0000-80eb5d0000 r-xp 00000000 fd:00 2031917 /lib64/ld-2.5.so
    80eb5d0000-80eb5e0000 r--p 00020000 fd:00 2031917 /lib64/ld-2.5.so
    80eb5e0000-80eb5f0000 rw-p 00030000 fd:00 2031917 /lib64/ld-2.5.so
    80eb5f0000-80eb770000 r-xp 00000000 fd:00 2031918 /lib64/libc-2.5.so
    80eb770000-80eb780000 r--p 00180000 fd:00 2031918 /lib64/libc-2.5.so
    80eb780000-80eb790000 rw-p 00190000 fd:00 2031918 /lib64/libc-2.5.so
    80eb790000-80eb7a0000 rw-p 80eb790000 00:00 0
    80eb7d0000-80eb890000 r-xp 00000000 fd:00 2031697 /lib64/libm-2.5.so
    80eb890000-80eb8a0000 r--p 000b0000 fd:00 2031697 /lib64/libm-2.5.so
    80eb8a0000-80eb8b0000 rw-p 000c0000 fd:00 2031697 /lib64/libm-2.5.so
    80eb9d0000-80eb9f0000 r-xp 00000000 fd:00 2031926 /lib64/libgcc_s-4.1.1-20070105.so.1
    80eb9f0000-80eba00000 rw-p 00010000 fd:00 2031926 /lib64/libgcc_s-4.1.1-20070105.so.1
    40000010000-40000120000 r-xp 00000000 fd:00 4574268 /usr/lib64/libstdc++.so.5.0.7
    40000120000-40000140000 rw-p 00110000 fd:00 4574268 /usr/lib64/libstdc++.so.5.0.7
    40000140000-40000150000 rw-p 40000140000 00:00 0
    ffffab80000-ffffacd0000 rw-p ffffab80000 00:00 0
    #

So, can you confirm that it should use the "ppc" package, and also attach your "a" executable?
> So, can you confirm that it should use the "ppc" package,
> and also attach your "a" executable?
Although the "ppc" package doesn't seem to make sense, because the
executable I built is looking in /usr/lib64:
# ldd /usr/tmp/a
linux-vdso64.so.1 => (0x0000000000100000)
libstdc++.so.5 => /usr/lib64/libstdc++.so.5 (0x0000040000010000)
libc.so.6 => /lib64/libc.so.6 (0x00000080eb5f0000)
libm.so.6 => /lib64/libm.so.6 (0x00000080eb7d0000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000080eb9d0000)
/lib64/ld64.so.1 (0x00000080eb5a0000)
#
# rpm2cpio compat-libstdc++-33-3.2.3-61.ppc64.rpm | cpio -t
./usr/lib64/libstdc++.so.5
./usr/lib64/libstdc++.so.5.0.7
#
# rpm2cpio compat-libstdc++-33-3.2.3-61.ppc.rpm | cpio -t
./usr/lib/libstdc++.so.5
./usr/lib/libstdc++.so.5.0.7
#
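The ldd and rpm2cpio listings above show the binary resolving libstdc++.so.5 from /usr/lib64, which suggests it was built as a 64-bit executable. One way to confirm that directly is to look at the class byte in the ELF identification header. The sketch below is only an illustration with a hypothetical helper name; it relies on the standard ELF layout (magic bytes 0x7f 'E' 'L' 'F', then the EI_CLASS byte, where 1 means 32-bit and 2 means 64-bit).

```c
#include <stddef.h>

/*
 * Inspect the first bytes of a file and report the ELF class.
 * Returns 32 or 64 for a valid ELF prefix, 0 if the bytes are not ELF
 * or the class is unknown. Hypothetical helper for illustration only.
 */
int elf_class_bits(const unsigned char *ident, size_t len)
{
    if (len < 5)
        return 0;
    if (ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F')
        return 0;

    switch (ident[4]) {        /* e_ident[EI_CLASS] */
    case 1:
        return 32;             /* ELFCLASS32 */
    case 2:
        return 64;             /* ELFCLASS64 */
    default:
        return 0;
    }
}
```

Feeding it the first five bytes of the test binary would distinguish a 64-bit build, which needs the ppc64 compat package, from a 32-bit one, which needs the ppc package.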
Ok, I re-compiled it on the RHEL3 machine:

    gcc -m32 a.c -o a -lstdc++

and now I can get it to core dump...

Just for documentation purposes, here's an example of a failure:

    ...
    10000 pid: 21750
    00100000-00120000 r-xp 00100000 00:00 0 [vdso]
    0fba0000-0fbc0000 r-xp 00000000 fd:00 8883491 /lib/libgcc_s-4.1.1-20070105.so.1
    0fbc0000-0fbd0000 rw-p 00010000 fd:00 8883491 /lib/libgcc_s-4.1.1-20070105.so.1
    0fc90000-0fd50000 r-xp 00000000 fd:00 8883490 /lib/libm-2.5.so
    0fd50000-0fd60000 r--p 000b0000 fd:00 8883490 /lib/libm-2.5.so
    0fd60000-0fd70000 rw-p 000c0000 fd:00 8883490 /lib/libm-2.5.so
    0fee0000-0ffa0000 r-xp 00000000 fd:00 4574266 /usr/lib/libstdc++.so.5.0.7
    0ffa0000-0ffb0000 rwxp 000c0000 fd:00 4574266 /usr/lib/libstdc++.so.5.0.7
    0ffc0000-0ffe0000 r-xp 00000000 fd:00 8883484 /lib/ld-2.5.so
    0ffe0000-0fff0000 r--p 00010000 fd:00 8883484 /lib/ld-2.5.so
    0fff0000-10000000 rw-p 00020000 fd:00 8883484 /lib/ld-2.5.so
    10000000-10010000 r-xp 00000000 fd:00 6520871 /root/testdir-3.2.3/a
    10010000-10020000 rwxp 00000000 fd:00 6520871 /root/testdir-3.2.3/a
    f7e60000-f7fc0000 r-xp 00000000 fd:00 8883485 /lib/libc-2.5.so
    f7fc0000-f7fd0000 r--p 00160000 fd:00 8883485 /lib/libc-2.5.so
    f7fd0000-f7fe0000 rw-p 00170000 fd:00 8883485 /lib/libc-2.5.so
    f8230000-f8380000 rw-p f8230000 00:00 0 [stack]
    limit 10
    limit 9
    limit 8
    limit 7
    limit 6
    limit 5
    limit 4
    ./a.sh: line 7: 21750 Segmentation fault (core dumped) $prog
    ERROR

The task ran with "ulimit -s 10000", so it could conceivably allow the stack to reach from a top of f8380000 down to f79bc000. That would put it way down into the no-man's land between the /root/testdir-3.2.3/a data region and the first region used by /lib/libc-2.5.so. It never made it that far, though: the DAR register shows 00000000F7FAEE50, which puts it in the non-writable libc-2.5.so segment between f7e60000-f7fc0000, causing the segmentation violation:

    # dmesg
    a/21750: potentially unexpected fatal signal 11.

    NIP: 00000000100004C0 LR: 0000000010000534 CTR: 00000000F7ED6380
    REGS: c00000003cf6bea0 TRAP: 0300 Not tainted (2.6.18-8.el5)
    MSR: 000000000000D032 <EE,PR,ME,IR,DR> CR: 40000482 XER: 00000000
    DAR: 00000000F7FAEE50, DSISR: 000000000A000000
    TASK = c00000003a565ae0[21750] 'a' THREAD: c00000003cf68000 CPU: 1
    GPR00: FFFFFFFFFFF85EC0 00000000F8028F90 000000000FFF9710 0000000000000008
    GPR04: 00000000F8026948 0000000000000008 0000000000000000 0000000000000000
    GPR08: 0000000000008000 0000000000000003 0000000000000000 0000000010010000
    GPR12: 00000000F8028F90 0000000010018A7C 0000000000000000 0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR24: 000000000FFCEB40 00000000F837FAE0 00000000F837FAF4 0000000000000001
    GPR28: 0000000000000000 000000000FFEF6D8 00000000F7FCFFF4 00000000F8028F90
    NIP [00000000100004C0] 0x100004c0
    LR [0000000010000534] 0x10000534

in 2.6.18-19.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0959.html
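For anyone re-checking the crash analysis in this report, the address arithmetic can be reproduced with a short sketch. It assumes that "ulimit -s" counts in units of 1024 bytes, and the helper names are hypothetical; the addresses (stack end f8380000, DAR F7FAEE50, libc text mapping f7e60000-f7fc0000) are taken from the failure example in this report.

```c
#include <stdint.h>

/*
 * Lowest address a downward-growing stack may reach, given the end of
 * the [stack] mapping and the stack rlimit in KiB (as set by ulimit -s).
 */
uint64_t lowest_stack_addr(uint64_t stack_end, uint64_t rlimit_kib)
{
    return stack_end - rlimit_kib * 1024;
}

/* True if addr lies inside the half-open mapping [start, end). */
int addr_in_mapping(uint64_t addr, uint64_t start, uint64_t end)
{
    return addr >= start && addr < end;
}
```

Under those assumptions, lowest_stack_addr(0xf8380000, 10000) gives 0xf79bc000, and the faulting address 0xF7FAEE50 is both above that limit and inside the read-only libc mapping, consistent with the segmentation fault described in the report.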