Bug 1749633
| Field | Value |
|---|---|
| Summary | kernel: brk can grow the heap into the area reserved for the stack |
| Product | Red Hat Enterprise Linux 8 |
| Component | kernel |
| kernel sub component | Memory Management |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Version | 8.1 |
| Target Milestone | rc |
| Target Release | 8.1 |
| Hardware | All |
| OS | Linux |
| Fixed In Version | kernel-4.18.0-161.el8 |
| Reporter | DJ Delorie <dj> |
| Assignee | Waiman Long <llong> |
| QA Contact | Li Wang <liwan> |
| CC | ashankar, codonell, cye, dj, fweimer, llong, longman, mm-maint, mnewsome, pfrankli, qe-baseos-tools-bugs, skolosov |
| Clone Of | 1718844 |
| Last Closed | 2020-04-28 16:25:40 UTC |
| Type | Bug |
| Bug Blocks | 1747453 |
Description
DJ Delorie
2019-09-06 05:04:21 UTC
Even the “normal run” in bug 1718844 comment 2 is buggy:

    7fffc2120000-7fffc2130000 rw-p 00000000 00:00 0 [heap]
    7fffe7de0000-7fffe7e10000 rw-p 00000000 00:00 0 [stack]

There is just not enough room between the heap and the stack. It means that if the heap expands and then a deep recursion starts, consuming a lot of stack, the stack will overflow.

The reproducer from bug 1410097 comment 6 should apply here as well.

Enhanced reproducer with more autodetection. It should always exit with 0, but sometimes ASLR results in us not exercising the bug:

    #include <err.h>
    #include <limits.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    static char *first_address;

    enum { buffer_size = 4096 };

    /* Each call consumes roughly buffer_size bytes of stack.  */
    static void
    recurse (size_t depth, size_t maximum)
    {
      char buffer[buffer_size - 256];
      if (first_address == NULL)
        first_address = buffer;
      /* %zu matches size_t; %p expects a void * argument.  */
      printf ("depth %zu stack address %p distance %td bytes\n",
              depth, (void *) buffer, first_address - buffer);
      if (depth < maximum)
        recurse (depth + 1, maximum);
      asm volatile ("" ::: "memory"); /* Prevent tail recursion.  */
    }

    int
    main (void)
    {
      struct rlimit rlim;
      if (getrlimit (RLIMIT_STACK, &rlim) != 0)
        err (1, "getrlimit (RLIMIT_STACK)");
      if (rlim.rlim_cur == RLIM_INFINITY)
        {
          printf ("warning: stack size set to unlimited, cannot test\n");
          return 0;
        }

      /* Reserve 32 KiB for the printf call.  */
      size_t recursion_depth = (rlim.rlim_cur - 32 * 1024) / buffer_size;
      printf ("info: stack size %zu bytes, recursion depth %zu\n",
              (size_t) rlim.rlim_cur, recursion_depth);

      void *initial_heap_ptr = malloc (1);
      printf ("info: address on stack: %p\n", (void *) &rlim);
      printf ("info: address on heap: %p\n", initial_heap_ptr);
      if ((uintptr_t) &rlim < (uintptr_t) initial_heap_ptr)
        {
          printf ("warning: stack is below heap, cannot test\n");
          return 0;
        }

      size_t gap_size = (uintptr_t) &rlim - (uintptr_t) initial_heap_ptr;
      printf ("info: stack/heap gap size: %zu bytes\n", gap_size);
      /* 16 GiB guards against excessive execution time.  */
      if (gap_size > SSIZE_MAX || gap_size > 16LL * 1024 * 1024 * 1024)
        {
          printf ("warning: stack/heap gap is too large, cannot test\n");
          return 0;
        }

      /* Try to fill the gap between heap and stack.  */
      void **heap_filler = 0;
      size_t total = 0;
      while (true)
        {
          size_t sz = 16000;
          void **next = malloc (sz);
          if (next == NULL)
            {
              printf ("info: memory allocation failure\n");
              break;
            }
          if ((uintptr_t) next < (uintptr_t) initial_heap_ptr)
            {
              printf ("info: malloc returned %p, below initial allocation\n",
                      (void *) next);
              break;
            }
          if ((uintptr_t) next > (uintptr_t) &rlim)
            {
              printf ("info: malloc returned %p, above stack\n", (void *) next);
              break;
            }
          total += sz;
          /* Chain the allocations so they all stay reachable.  */
          *next = heap_filler;
          heap_filler = next;
        }
      printf ("info: heap-allocated %zu bytes\n", total);

      recurse (0, recursion_depth);
      return 0;
    }

Success looks like this:

    info: stack size 8388608 bytes, recursion depth 2040
    info: address on stack: 0x7fffdb881060
    info: address on heap: 0x23e006b0
    info: stack/heap gap size: 140736274631088 bytes
    warning: stack/heap gap is too large, cannot test

Or a recursion without a crash.
Failure looks like this:

    info: stack size 8388608 bytes, recursion depth 2040
    info: address on stack: 0x7ffc55f21500
    info: address on heap: 0x7ffb9bd2d6b0
    info: stack/heap gap size: 3122609744 bytes
    info: malloc returned 0x7ffb99ca3010, below initial allocation
    info: heap-allocated 3118304000 bytes
    depth 0 stack address 0x7ffc55f205e0 distance 0 bytes
    depth 1 stack address 0x7ffc55f1f6c0 distance 3872 bytes
    depth 2 stack address 0x7ffc55f1e7a0 distance 7744 bytes
    depth 3 stack address 0x7ffc55f1d880 distance 11616 bytes
    depth 4 stack address 0x7ffc55f1c960 distance 15488 bytes
    depth 5 stack address 0x7ffc55f1ba40 distance 19360 bytes
    depth 6 stack address 0x7ffc55f1ab20 distance 23232 bytes
    depth 7 stack address 0x7ffc55f19c00 distance 27104 bytes
    depth 8 stack address 0x7ffc55f18ce0 distance 30976 bytes
    depth 9 stack address 0x7ffc55f17dc0 distance 34848 bytes
    depth 10 stack address 0x7ffc55f16ea0 distance 38720 bytes
    depth 11 stack address 0x7ffc55f15f80 distance 42592 bytes
    depth 12 stack address 0x7ffc55f15060 distance 46464 bytes
    depth 13 stack address 0x7ffc55f14140 distance 50336 bytes
    depth 14 stack address 0x7ffc55f13220 distance 54208 bytes
    depth 15 stack address 0x7ffc55f12300 distance 58080 bytes
    depth 16 stack address 0x7ffc55f113e0 distance 61952 bytes
    depth 17 stack address 0x7ffc55f104c0 distance 65824 bytes
    depth 18 stack address 0x7ffc55f0f5a0 distance 69696 bytes
    depth 19 stack address 0x7ffc55f0e680 distance 73568 bytes
    depth 20 stack address 0x7ffc55f0d760 distance 77440 bytes
    depth 21 stack address 0x7ffc55f0c840 distance 81312 bytes
    depth 22 stack address 0x7ffc55f0b920 distance 85184 bytes
    depth 23 stack address 0x7ffc55f0aa00 distance 89056 bytes
    depth 24 stack address 0x7ffc55f09ae0 distance 92928 bytes
    depth 25 stack address 0x7ffc55f08bc0 distance 96800 bytes
    depth 26 stack address 0x7ffc55f07ca0 distance 100672 bytes
    depth 27 stack address 0x7ffc55f06d80 distance 104544 bytes
    depth 28 stack address 0x7ffc55f05e60 distance 108416 bytes
    depth 29 stack address 0x7ffc55f04f40 distance 112288 bytes
    depth 30 stack address 0x7ffc55f04020 distance 116160 bytes
    depth 31 stack address 0x7ffc55f03100 distance 120032 bytes
    depth 32 stack address 0x7ffc55f021e0 distance 123904 bytes
    Segmentation fault (core dumped)

If you run the test in a loop, you will eventually see the failure on x86-64, too, so it is not a bug specific to ppc64le. It is just much, much, much more likely to trigger there.

I cannot reproduce this with upstream 5.4-rc6, on either x86-64 or ppc64le. It looks like ld.so is mapped so low that this is no longer a problem on ppc64le:

    info: stack size 8388608 bytes, recursion depth 2040
    info: address on stack: 0x7ffff21d7b20
    info: address on heap: 0x121cd06b0
    info: stack/heap gap size: 140732393354352 bytes
    warning: stack/heap gap is too large, cannot test

Likewise on x86-64:

    info: stack size 8388608 bytes, recursion depth 2040
    info: address on stack: 0x7ffd5a12cc10
    info: address on heap: 0x5555570296b0
    info: stack/heap gap size: 46901094266208 bytes

It seems that in both cases, ld.so (which determines the start of the heap via the end of its data segment) is mapped similarly to a PIE program, which actually makes sense.

The tested RHEL kernels (with the failure) are kernel-4.18.0-147.12.el8.ppc64le and kernel-4.18.0-147.12.el8.x86_64.

I cannot reproduce this issue with kernel-3.10.0-1110.el7.x86_64, so technically, this qualifies as a RHEL 8 regression, I think.

Oops. To clarify, the reproducer in comment 3 needs to be run with an explicit loader invocation to tickle the bug, like this:

    /lib64/ld-linux-x86-64.so.2 ./a.out

Explicit loader invocations are not entirely obscure because they provide features not otherwise available, like non-inheriting LD_PRELOAD (see bug 1747453).

longman, care to take a look? It looks like the bug is pretty well explained and has a reproducer. I think this falls into your area rather than mine :)

P.
(In reply to Prarit Bhargava from comment #8)
> longman, care to take a look? It looks like the bug is pretty well
> explained and has a reproducer. I think this falls into your area rather
> than mine :)

Sure. I will take a look at this BZ.

-Longman

Actually, I am not able to reproduce the problem with the reproducer.

On an x86-64 system with the 4.18.0-147.12.el8.x86_64 kernel:

    info: stack size 8388608 bytes, recursion depth 2040
    info: address on stack: 0x7ffc9804d600
    info: address on heap: 0x1fa56b0
    info: stack/heap gap size: 140722825756496 bytes
    warning: stack/heap gap is too large, cannot test

On a ppc64le system with the 4.18.0-147.el8.ppc64le kernel:

    info: stack size 8388608 bytes, recursion depth 2040
    info: address on stack: 0x7fffdc75cc38
    info: address on heap: 0x36f80670
    info: stack/heap gap size: 140735969871304 bytes
    warning: stack/heap gap is too large, cannot test

Is there any special compilation or loader option that is used to build the reproducer? Also, was any special kernel boot command-line option added (/proc/cmdline)?

-Longman

(In reply to Waiman Long from comment #10)
> Actually, I am not able to reproduce the problem with the reproducer.

Hmmph. Did you use an explicit loader invocation? How many times have you run this test? Thanks.

(In reply to Florian Weimer from comment #11)
> (In reply to Waiman Long from comment #10)
> > Actually, I am not able to reproduce the problem with the reproducer.
>
> Hmmph. Did you use an explicit loader invocation? How many times have you
> run this test? Thanks.

Yes, you are right. I forgot to use the explicit loader invocation. Even then,

-Longman

You need to run it in a loop, like this:

    while /lib64/ld-linux-x86-64.so.2 ./a.out ; do : ; done

Does your POWER system use the radix MMU? Mine did; maybe that makes the failure much more likely?
I have found the upstream commit that will fix this bug:

    commit bbdc6076d2e5d07db44e74c11b01a3e27ab90b32
    Author: Kees Cook <keescook>
    Date:   Tue May 14 15:43:57 2019 -0700

        binfmt_elf: move brk out of mmap when doing direct loader exec

I will backport this commit to RHEL8.

-Longman

Patch(es) available on kernel-4.18.0-161.el8

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:1769