Bug 113830 - Random stack overruns with -lrt
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: kernel
Version: 1
Hardware: i586 Linux
Priority: high
Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brian Brock
Reported: 2004-01-19 01:20 EST by Kasper Dupont
Modified: 2007-11-30 17:10 EST
Last Closed: 2004-09-29 15:58:46 EDT
Description Kasper Dupont 2004-01-19 01:20:13 EST
Description of problem:
The maximum allowed stack depth is random, and when a program is
compiled with -lrt this maximum is sometimes unacceptably low. The
problem can be reproduced with the following program, which re-executes
itself until it eventually fails:

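#include <unistd.h>

/* Recurse i levels deep to consume stack space. */
void rek(int i)
{
  if (i) rek(i-1);
}

int main(int argc, char ** argv)
{
  rek(2000);
  /* Re-execute this same program; the variadic list needs a typed NULL. */
  execlp(argv[0], argv[0], (char *)NULL);
  return 1;  /* reached only if execlp fails */
}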

Version-Release number of selected component (if applicable):
glibc-2.3.2-101.4

How reproducible:
Always

Steps to Reproduce:
1. Compile program with -lrt
2. Run program
  
Actual results:
Segmentation fault

Expected results:
Program runs forever

Additional info:
The problem can sometimes be reproduced with as little as 50 stack
frames, but 2000 was chosen to reproduce the problem a little faster.
Comment 1 Jakub Jelinek 2004-01-19 03:42:59 EST
If you are on i586, then you are using LinuxThreads, not NPTL.
Now it depends on which exact LT you're using.
Can you ldd that program?
LinuxThreads in /lib/i686 (which is what should be used on i586)
doesn't limit the stack in any way, while LT in /lib limits it to
less than 2MB.  But if it fails with 50 frames, it means a kernel
bug.
Comment 2 Kasper Dupont 2004-01-19 11:36:41 EST
Output from ldd a.out:
        librt.so.1 => /lib/librt.so.1 (0x007ca000)
        libc.so.6 => /lib/libc.so.6 (0x001ae000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00400000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00199000)

There is no /lib/i686. The installed glibc is glibc-2.3.2-101.4.i386.rpm
Comment 3 Jakub Jelinek 2004-01-19 11:57:37 EST
Reassigning to kernel then.  setrlimit RLIMIT_STACK 2MB should mean
there are really 2MBs of stack allowed.
Comment 4 Kasper Dupont 2004-03-07 09:11:09 EST
The problem exists only with the i386 version of glibc. I found an i686
system where I was unable to reproduce the problem. After installing
the i386 version of glibc on this i686 system I was able to reproduce
the problem.

I have looked at the /proc/%d/maps pseudofile of the process just
before it dumps core.

00199000-001ab000 r-xp 00000000 21:05 46953      /lib/ld-2.3.2.so
001ab000-001ac000 rw-p 00011000 21:05 46953      /lib/ld-2.3.2.so
001ae000-002cc000 r-xp 00000000 21:05 46954      /lib/libc-2.3.2.so
002cc000-002cf000 rw-p 0011d000 21:05 46954      /lib/libc-2.3.2.so
002cf000-002d2000 rw-p 00000000 00:00 0
00400000-0040e000 r-xp 00000000 21:05 46992      /lib/libpthread-0.10.so
0040e000-0040f000 rw-p 0000e000 21:05 46992      /lib/libpthread-0.10.so
0040f000-00451000 rw-p 00000000 00:00 0
007ca000-007d0000 r-xp 00000000 21:05 47045      /lib/librt-2.3.2.so
007d0000-007d1000 rw-p 00005000 21:05 47045      /lib/librt-2.3.2.so
007d1000-007dc000 rw-p 00000000 00:00 0
08048000-08049000 r-xp 00000000 21:05 30695      /tmp/a.out
08049000-0804a000 rw-p 00000000 21:05 30695      /tmp/a.out
3ff13000-3ff14000 rw-p 00000000 00:00 0
3ff2d000-3ff2e000 rw-p 00000000 00:00 0
bfe09000-c0000000 rw-p fff07000 00:00 0

And here is one from a case where it doesn't dump.

00199000-001ab000 r-xp 00000000 21:05 46953      /lib/ld-2.3.2.so
001ab000-001ac000 rw-p 00011000 21:05 46953      /lib/ld-2.3.2.so
001ae000-002cc000 r-xp 00000000 21:05 46954      /lib/libc-2.3.2.so
002cc000-002cf000 rw-p 0011d000 21:05 46954      /lib/libc-2.3.2.so
002cf000-002d2000 rw-p 00000000 00:00 0
00400000-0040e000 r-xp 00000000 21:05 46992      /lib/libpthread-0.10.so
0040e000-0040f000 rw-p 0000e000 21:05 46992      /lib/libpthread-0.10.so
0040f000-00451000 rw-p 00000000 00:00 0
007ca000-007d0000 r-xp 00000000 21:05 47045      /lib/librt-2.3.2.so
007d0000-007d1000 rw-p 00005000 21:05 47045      /lib/librt-2.3.2.so
007d1000-007dc000 rw-p 00000000 00:00 0
08048000-08049000 r-xp 00000000 21:05 30695      /tmp/a.out
08049000-0804a000 rw-p 00000000 21:05 30695      /tmp/a.out
3ffa3000-3ffa4000 rw-p 00000000 00:00 0
3ffa9000-3ffaa000 rw-p 00000000 00:00 0
bfe72000-c0000000 rw-p fff1e000 00:00 0

Only the last three mappings differ, and there is a clear connection
between the location of the last mapping and the segmentation fault.
In the cases where the program dumps core, the start address of the
last mapping has been in the range bfe00000-bfe09000; in the cases
where the program works, it has been in the range bfe72000-bfeb6000.
Comment 5 Kasper Dupont 2004-03-14 07:32:53 EST
I tried to install kernel-2.4.20-30.9.i586.rpm from Red Hat Linux 9.
With this kernel I am unable to reproduce the problem. Looks like it
is the combination of glibc and kernel versions used in Fedora Core 1
that is causing the problem.
Comment 6 Kasper Dupont 2004-03-20 08:31:59 EST
The problem seems to be that the i386 version of glibc incorrectly
assumes the stack pointer will always be in the highest 2MB of virtual
address space. The kernel randomizes the stack location by moving
the stack pointer down by a random offset. The offset is picked from the
range 0-2MB, so sometimes the stack pointer will already be too low
when the program starts.

A workaround that seems to work is the following statement executed as
root:
echo 0 > /proc/sys/kernel/exec-shield-randomize 

But it will only work as long as the program uses less than 2MB of stack.

Should this bug be reassigned to glibc again? I don't see any symptoms
indicating this is a kernel bug.
Comment 7 Jakub Jelinek 2004-03-20 09:15:32 EST
glibc makes no incorrect assumptions in this regard.
The problem really is (or was, dunno) on the kernel side,
in that the randomization shouldn't be accounted into RLIMIT_STACK.

If you want to use more than 2MB of stack, then you either must not
use threads (and librt) at all, or need to use LinuxThreads FLOATING_STACKS
or NPTL.  The latter two aren't built in the i386 glibc, but you can
build them yourself if you want.  LinuxThreads !FLOATING_STACKS simply
has a limit of 2MB per stack.
Comment 8 Kasper Dupont 2004-03-20 10:38:52 EST
I don't think this is related to RLIMIT_STACK, since RLIMIT_STACK is
10MB by default on Fedora Core 1 and the problem already happens with
the stack pointer moved down approximately 2MB. You can use more than
2MB of stack as long as you don't call any glibc functions.

The main problem isn't that LinuxThreads has a 2MB stack limit; this
would be acceptable if the kernel didn't take a large fraction of
that. The problem is that the kernel takes a random amount of those
2MB of stack space before the application even starts. This turns
the 2MB of available stack space into a random amount of stack
space, which is less than 1MB on average and sometimes even less than
4KB.

The problem affects not only threaded programs. Any program using
POSIX shared memory is affected even if it doesn't use threads.

Right now the i386 version of glibc is incompatible with the kernel,
causing programs to crash at random. Even ls crashes sometimes.

Try this on a Fedora Core 1 system using the i386 version of glibc:
while ls ; do true ; done >/dev/null
Comment 9 Kasper Dupont 2004-03-21 18:02:24 EST
How do I build a glibc with NPTL? I tried adding i386 to the
nptlarches define at the start of the .spec file, but that didn't
compile. Then I tried writing i586 instead of i386 under nptlarches
and building with rpmbuild -ba --target=i386,i586 glibc.spec. This time
the build completed and I could install and use the glibc I had
just compiled, but nothing changed.
Comment 10 David Lawrence 2004-09-29 15:58:46 EDT
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
