Bug 88757 - random sigsegs crashes in lots of programs right after installing
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: glibc
Version: 9
Hardware: All Linux
Priority: high
Severity: high
Assigned To: Jakub Jelinek
QA Contact: Brian Brock
Reported: 2003-04-13 05:59 EDT by Bohdan Vlasyuk
Modified: 2005-10-31 17:00 EST
Last Closed: 2003-10-03 04:41:11 EDT
Description Bohdan Vlasyuk 2003-04-13 05:59:30 EDT
Description of problem:

After installing rh9 (mediacheck ok!) over a previously working rh8,
most applications misbehave. The symptoms include incorrect error messages
(for example, "awk '{print $1}'" says "cannot open file [garbage]") and
crashes (stack traces for "ls" and "awk" are shown below). I'm not sure
whether things worked during the installation stage; when I try to
reinstall, it fails miserably even to create a boot disk (also due to awk
errors).

I'm not sure which other programs fail to work; mount is the only one I
remember right now. But the question should probably be "which programs
actually *work*". "vim" is one that does.

I suspect it's a glibc problem, because similar symptoms appear in most
programs that I could run.

I've tried disabling NPTL with LD_ASSUME_KERNEL (as described in the release
notes) and via the kernel command line. Neither had any effect.

How reproducible:

The crashes happen every time, and the error messages and [garbage] are
identical on each run.

I don't have the resources to reproduce it from scratch (for example,
installing rh8 anew and then rh9), and it might not be reproducible
elsewhere because of my custom settings.
    
Actual results:

ls says "Segmentation violation"
awk fails to work

Expected results:

ls should output directory listing
awk should work

Additional info:

The stack trace:
ls:

GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(no debugging symbols found)...
(gdb) Starting program: /bin/ls
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x51ffffff in ?? ()
(gdb) #0  0x51ffffff in ?? ()
#1  0x08052fb7 in strcpy ()
#2  0x0804d4c8 in strcpy ()
#3  0x0804d7a1 in strcpy ()
#4  0x0804d8f1 in strcpy ()
#5  0x0804cb9f in strcpy ()
#6  0x0804b722 in strcpy ()
#7  0x08049f75 in strcpy ()
#8  0x42015574 in __libc_start_main () from /lib/tls/libc.so.6
(gdb)

awk (/tmp/awk_cmdl contains arbitrary awk commands):
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(no debugging symbols found)...
(gdb) Starting program: /bin/awk -f /tmp/awk_cmdl /etc/fstab
(no debugging symbols found)...(no debugging symbols found)...
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x76093cd0 in ?? ()
(gdb) #0  0x76093cd0 in ?? ()
#1  0x0806591d in do_input ()
#2  0x0806a4dd in main ()
#3  0x42015574 in __libc_start_main () from /lib/tls/libc.so.6
(gdb)
Comment 1 Bohdan Vlasyuk 2003-04-13 06:23:46 EDT
In case it helps, here is the mount backtrace both with and without NPTL:

Script started on Sun 13 Apr 2003 04:16:48 PM EEST
sh-2.05b# gdb mount <<< "r
bt"
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(no debugging symbols found)...
(gdb) Starting program: /bin/mount
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x4207a42b in strlen () from /lib/tls/libc.so.6
(gdb) #0  0x4207a42b in strlen () from /lib/tls/libc.so.6
#1  0x08051998 in error ()
#2  0x0804cb21 in strcpy ()
#3  0x42015574 in __libc_start_main () from /lib/tls/libc.so.6
(gdb) sh-2.05b# exit

Script done on Sun 13 Apr 2003 04:18:53 PM EEST

Script started on Sun 13 Apr 2003 04:21:06 PM EEST
sh-2.05b# export LD_ASSUME_KERNEL=2.4.1
sh-2.05b# gdb mount <<< "r
> bt"
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
(no debugging symbols found)...
(gdb) Starting program: /bin/mount
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x4009118b in strlen () from /lib/i686/libc.so.6
(gdb) #0  0x4009118b in strlen () from /lib/i686/libc.so.6
#1  0x08051998 in error ()
#2  0x0804cb21 in strcpy ()
#3  0x4002c8d7 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) sh-2.05b# exit

Script done on Sun 13 Apr 2003 04:21:47 PM EEST
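
The two backtraces above resolve libc from different directories (/lib/tls versus /lib/i686), which suggests that LD_ASSUME_KERNEL did take effect and the crash is independent of NPTL. A minimal sketch for checking the same thing without gdb follows. The helper name libc_path is hypothetical, and it assumes ldd prints lines of the form "libc.so.6 => /path/to/libc.so.6 (0x...)".

```shell
# Hypothetical helper: read `ldd` output on stdin and print the path
# that libc.so.* resolves to. Assumes the usual "name => path (addr)" layout.
libc_path() {
    while read -r name arrow path rest; do
        case "$name" in
            libc.so.*) printf '%s\n' "$path" ;;
        esac
    done
}
```

Usage, on a system whose glibc honors LD_ASSUME_KERNEL: compare `ldd /bin/mount | libc_path` with `LD_ASSUME_KERNEL=2.4.1 ldd /bin/mount | libc_path`. Different paths mean the variable switched the libc variant.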
Comment 2 Ulrich Drepper 2003-04-17 14:56:27 EDT
If this were a generic problem, nothing would work at all.

Did you reinstall the entire system, or did you update?  If the latter, try a
fresh install first.  There is no way for me to guess what the problem is since
the backtraces are useless.  Try to do some debugging yourself.  Without more
information I will have to close the bug as WORKSFORME.
Comment 3 Bohdan Vlasyuk 2003-04-18 03:51:08 EDT
it was an upgrade, indeed.

i'm not really interested in a fresh install, but i could
try some debugging.

could you suggest some starting points for the debugging?
how can i trace the cause of the sigsegvs? it sounds like memory
corruption (?), notice the strange awk behavior. what other information
would be useful? debuginfo could probably help, although i'm not sure
where i can find debuginfo for glibc and awk.

waiting for comments.
Comment 4 Ulrich Drepper 2003-04-22 04:39:46 EDT
There is not much I can say.  First try to find the exact call path.  Does the
program reach main()?  If yes, where does it really stop (strcpy called from
__libc_start_main cannot be right, that function isn't called in
__libc_start_main).  If necessary, single-step at the asm level up to the
place where it crashes.
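
One way to follow this advice, as a rough sketch and not a definitive recipe: the hypothetical helper below runs a program under gdb in batch mode, stops at main() if that symbol can be resolved, prints the backtrace and loaded libraries, and then executes a single instruction with the current instruction displayed. It assumes a gdb that accepts -batch, -ex and --args; the name trace_to_main is made up for illustration.

```shell
# Hypothetical helper: run a program under gdb and report whether main()
# is reached. "break main" only resolves if the binary still carries a
# symbol for main; otherwise the run simply continues to the crash.
trace_to_main() {
    gdb -batch \
        -ex 'break main' \
        -ex 'run' \
        -ex 'bt' \
        -ex 'info sharedlibrary' \
        -ex 'display/i $pc' \
        -ex 'stepi' \
        --args "$@"
}
```

Usage: `trace_to_main /bin/awk -f /tmp/awk_cmdl /etc/fstab`. If the breakpoint is hit, the program does reach main() and the fault lies later; repeating stepi (or nexti) in an interactive session then narrows down the crashing instruction.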
Comment 5 Michael Young 2003-04-25 09:55:22 EDT
If I were debugging this, I would check the integrity of the glibc
packages, assuming the necessary programs don't segfault, by running:
rpm -qa | grep glibc     # to check for multiple packages, and that the 
                         # glibc and glibc-common versions are the same.
rpm -V glibc glibc-common mount      # Do the glibc and mount files match the
                                     # checksums from the packages ?
ldd /bin/mount        # Are we getting the libc library from the right place?

Also if you think there may be memory problems, look at the memtest86 utility
available from http://www.memtest86.com/ to check for problems.
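
The rpm -V check above can be narrowed to the files whose contents actually changed. The sketch below is an illustration under stated assumptions: each rpm -V failure line starts with a column of attribute flags in which a "5" in the third position marks a digest (MD5) mismatch, and the path is the last whitespace-separated field. The helper name digest_mismatches is hypothetical, and it deliberately avoids awk, since awk may be one of the broken binaries.

```shell
# Hypothetical helper: read `rpm -V` output on stdin and print the paths
# whose content digest differs from the package database.
# Limitation: paths containing spaces are not handled.
digest_mismatches() {
    while read -r flags rest; do
        case "$flags" in
            # third flag character "5" = digest mismatch
            ??5*) printf '%s\n' "${rest##* }" ;;
        esac
    done
}
```

Usage: `rpm -V glibc glibc-common mount gawk | digest_mismatches`. For each path printed, `rpm -qf PATH` names the owning package, which can then be reinstalled from the installation media.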
Comment 6 Bohdan Vlasyuk 2003-04-26 12:32:55 EDT
i've got it up and running [up and crashing ;-)] in vmware now, so i shall
try to debug it a bit further.

rpm -qa|grep glibc shows 2.3.2-27.9 for both plain glibc and glibc-common.
rpm -V mount gawk gives the solution! both the mount and awk executables fail
the md5sum check.

that seems to be the cause of the problem.
i'll probably check it further later and report here.

i guess now you can close the bug.
Comment 7 Ulrich Drepper 2003-10-03 04:41:11 EDT
User bug, broken binaries.
