Bug 98893 - top segmentation faults under heavy load
Status: CLOSED RAWHIDE
Product: Red Hat Linux
Classification: Retired
Component: procps
Version: 7.3
Hardware: i686 Linux
Priority: medium Severity: medium
Assigned To: Daniel Walsh
Reported: 2003-07-09 17:58 EDT by gerry.morong
Modified: 2007-04-18 12:55 EDT (History)
Doc Type: Bug Fix
Last Closed: 2004-03-29 10:23:43 EST
Description gerry.morong 2003-07-09 17:58:23 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; compaq)

Description of problem:
After upgrading the kernel from 2.4.18-24.7.smp to 2.4.20-18.7.smp, top will 
segmentation fault after several hours on a loaded system.

Version-Release number of selected component (if applicable):
2.0.7-12

How reproducible:
Always

Steps to Reproduce:
1. Load up a 4-CPU system to achieve a load average of about 4.0
2. Run top with no options
3. Leave it running for several hours until a segmentation fault is reported.
    

Actual Results:  Top stops with a segmentation fault

Expected Results:  Top should continue running

Additional info:
Comment 1 Alexander Larsson 2003-07-10 06:35:14 EDT
I can't debug this without a backtrace or at least a strace of the crash.
However, the segfault is very likely to have been fixed already in later
versions of procps. Can you try the one in RHL9 (2.0.11-6) or the current
rawhide version and see if either fixes the problem?
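
One way to make sure a crash that takes hours to appear leaves something to debug is to start top with core dumps enabled, much like raising the limit with the shell's `ulimit -c` first. The following is only a sketch under assumptions not stated in this report: it assumes Python 3 is available on the machine, and the helper names `_allow_core_dumps` and `run_until_exit` are hypothetical. Where the core file ends up depends on the kernel's core dump settings.

```python
import resource
import signal
import subprocess
from typing import List, Optional


def _allow_core_dumps() -> None:
    # Runs in the child just before exec: raise the soft core-size limit
    # to whatever the hard limit permits.
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_until_exit(command: List[str]) -> Optional[signal.Signals]:
    """Run command; return the signal that killed it, or None on a normal exit."""
    result = subprocess.run(command, preexec_fn=_allow_core_dumps)
    if result.returncode < 0:
        # subprocess reports death by signal N as a return code of -N.
        return signal.Signals(-result.returncode)
    return None
```

Calling `run_until_exit(["top"])` from a terminal and leaving it running would return `signal.SIGSEGV` if top dies the way this report describes; any core file produced can then be opened with `gdb top core` and inspected with `where`.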
Comment 2 gerry.morong 2003-07-10 16:05:59 EDT
Rebuilt 2.0.11 from source (the binary package expected a newer 
glibc).  I should know something by Friday (7/11/2003).
Comment 3 gerry.morong 2003-07-21 14:49:24 EDT
Died again.

Core was generated by `top'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libproc.so.2.0.11...(no debugging symbols found)...
done.
Loaded symbols for /lib/libproc.so.2.0.11
Reading symbols from /usr/lib/libncurses.so.5...(no debugging symbols found)...
done.
Loaded symbols for /usr/lib/libncurses.so.5
Reading symbols from /lib/i686/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/i686/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x40026eec in stat2proc () from /lib/libproc.so.2.0.11
(gdb) where
#0  0x40026eec in stat2proc () from /lib/libproc.so.2.0.11
#1  0x40027532 in readproc () from /lib/libproc.so.2.0.11
#2  0x0804f276 in readproctab2 ()
#3  0x0804bf92 in show_procs ()
#4  0x0804a1f4 in main ()
#5  0x42017589 in __libc_start_main () from /lib/i686/libc.so.6
(gdb) 
Comment 4 Alexander Larsson 2003-08-06 04:49:11 EDT
Interesting. It seems to die parsing /proc/$pid/stat. I haven't seen that
before. Unfortunately there is no debug information in the package you built, so
it's a bit hard to figure out exactly what happened.
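
For readers unfamiliar with why parsing /proc/$pid/stat is delicate, here is a small illustration. It is not the procps code (procps is written in C) and it does not claim to identify the cause of this crash, which the backtrace alone does not reveal. It is a sketch, assuming Python 3, of two general hazards: the command name in parentheses may itself contain spaces or `)`, and the file can vanish or come back short if the process exits mid-read. The names `parse_stat_line` and `read_stat` are hypothetical.

```python
from typing import List, Optional, Tuple


def parse_stat_line(line: str) -> Optional[Tuple[int, str, str, List[str]]]:
    """Split one /proc/<pid>/stat line into (pid, comm, state, other fields).

    Returns None instead of raising when the line is empty or truncated.
    """
    open_paren = line.find("(")
    close_paren = line.rfind(")")  # comm may itself contain ')' or spaces
    if open_paren < 0 or close_paren < open_paren:
        return None
    pid_text = line[:open_paren].strip()
    if not pid_text.isdigit():
        return None
    rest = line[close_paren + 1:].split()
    if not rest:
        return None  # nothing after comm: the read was cut short
    return int(pid_text), line[open_paren + 1:close_paren], rest[0], rest[1:]


def read_stat(pid: int) -> Optional[Tuple[int, str, str, List[str]]]:
    """Read and parse /proc/<pid>/stat; None if the process is already gone."""
    try:
        with open(f"/proc/{pid}/stat", "r", errors="replace") as handle:
            return parse_stat_line(handle.read())
    except OSError:
        return None
```

A parser that instead assumes a fixed layout, or that never checks for a short read, can walk past the end of its buffer, which is one general way a reader of this file might end up with a segmentation fault.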

Could you try building a version with debug information and try again?
Note that the makefiles for procps manually strip the binaries on install, so I
have this patch in recent rpms:

--- procps-2.0.11/Makefile.dontstrip	2002-12-04 21:49:07.000000000 +0100
+++ procps-2.0.11/Makefile	2003-01-21 10:53:40.000000000 +0100
@@ -14,7 +14,7 @@
 export USRBINDIR  =  $(DESTDIR)/usr/bin
 export PROCDIR    =  $(DESTDIR)/usr/bin# /usr/proc/bin for Solaris devotees
 export OWNERGROUP =  --owner 0 --group 0
-export INSTALLBIN =  install --strip
+export INSTALLBIN =  install
 export INSTALLLIB =  install
 export INSTALLSCT =  install
 export INSTALLMAN =  install --mode a=r
Comment 5 Alexander Larsson 2004-02-05 05:23:55 EST
Mass reassign to new owner
Comment 6 Daniel Walsh 2004-03-28 21:26:18 EST
Could you check to see if this problem is fixed by version 3.1.15 or
later?

Comment 7 gerry.morong 2004-03-29 10:09:08 EST
Daniel,

I have been reassigned and am no longer tied to the IT group where I 
reported this issue.  Since I was the only one tracking this issue, I 
would just close it.

Gerry
