Bug 141140 - xfs stops occasionally under heavy load
Status: CLOSED WORKSFORME
Product: Fedora
Classification: Fedora
Component: xorg-x11
Version: 3
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: X/OpenGL Maintenance List
QA Contact: David Lawrence
Depends On:
Blocks: FC5Target
Reported: 2004-11-29 11:26 EST by Matthias Saou
Modified: 2007-11-30 17:10 EST (History)

Doc Type: Bug Fix
Last Closed: 2005-05-16 17:10:20 EDT

Attachments
strace of xfs from the start to the stop (278.04 KB, text/plain)
2004-11-29 11:28 EST, Matthias Saou
Description Matthias Saou 2004-11-29 11:26:32 EST
Description of problem:
From time to time, (nearly?) always when my computer is under heavy
load, xfs stops.

Version-Release number of selected component (if applicable):
xorg-x11-xfs 6.8.1-12.FC3.1

How reproducible:
Occasionally.

Actual results:
The GNOME desktop gets really slow, things appear hung when they
aren't. New applications refuse to launch with a message about not
being able to open the "fixed" font, and all gtk1 applications seem to
change their font to an ugly fixed-size one (like xscreensaver's
password dialog).

Expected results:
The xfs service shouldn't stop or crash!

Additional info:
If xfs is re-started, it helps all currently running applications
behave normally again, but new applications still can't be run. I know
other users who have suffered from this problem, and I have been seeing
it since Fedora Core 2 myself.
Comment 1 Matthias Saou 2004-11-29 11:28:26 EST
Created attachment 107547
strace of xfs from the start to the stop

This strace output shows a run of xfs up until it exited as described in the
bug report.
Comment 2 Matthias Saou 2005-01-05 12:52:44 EST
I'm still seeing this issue on my laptop, mostly when I'm rebuilding
packages (thus using lots of memory and CPU) while using GNOME as always.

Is there anything more I can do to track down this problem?
Comment 3 Mike A. Harris 2005-02-01 00:59:42 EST
Please attach a gdb backtrace of xfs failing.

Setting status to "NEEDINFO", awaiting backtrace.
Comment 4 tracy smith 2005-02-28 12:57:49 EST
Hi.  I am also having this problem.  I'm running Fedora Core 1.  My 
problem happens in several different ways.  If I'm logged in
continuously for 3 or so days, then the xfs will crash and
applications  like xpdf will refuse to start with font errors.  Many
other applications will also refuse to start, like acroread.  Doing a
restart of xfs in /etc/init.d does not change this.  I have to reboot
the computer for the apps to resume working.  I first noticed this
problem with emacs.  I can no longer run both emacs and xemacs; I can
only run one or the other.  If I run emacs, then xemacs will core dump.  If I
then restart xfs, then I can get xemacs to run (most of the time), but
then emacs will not run (core dump).  

My X-server is ok, and my browser (firefox) doesn't seem to be
affected.  I have never had this problem before in any version of 
redhat running on this machine since redhat 8.0.  
Comment 5 Mike A. Harris 2005-03-22 10:15:14 EST
Please attach a gdb backtrace of xfs failing.

Setting status to "NEEDINFO", awaiting backtrace.

Comment 6 Søren Sandmann Pedersen 2005-03-22 11:13:25 EST
The way to generate such a stacktrace is to:

1. Make sure xfs is running OK
2. As root, run:
         
          gdb --pid `pidof gdb`

3. Do whatever you do to make xfs crash
4. Type "bt" into gdb, which should by now report that xfs has crashed
5. Attach the output to this bug.
Comment 7 Søren Sandmann Pedersen 2005-03-22 11:15:37 EST
     gdb --pid `/sbin/pidof xfs`

of course, not "pidof gdb".
Comment 8 tracy smith 2005-03-22 13:04:14 EST
Here is a backtrace of xfs failure (as per above instructions) after trying 
to run emacs:

[root@arwen init.d]# gdb --pid `/sbin/pidof xfs`
GNU gdb Red Hat Linux (5.3.90-0.20030710.41rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
Attaching to process 3734
Reading symbols from /usr/X11R6/bin/xfs...(no debugging symbols found)...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /usr/X11R6/lib/libXfont.so.1...done.
Loaded symbols for /usr/X11R6/lib/libXfont.so.1
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_nisplus.so.2...done.
Loaded symbols for /lib/libnss_nisplus.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
0x0093bc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0  0x0093bc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00f092ad in ___newselect_nocancel () from /lib/tls/libc.so.6
#2  0x08055258 in NameForAtom ()
#3  0x0804a288 in ?? ()
#4  0xbfef6f34 in ?? ()
#5  0x00000001 in ?? ()
#6  0x095c9f0c in ?? ()
(gdb)
Comment 9 Mike A. Harris 2005-03-22 13:11:12 EST
Can you rebuild xfs from source with debug symbols?  Alternatively,
rebuild the entire xorg-x11 rpm with DebuggableBuild enabled, so a full
traceback can be obtained.

I've added Jakub and Uli to CC for comment also.
Comment 10 tracy smith 2005-03-22 14:15:44 EST
I'm not a newbie, and have experience recompiling my kernel and upgrading 
X11, but I don't have any experience rebuilding xfs or my x11 stuff.  I don't 
want to screw up my system right now, as I have some major work deadlines coming 
up.  If you can provide simple instructions so that I won't ruin xfs, I'll try 
to rebuild it.  I have this version of xfs:

Version:      @(#) /etc/init.d/xfs 2.0

Otherwise, I'll be more flexible in a couple of weeks.  I'm paranoid about 
dependencies and have had some painful x11/nvidia-driver lessons.  Sorry to 
be a weenie.  ;)
Comment 11 Søren Sandmann Pedersen 2005-03-23 17:29:12 EST
Actually, the instructions are not right. There is a missing step 2.5:

    2.5. Type "cont <return>" into gdb

Sorry about that.
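
Putting Comments 6, 7 and 11 together, the corrected sequence (attach, "cont", reproduce, "bt") can be sketched as a small script. This is only an illustration, not something posted in this report: the helper name write_bt_commands and the file names under /tmp are made up for the example, and it relies on gdb's -batch and -x options to run the commands non-interactively.

```shell
#!/bin/sh
# Sketch only: collect the gdb commands from steps 2.5 and 4 in a file.
# "continue" resumes xfs after the attach (step 2.5); "bt" runs once gdb
# regains control, i.e. when xfs stops on a signal or exits (step 4).
# write_bt_commands is a hypothetical helper name.
write_bt_commands() {
    printf '%s\n' 'continue' 'bt' > "$1"
}

# Usage (as root, while xfs is still running OK):
#   write_bt_commands /tmp/xfs-bt.gdb
#   gdb --pid "$(/sbin/pidof xfs)" -batch -x /tmp/xfs-bt.gdb > /tmp/xfs-bt.log 2>&1
# Then do whatever makes xfs stop, and attach /tmp/xfs-bt.log to the bug.
```

Running gdb in batch mode keeps the whole session in one log file, which is easier to attach than a copied terminal session.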

Comment 12 tracy smith 2005-03-24 11:08:34 EST
Ok, here is the sequence amended with step '2.5' above for an xfs failure 
caused by running xemacs:


[root@arwen tsmith]# gdb --pid `/sbin/pidof xfs`
GNU gdb Red Hat Linux (5.3.90-0.20030710.41rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
Attaching to process 3122
Reading symbols from /usr/X11R6/bin/xfs...(no debugging symbols found)...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /usr/X11R6/lib/libXfont.so.1...done.
Loaded symbols for /usr/X11R6/lib/libXfont.so.1
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_nisplus.so.2...done.
Loaded symbols for /lib/libnss_nisplus.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
0x006a0c32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) cont
Continuing.

Program exited with code 0177.
(gdb) bt
No stack.
(gdb)
Comment 13 Mike A. Harris 2005-04-10 22:24:22 EDT
There is no backtrace.
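
A note on the session in Comment 12: gdb prints the exit status in octal, so "code 0177" is 127 in decimal. xfs ended by exiting rather than by stopping on a signal, which is why "bt" found no stack. One possible way to still capture a backtrace, which was not tried in this report, is to set breakpoints on exit and _exit before continuing. The sketch below assumes the libc symbols resolve (the session above shows them being loaded); the helper names and the /tmp path are hypothetical.

```shell
#!/bin/sh
# gdb shows exit codes in octal; printf converts a leading-zero
# constant to decimal.
exit_code_decimal() {
    printf '%d\n' "$1"
}
exit_code_decimal 0177   # prints 127

# Sketch only: stop xfs inside exit()/_exit() so that "bt" still has a
# stack to show. write_exit_bt_commands is a hypothetical helper name.
write_exit_bt_commands() {
    printf '%s\n' 'break exit' 'break _exit' 'continue' 'bt' > "$1"
}

# Usage (as root, while xfs is still running):
#   write_exit_bt_commands /tmp/xfs-exit.gdb
#   gdb --pid "$(/sbin/pidof xfs)" -batch -x /tmp/xfs-exit.gdb
```

Without debugging symbols for xfs itself, the frames above the libc call would likely still show as "??", so this complements a debug rebuild rather than replacing it.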
Comment 14 Mike A. Harris 2005-04-10 22:26:28 EDT
Please report this problem in X.Org bugzilla located at
http://bugs.freedesktop.org in the "xorg" component.  Once you've filed
a report there, update this report with the X.Org bug report URL, and we
will track the issue in the X.Org bugzilla.

Comment 15 Mike A. Harris 2005-04-10 22:29:58 EDT
Setting status to "NEEDINFO", awaiting upstream bug report URL
for tracking.
Comment 16 Mike A. Harris 2005-05-16 17:10:20 EDT
The information above does not yield specific steps with which
this problem can be reproduced.  Since the problem has only been
reported by one person, it seems it is not affecting the majority
of the userbase.

This means it is probably either a local issue, or an obscure bug
in xfs which is hard to trigger and/or requires a specific
configuration or specific set of fonts to be installed, and a
specific set of events to happen.  It is also possible that, if this
only happens when the machine is under very heavy load, the machine
is running out of memory and the kernel OOM reaper is killing off
processes.  If the OOM reaper kills xfs, then this problem would
manifest.  I believe your /var/log/messages would contain kernel log
messages about which processes have been killed by the OOM reaper if
this is what is occurring.
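
The /var/log/messages check described above could be sketched as follows. The search patterns are assumptions: the exact wording of the kernel's out-of-memory messages differs between kernel versions, so the match is kept loose and case-insensitive. The function name find_oom_kills is made up for this example.

```shell
#!/bin/sh
# Sketch only: print log lines that look like kernel OOM activity.
# The patterns are assumptions; adjust them to what your kernel logs.
find_oom_kills() {
    grep -iE 'out of memory|oom-killer|oom_kill' "$1"
}

# Usage (as root, since /var/log/messages is usually not world-readable):
#   find_oom_kills /var/log/messages | grep -w xfs
```

If this prints lines naming xfs with timestamps close to the time xfs stopped, that would support the out-of-memory explanation; no output does not rule it out, since the log may have been rotated.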

However, since we cannot reproduce the problem from the information
above, and nobody has yet been able to provide us with a proper
backtrace with debugging symbols, there is nothing we can do about
this problem at this time.

I'm setting the status of this to "WORKSFORME", however if someone who
can reproduce this problem can either provide us with detailed steps
to reproduce this 100% of the time so we can set up a test machine
to reproduce and debug the problem, or someone can rebuild xfs and
its libraries with debugging information, and generate a proper
backtrace and attach it to the bug report, (preferably both), then at
that time please feel free to reopen the bug report once the information
is attached/updated.

Alternatively, you can file a bug report for the issue in X.Org bugzilla
at http://bugs.freedesktop.org in the "xorg" component, and paste the
URL here in this bug report and Red Hat will continue to track the issue
in the centralized X.Org bug tracker, and will review any bug fixes that
become available for consideration in future updates.

Setting status to "WORKSFORME" (unable to reproduce under any circumstances
so far).
