Bug 120065 - X server hangs in mutex
Product: Fedora
Classification: Fedora
Component: XFree86
Hardware: i686 Linux
Priority: medium  Severity: medium
Assigned To: X/OpenGL Maintenance List
David Lawrence
Reported: 2004-04-05 15:14 EDT by Steve Knodle
Modified: 2007-11-30 17:10 EST
Doc Type: Bug Fix
Last Closed: 2004-10-12 14:14:28 EDT

Attachments
combined strace, and gdb output, and XFree86.0.log (51.18 KB, text/plain)
2004-04-05 15:16 EDT, Steve Knodle
Description Steve Knodle 2004-04-05 15:14:33 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040312

Description of problem:
The X server hangs in a mutex wait daily at around 4am.
The X server log indicates a crash of some sort, but the X server
is still spinning in the mutex wait.

XFree log, "strace -p PID" output, and "gdb -p PID" output
are attached.
Note: my machine has an nvidia card.  This has also happened to one
of my coworkers, who also has an nvidia card.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. This happens regularly at about the same time of day.

Actual Results:  X screen has died, and the system is at the console login
prompt on pty1.
"ps ax" shows the X process still running.
Killing the primary gdm-binary process restarts everything correctly.

Additional info:

# strace -p PID
Process 5062 attached - interrupt to quit
futex(0xd20760, FUTEX_WAIT, 2, NULL)    = -1 EINTR (Interrupted system
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [SEGV IO])
futex(0xd20760, FUTEX_WAIT, 2, NULL)    = -1 EINTR (Interrupted system
--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn()                             = ? (mask now [SEGV IO])
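
For readers less familiar with futex traces, here is a minimal sketch of how the strace lines above can be read. FUTEX_WAIT blocks while the word at the given address still holds the given value, and EINTR means the wait was interrupted by a signal (SIGALRM here) and then retried. In the lock convention used by glibc's NPTL low-level locks, as far as I understand it, a value of 2 means "locked, possibly with waiters". The helper names (parse_futex_line, same_lock) are made up for this illustration and are not part of strace or any library.

```python
import re

# Matches lines such as:
#   futex(0xd20760, FUTEX_WAIT, 2, NULL)    = -1 EINTR (Interrupted system
FUTEX_RE = re.compile(
    r"futex\((0x[0-9a-f]+), (FUTEX_\w+), (\d+), (\w+)\)\s*=\s*(-?\d+)(?: (\w+))?"
)

def parse_futex_line(line):
    """Return the fields of one strace futex() line, or None if it does not match."""
    m = FUTEX_RE.search(line)
    if m is None:
        return None
    addr, op, val, timeout, ret, errno_name = m.groups()
    return {
        "addr": int(addr, 16),   # address of the lock word
        "op": op,                # e.g. FUTEX_WAIT
        "val": int(val),         # value the lock word is expected to hold
        "timeout": timeout,      # NULL means wait indefinitely
        "ret": int(ret),
        "errno": errno_name,     # e.g. EINTR, or None on success
    }

def same_lock(lines):
    """True if every futex wait in `lines` targets one and the same address."""
    addrs = {p["addr"] for p in map(parse_futex_line, lines) if p is not None}
    return len(addrs) == 1

trace = [
    "futex(0xd20760, FUTEX_WAIT, 2, NULL)    = -1 EINTR (Interrupted system",
    "--- SIGALRM (Alarm clock) @ 0 (0) ---",
    "sigreturn()                             = ? (mask now [SEGV IO])",
    "futex(0xd20760, FUTEX_WAIT, 2, NULL)    = -1 EINTR (Interrupted system",
]
first = parse_futex_line(trace[0])
```

Applied to the trace above, this shows the process waiting on a single lock word with no timeout, being woken only by the periodic SIGALRM, and going straight back to waiting.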

# gdb -p PID
(gdb) where
#0  0x00bf07a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00cd004e in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2  0x00c6c710 in _L_mutex_lock_7373 () from /lib/tls/libc.so.6
#3  0x08fcc480 in ?? ()
#4  0x00000000 in ?? ()
(gdb) quit
Comment 1 Steve Knodle 2004-04-05 15:16:26 EDT
Created attachment 99120 [details]
combined strace, and gdb output, and XFree86.0.log
Comment 2 Mike A. Harris 2004-04-06 06:50:35 EDT
You cannot debug the X server with gdb, or strace it, from a terminal
running inside the X server you are debugging.  That will never
work, and is not a bug.

The only way to strace or debug the X server is to use 2 computers
connected via ethernet, serial cable, or similar, and to debug
the X server from a remote shell on the computer running the X server.
Comment 3 Steve Knodle 2004-04-06 12:02:36 EDT
Comment #2 is entirely irrelevant to the situation described.
The X server is waiting for a mutex apparently held by a
process/thread that has either died, or forgotten it held the mutex lock.
No commands were running from terminal sessions run by the
X server at the times of the problem.  (I am NOT at work
at 4am.  Sorry, my body is not 24/7.)  Very likely the problem
is cron related, but how?
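
One other possibility that would fit the trace, offered purely as an assumption and not confirmed anywhere in this report: the lock is not held by a dead process at all, but by the X server itself, in code that was interrupted by a signal whose handler then tries to take the same non-recursive lock. The sketch below imitates that pattern in Python. A threading.Lock stands in for a libc-internal lock, SIGALRM is used only as a convenient interrupting signal, and the handler uses a timeout so the demonstration finishes, where a real handler would wait forever. All names are hypothetical.

```python
import signal
import threading
import time

lock = threading.Lock()      # stand-in for a non-recursive libc-internal lock
handler_got_lock = None      # set by the handler: True, False, or None if it never ran

def on_alarm(signum, frame):
    """Signal handler that needs the same lock the interrupted code holds."""
    global handler_got_lock
    # A real handler would block here indefinitely; the timeout keeps the demo finite.
    handler_got_lock = lock.acquire(timeout=0.2)
    if handler_got_lock:
        lock.release()

def run_demo():
    """Hold the lock, get interrupted, and report whether the handler could lock."""
    old = signal.signal(signal.SIGALRM, on_alarm)
    try:
        with lock:                                  # main code holds the lock
            signal.setitimer(signal.ITIMER_REAL, 0.05)
            deadline = time.monotonic() + 2.0
            while handler_got_lock is None and time.monotonic() < deadline:
                time.sleep(0.01)                    # handler runs during this loop
    finally:
        signal.signal(signal.SIGALRM, old)          # restore the previous handler
    return handler_got_lock

result = run_demo()
```

Here the handler cannot obtain the lock because its owner is the very code it interrupted, which cannot resume until the handler returns. If something like this is what happened, no amount of kernel cleanup would help, since the owner is still alive.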

My goal is to:
1. Find out which process/thread holds the lock.
2. Find out why it died.
3. Find out why the kernel didn't clean up the lock if the process died.
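
As a starting point for goal 1, the following is a rough sketch that lists every thread of a process together with its state, read from /proc. It assumes a Linux kernel that exposes /proc/PID/task, and the function name thread_states is invented for this example. It cannot say who owns the lock, since the futex wait itself does not record an owner, but it does show whether the X server has any other threads that could be holding it.

```python
import os

def thread_states(pid):
    """Return {tid: (name, state)} for every thread of `pid`, read from /proc."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(f"{task_dir}/{tid}/status") as f:
                # Each line looks like "Name:\tX" or "State:\tS (sleeping)".
                fields = dict(
                    line.rstrip("\n").split(":\t", 1) for line in f if ":\t" in line
                )
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were looking
        result[int(tid)] = (fields["Name"].strip(), fields["State"].strip())
    return result

# Example: inspect the current process (replace with the X server's PID).
states = thread_states(os.getpid())
for tid, (name, state) in states.items():
    print(tid, name, state)
```

If the hung server turns out to have only a single thread, the "other thread died holding the lock" theory becomes less likely and the lock owner has to be looked for elsewhere.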

The "strace" and "gdb" commands described here were run from a 
different virtual terminal. Even gdb works fine, as long as you
don't stop the X server at a breakpoint and then hot-key to the
X server's window :-)
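
To avoid ever leaving the server stopped at an interactive prompt, the backtrace can also be captured non-interactively. This is a small sketch, assuming gdb is installed and that the script runs as a user allowed to attach to the X server; the function names are made up here. It relies only on gdb's -p, -batch, and -ex options, and can be run from another virtual terminal or over ssh.

```python
import subprocess

def gdb_backtrace_cmd(pid):
    """Build a gdb command line that attaches, dumps all backtraces, and detaches."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]

def capture_backtrace(pid, timeout=30):
    """Run gdb in batch mode against `pid` and return whatever it printed."""
    proc = subprocess.run(
        gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        timeout=timeout,   # give up rather than hang if gdb itself gets stuck
    )
    return proc.stdout

# Example (5062 is the PID from the strace output above):
cmd = gdb_backtrace_cmd(5062)
```

Because gdb exits as soon as the commands finish, the server is only paused for the duration of the backtrace.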
Comment 4 Steve Knodle 2004-04-07 12:18:47 EDT
To preclude confusion:
My X server was again hung when I came in this morning, as was
my colleague's.
This time I ssh'd in from another machine, ran strace and gdb
remotely, as described in comment#2.
The X server was once again waiting in a mutex, and the output was the
same as in the previous logs.
By the way, neither of us is using the binary nvidia driver.
Our installations are both pure test2.
Comment 5 Mike A. Harris 2004-10-12 14:14:28 EDT
We are unable to reproduce this problem in any OS release.  Users
who have experienced this problem are encouraged to upgrade to the
latest version of Fedora Core, which can be obtained from:


If this issue turns out to still be reproducible in the latest
version of Fedora Core, please file a bug report in the X.Org
bugzilla located at http://bugs.freedesktop.org in the "xorg"

Once you've filed your bug report to X.Org, if you paste the new
bug URL here, Red Hat will continue to track the issue in the
centralized X.Org bug tracker, and will review any bug fixes that
become available for consideration in future updates.
