Bug 120018 - X server hangs in mi module using nv driver
Summary: X server hangs in mi module using nv driver
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11
Version: rawhide
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: X/OpenGL Maintenance List
QA Contact: David Lawrence
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-04-05 08:25 UTC by David Fraser
Modified: 2007-11-30 22:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-21 09:18:51 UTC
Type: ---
Embargoed:



Description David Fraser 2004-04-05 08:25:42 UTC
Description of problem:
Every few hours on Fedora Core 2 Test 2, my X server hangs.
The mouse cursor can still be moved around the screen, but the
keyboard is unresponsive (pressing NumLock doesn't change the NumLock
status, Ctrl-Alt-F1 doesn't switch away from the X server, etc.).
I can still ssh into the machine, and all the programs, including X,
are still running.
Killing X, or switching to runlevel 4, does not restore the screen.
I can successfully restart the machine.
I couldn't find any apparently related messages in /var/log/messages or
XFree86.0.log.
I am running KDE.
It happens repeatedly (with both kernel 2.4.22 and 2.6.3).

Version-Release number of selected component (if applicable):
xorg-x11-0.0.6.6-0.0.2004_03_11.9
kernel-2.4.22-1.2140.nptl.caps.rhfc1.ccrma
kernel-2.6.3-2.1.253.2.1
kdebase-3.2.1-1.4

How reproducible:
Happens randomly, often when dragging with the mouse (but could be
coincidence)

Comment 1 David Fraser 2004-04-05 09:24:07 UTC
Sorry, it seems I had bad versions of libGLcore hanging around.
I have removed these and will reopen if that does not fix the problem.

Comment 2 lupus 2004-04-07 12:13:25 UTC
I have the same problem.

Is it libGLcore's fault?

If so, which version solves it?

Comment 3 lupus 2004-04-07 12:18:42 UTC
I have FC2 Test 2 running from an ISO install.

By the way, I am running GNOME.

Comment 4 David Fraser 2004-04-07 12:57:26 UTC
You can try to work out the problem by running:
ps -ef | grep X
to find the X process number.
Then run gdb and say
  attach 39483
(where 39483 is the process number).
That should list the modules that have been loaded.
You can also say bt to get a backtrace
(for example, mine gave
#0  0x08b257d1 in ?? ()
#1  0x08b2620d in ?? ()
#2  0x08ba5856 in ?? ()
#3  0x08ba5291 in ?? ()
#4  0x08ba6502 in ?? ()
#5  0x08ba5103 in ?? ()
#6  0x0817231a in miFillUniqueSpanGroup ()
#7  0x0816fcf4 in miCleanupSpanData ()
#8  0x0817149a in miWideDash ()
#9  0x0816ba27 in miPolyRectangle ()
#10 0x0819638c in miSpritePolyRectangle ()
#11 0x080be6b3 in ProcPolyRectangle ()
#12 0x080bb5d2 in Dispatch ()
#13 0x080ce36c in main ()
#14 0x00b01eb3 in __libc_start_main () from /lib/tls/libc.so.6
)
The problem did seem to improve when I removed the versions of
libGLcore I had (they were not Fedora versions), so check the paths
of all the libraries listed by gdb and make sure they are correct.
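
The same steps can be run non-interactively from an ssh session. This is a minimal sketch, not a tested recipe for this bug. It assumes a procps pgrep, a gdb recent enough to accept -p, -batch and -ex, and a server process named exactly X. The function name dump_x_backtrace is only illustrative.

```shell
#!/bin/sh
# Sketch only: dump the loaded libraries and a backtrace of a hung X server.
# Assumptions: the server process is named exactly "X", and pgrep and gdb are
# installed. dump_x_backtrace is a made-up helper name.

dump_x_backtrace() {
    # Use the PID given as $1, otherwise the oldest process named exactly "X".
    pid=${1:-$(pgrep -o -x X)}
    if [ -z "$pid" ]; then
        echo "no X process found" >&2
        return 1
    fi
    # -batch makes gdb exit (detaching from the server) after the -ex commands.
    gdb -batch -p "$pid" -ex "info sharedlibrary" -ex "bt"
}
```

Calling dump_x_backtrace with no argument looks up the PID itself; passing a PID (for example dump_x_backtrace 39483) skips the lookup. Attaching to the server normally requires root.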

Comment 5 David Fraser 2004-04-07 12:58:38 UTC
I removed the bad libraries and the crashes decreased.
However I did still have some problems, with the following backtrace:
#0  0x09a7a8c3 in ?? ()
#1  0x09ae58f0 in ?? ()
#2  0x09afcf5b in ?? ()
#3  0x080bf053 in ProcPutImage ()
#4  0x080bb5d2 in Dispatch ()
#5  0x080ce36c in main ()
#6  0x00b01eb3 in __libc_start_main () from /lib/tls/libc.so.6
I have now tried to upgrade to the latest builds of xorg from fedora,
and will report if I still have problems.

Comment 6 David Fraser 2004-04-07 14:19:27 UTC
I have now reproduced the same stack trace as in comment 4 twice by
trying to run ksnapshot and select a region of the screen.
It seemed to cause the crash when lots of other apps (konsole,
firefox, thunderbird, gimp) were open and the cursor was dragged round
the screen for a while when selecting the region.
So I am reopening this bug.

Comment 7 David Fraser 2004-04-13 12:19:53 UTC
More backtraces:
#0  0x09f078db in ?? ()
#1  0x09f72bcd in ?? ()
#2  0x08196180 in miSpritePolyPoint ()
#3  0x080be11f in ProcPolyPoint ()
#4  0x080bb5b2 in Dispatch ()
#5  0x080ce34c in main ()
#6  0x00b01eb3 in __libc_start_main () from /lib/tls/libc.so.6

#0  0x090638db in ?? ()
#1  0x09108cb5 in ?? ()
#2  0x081986c2 in miSpriteGlyphs ()
#3  0x08163c60 in CompositeGlyphs ()
#4  0x081660f8 in ProcRenderCompositeGlyphs ()
#5  0x0816709d in ProcRenderDispatch ()
#6  0x080bb5b2 in Dispatch ()
#7  0x080ce34c in main ()
#8  0x00b01eb3 in __libc_start_main () from /lib/tls/libc.so.6

Comment 8 yuval aviel 2004-04-16 07:21:45 UTC
I had the same problem.
Machine: laptop, HP Compaq nx9000.
Video card: ATI Radeon IGP
With FC2T2, X hung when starting an X session.
After updating packages with yum (13/4/2004), X hangs only when
shutting down.
Lately (15/4/04), I updated the packages again, and I'm back to the
old behavior: X hangs (sometimes) when KDE starts. Also, the USB
mouse no longer responds.

Comment 9 yuval aviel 2004-04-21 09:10:46 UTC
All X problems seem to be solved in the latest update (20/4).
I wonder what's next ;) ...

Comment 10 David Fraser 2004-05-03 15:35:02 UTC
Nope, I still get this using xorg-x11-6.7.0-0.5.
Interestingly, though, I now only get it using the nv module, not using
VESA. So I'll change the summary; maybe it's nv-specific.
I just tried now and was able to reproduce it within 2 minutes. Backtrace:
(gdb) bt
#0  0x08b47ab9 in ?? ()
#1  0x08b484f5 in ?? ()
#2  0x08bb7df6 in ?? ()
#3  0x08bb7831 in ?? ()
#4  0x08bb8aa2 in ?? ()
#5  0x08bb76a3 in ?? ()
#6  0x081723ea in miFillUniqueSpanGroup ()
#7  0x0816fdc4 in miCleanupSpanData ()
#8  0x0817156a in miWideDash ()
#9  0x0816bae7 in miPolyRectangle ()
#10 0x081964dc in miSpritePolyRectangle ()
#11 0x080be0c3 in ProcPolyRectangle ()
#12 0x080bafe2 in Dispatch ()
#13 0x080cddbc in main ()
#14 0x007a2f43 in __libc_start_main () from /lib/tls/libc.so.6


Comment 11 yuval aviel 2004-05-04 07:32:56 UTC
Do you have acpi=on, and are you using the KDE desktop?
It might be the same bug as
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=121510


Comment 12 Enrico Scholz 2004-05-04 07:44:02 UTC
Not here; ACPI is not compiled into the kernel and I am using Gnome
1.4 + sawfish.

But the 'nv' driver is used.

Comment 13 David Fraser 2004-05-04 07:54:15 UTC
I also don't have ACPI.
Enrico, have you tried getting a back trace of the hang?
Changing this to test3 as it definitely appears there

Comment 14 Enrico Scholz 2004-05-04 08:17:37 UTC
I do not get good backtraces; only '??'. X11 hangs in a small loop
(setting the tested registers to the same value let me move the
mouse cursor again, but then I get SIGPIPE and cannot recover) and it
consumes 100% of the CPU.

It happens on a machine which I see at the weekend only, so I can not
provide further details now.

Comment 15 Mike A. Harris 2004-05-04 18:54:24 UTC
gdb does not have the ability to debug a running X server.  This
is due to the X server's built in custom ELF loader mechanism.  In
order to debug the X server in a meaningful way, you need to use
a patched version of gdb which understands the X server ELF loader.

I have a very old version of gdb on ftp://people.redhat.com/mharris
which is patched to do this, but it probably doesn't work anymore
on current generation distributions.  The only way to debug is to
port the gdb patch forward, or to build a static server, or to have
fun with xf86msg() et al. inside the server or modules.

Debugging fun.  ;o)
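
Because the frames inside the loaded modules come out as ??, the traces in this report are easiest to compare by their innermost resolved frame. This is a small sketch using awk on a saved gdb backtrace. The function names are hypothetical, and the input is assumed to be in the "#N  0xADDR in func ()" form shown in the comments above.

```shell
#!/bin/sh
# Sketch only: summarise a gdb backtrace read from standard input.
# Both function names are illustrative, not part of gdb or X.

# Print the name of the innermost frame that gdb managed to resolve.
first_resolved_frame() {
    awk '/^#[0-9]+/ && $3 == "in" && $4 != "??" { print $4; exit }'
}

# Count the frames gdb could not resolve (shown as "??").
count_unresolved_frames() {
    awk '/^#[0-9]+/ && $3 == "in" && $4 == "??" { n++ } END { print n + 0 }'
}
```

Fed the backtrace from comment 4 (saved to a file of your choosing), first_resolved_frame prints miFillUniqueSpanGroup and count_unresolved_frames prints 6.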

Comment 16 David Fraser 2004-05-05 08:01:52 UTC
Yup, you're right, your patched gdb just crashes when trying to load X
(after saying it couldn't find symbols).
Any hints on how to build a static server? Or an rpm :-) ?

Comment 17 Thomas 2004-07-04 15:39:55 UTC
I have the same problem with xorg 6.7.0 (and also with XFree86 4.4.0) on
Slackware. It happens when I drag around in gqview. Annoying.

Comment 18 Thomas 2004-07-04 15:43:02 UTC
By the way, to those who say you have to reboot to fix the problem:
this is not true. You can run startx from ssh; then, if you stop X, you
get the console back.

Comment 19 Enrico Scholz 2004-09-07 09:51:00 UTC
FWIW, still with xorg-x11-6.7.99.903-2

And no, I can not test it with plain FC software ;)

Comment 20 Mike A. Harris 2004-09-07 22:44:24 UTC
Please report this issue in the upstream X.Org X11 project's
bugzilla, located at http://bugs.freedesktop.org in the "xorg"
component.  Once you've filed your report in X.Org bugzilla,
if you paste the X.org bug URL here, we will track the issue
upstream, and review any fixes Nvidia provides for consideration
in future updates of xorg-x11.

Thanks in advance.

Comment 21 Enrico Scholz 2004-09-07 23:00:17 UTC
http://freedesktop.org/bugzilla/show_bug.cgi?id=417 seems to describe
the issue already and I can not provide further details :(

Comment 22 Mike A. Harris 2004-09-21 09:18:15 UTC
Thanks for the info Enrico.  We'll track the issue in the upstream
X.Org bugzilla now.  Please handle any status updates and feedback
in the X.Org report.  If fixes become available, we will review
them for consideration in future updates.

Thanks again.

Setting status to "UPSTREAM" for tracking in X.Org bugzilla.

