Bug 446395 - X11 crashes when restarted
Summary: X11 crashes when restarted
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: xorg-x11-server
Version: 9
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Adam Jackson
QA Contact:
URL:
Whiteboard:
Duplicates: 446346
Depends On: 450389
Blocks:
 
Reported: 2008-05-14 13:37 UTC by Simon Andrews
Modified: 2009-07-14 17:40 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2009-07-14 17:40:42 UTC
Type: ---
Embargoed:


Attachments
Xorg log file (40.01 KB, text/plain)
2008-05-14 13:41 UTC, Simon Andrews
GDM greeter log (349 bytes, text/plain)
2008-05-14 13:42 UTC, Simon Andrews
X log from gdm (34.27 KB, text/plain)
2008-05-14 13:43 UTC, Simon Andrews
Log file from 2 consecutive runs of startx (5.23 KB, text/plain)
2008-05-15 10:32 UTC, Simon Andrews
Update log (3.82 KB, text/plain)
2008-05-22 10:26 UTC, Alec Leamas

Description Simon Andrews 2008-05-14 13:37:41 UTC
Description of problem:
There seems to be a generic problem with X crashing when it is restarted.  I
noticed this because gdm was crashing, and there are gdm bug reports which seem
to be related to this, but it appears to be a more general problem.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xorg-1.4.99.901-29.20080415.fc9.x86_64
xorg-x11-drv-nv-2.1.8-1.fc9.x86_64
gdm-2.22.0-1.fc9.x86_64

How reproducible:
Always


Steps to Reproduce:
1. Start an X server
2. Stop the X server
3. Start another one.
  
Actual results:
Screen freezes.  Computer is non-responsive.  Can't switch to virtual terminal,
can't kill X with Ctrl+Alt+Backspace

Expected results:
X starts again.

Additional info:
I first noticed this as gdm crashing on startup.  If I started in runlevel 3 and
started gdm it worked.  If I started in rl3 and did telinit 5 it worked.  If
however I did telinit 5, telinit 3, telinit 5 then the machine would freeze the
second time.  Other similar bugs have been reported (BZ#446307, BZ#446706).

I also found that if I took out rhgb then gdm wouldn't crash when starting.  I
also found that if I started in runlevel 3 then I could do startx and it would
work, but if I logged out of that session and did startx again it would freeze.

The only errors I've seen have been in the gdm logs, the X11 logs don't (to my
eyes) show anything unusual, but I'll attach them all.

From what I've seen the reports which show similar symptoms to this have all
been nvidia hardware (as is my machine).

I've tried wiping out my xorg.conf and reverting to the original gdm custom.conf
file but with no improvement.

Comment 1 Simon Andrews 2008-05-14 13:41:13 UTC
Created attachment 305358 [details]
Xorg log file

Comment 2 Simon Andrews 2008-05-14 13:42:05 UTC
Created attachment 305359 [details]
GDM greeter log

Comment 3 Simon Andrews 2008-05-14 13:43:52 UTC
Created attachment 305360 [details]
X log from gdm

Finishes with:

error setting MTRR (base = 0xd8000000, size = 0x08000000, type = 1) Invalid
argument (22)

..which is the only error I've been able to find.  If it's any help then
/proc/mtrr says:

reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0xd7f00000 (3455MB), size=   1MB: uncachable, count=1
reg02: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
reg03: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
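
One way to read the error against this table is to check which existing registers overlap the range X asked for. The following is a minimal sketch, not a diagnosis: it parses the /proc/mtrr lines quoted above and lists the registers that intersect base 0xd8000000, size 0x08000000 (type 1 in the Linux MTRR interface is write-combining). The function names are made up for illustration, and whether the overlap is what triggers "Invalid argument" is an assumption that the logs do not confirm.

```python
import re

# /proc/mtrr contents copied from the comment above.
PROC_MTRR = """\
reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0xd7f00000 (3455MB), size=   1MB: uncachable, count=1
reg02: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
reg03: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
"""

# Matches one register line; the size unit is assumed to be KB or MB.
LINE_RE = re.compile(
    r"^(reg\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KM])B: ([\w-]+), count=\d+$"
)
UNIT_BYTES = {"K": 1024, "M": 1024 * 1024}


def parse_mtrr(text):
    """Return a list of (name, start, end, type) with end exclusive."""
    regs = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a register line
        name, base_hex, size, unit, mem_type = match.groups()
        start = int(base_hex, 16)
        regs.append((name, start, start + int(size) * UNIT_BYTES[unit], mem_type))
    return regs


def overlapping(regs, base, size):
    """Return (name, type) for every register that intersects [base, base+size)."""
    end = base + size
    return [(name, mem_type) for name, start, stop, mem_type in regs
            if start < end and base < stop]


# Range taken from the "error setting MTRR" line in the X log.
hits = overlapping(parse_mtrr(PROC_MTRR), 0xD8000000, 0x08000000)
for name, mem_type in hits:
    print(name, mem_type)
```

On the table above this reports reg00 (the 64 GB write-back region) and reg02, which covers exactly the requested 128 MB range but is marked uncachable.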

Comment 4 Simon Andrews 2008-05-15 09:03:23 UTC
Another potentially useful bit of info is that when the machine freezes it seems
it's not just X which dies.  It's not responsive to pings, and if I have an
active ssh session open it hangs.  If I'm running top the last active process I
see is init (when I do init 5 to cause the hang).

Comment 5 Simon Andrews 2008-05-15 10:32:25 UTC
Created attachment 305462 [details]
Log file from 2 consecutive runs of startx

This log shows 2 runs of startx, the first of which succeeds and the second of
which hangs.  It was captured from an external connection, so it should
hopefully be complete.

I've also found that if I switch back to VC1 after issuing startx I can start
and stop it as many times as I like so long as I don't allow the display to
switch to the X-session.  As soon as I switch to VC7 the machine immediately
hangs.

Comment 6 Alec Leamas 2008-05-15 15:51:16 UTC
This might be related to the fact that even on FC8 I haven't been able to get the
nvidia 3D drivers to work - see http://www.nvnews.net/vbulletin/showthread.php?t=104229.
The crashes reported in that thread have identical, unpleasant symptoms.

Cross-referencing: Possibly related to bug 446346.

I think the severity of this bug should be increased - it's actually a crash
requiring a reboot, which IMHO should be at least "high".

Comment 7 Simon Andrews 2008-05-16 08:28:57 UTC
Although there may be some relation to the F8 problem, my machine was running F8
just fine before I upgraded, and had been doing so with both the nv and nvidia
drivers.

I've tried some other drivers for my machine with mixed results.

Using the framebuffer caused the same crashes as with nv.

Using the nouveau driver I can get X to be stable as long as I only use 800x600
resolution.  Above that I get the same freezes as with nv.  I tried reducing the
resolution with nv but it still froze.

Comment 8 Alec Leamas 2008-05-19 14:52:21 UTC
*** Bug 446346 has been marked as a duplicate of this bug. ***

Comment 9 Simon Andrews 2008-05-19 15:10:52 UTC
This bug should probably be moved out of the nv driver section, since although
it's only been reported against (I think) Nvidia hardware on x86_64 I can
reproduce the crash using either the framebuffer or vesa drivers as well.

Comment 10 Alec Leamas 2008-05-22 07:55:15 UTC
After updating (right now!) this seems to be fixed for me; I can do an init
3/init 5 cycle without any problems.

Comment 11 Simon Andrews 2008-05-22 08:49:39 UTC
I've just done an update and my machine still crashes on the second launch of an
X server in a session.

I didn't have any xorg packages in the update though.  Can you post a list of
the packages which were updated in order to fix this on your system please.

Comment 12 Alec Leamas 2008-05-22 10:02:43 UTC
Certainly, if I just had any idea how to get the list. I guess it's possible to
make an rpm query like "list all packages including some kind of date" and sort
it - but I don't know how to do it. Any hints out there?

Comment 13 Simon Andrews 2008-05-22 10:11:44 UTC
tail /var/log/yum.log

..list everything since the last time you know it failed.
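
If the log is long, the same filtering can be scripted. This is a minimal sketch, assuming the usual yum.log layout of "Mon DD HH:MM:SS Action: package" with no year in the timestamp (so the year has to be supplied). The sample lines are hypothetical: the timestamps and the example-* packages are invented, and only the mesa-libGL package name comes from this thread.

```python
from datetime import datetime

# Hypothetical sample in the assumed yum.log layout; not copied from a real log.
SAMPLE_LOG = """\
May 14 09:12:03 Installed: example-tool-1.0-1.fc9.x86_64
May 21 18:40:11 Updated: example-lib-2.3-4.fc9.x86_64
May 22 07:31:45 Updated: mesa-libGL-7.1-0.29.fc9.i386
"""


def entries_since(log_text, since, year):
    """Return (timestamp, action, package) for entries at or after `since`."""
    found = []
    for line in log_text.splitlines():
        parts = line.split(None, 4)
        if len(parts) != 5:
            continue  # not a "Mon DD HH:MM:SS Action: package" line
        month, day, clock, action, package = parts
        try:
            stamp = datetime.strptime(
                f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # unparseable timestamp, skip the line
        if stamp >= since:
            found.append((stamp, action.rstrip(":"), package))
    return found


# Everything logged since the day after the original report.
recent = entries_since(SAMPLE_LOG, datetime(2008, 5, 15), 2008)
for stamp, action, package in recent:
    print(stamp.isoformat(), action, package)
```

To run it against a real system, read /var/log/yum.log into a string and pass that instead of SAMPLE_LOG. For the rpm-based query asked about in the previous comment, `rpm -qa --last` lists installed packages ordered by install time.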

Comment 14 Alec Leamas 2008-05-22 10:26:14 UTC
Created attachment 306358 [details]
Update log

What's updated since last crash. Besides this, I have also messed around with
the startup priorities to resolve a hald issue, see bug 443602

Comment 15 Simon Andrews 2008-05-22 13:33:02 UTC
Looking down the list I don't see anything significantly different to what I
have.  You don't seem to have any updates I don't have.

Also, the only thing on that list which seems related to X is the
mesa-libGL-7.1-0.29.fc9.i386, and I have mesa-libGL installed already.

I also tried resetting the priorities as per your other bug even though my HAL
was OK - that didn't have any effect.

Truly curious...

Comment 16 Simon Andrews 2008-05-22 14:15:38 UTC
I thought I'd made some progress with this, but now I'm not sure.

I noticed that on the second launch of X it didn't die straight away.  The
hatched screen with the X cursor appeared and it was only after the window
manager launched that it died.

I'd also forgotten that this machine has a soundcard (which is never used).
When I turned the speakers on I found that the X session died as soon as the
startup sound started to play (so that the sound got stuck in a loop).

I've tried disabling sound altogether but it still crashes on the second launch.

I've also tried just running twm.  On the second launch of X this will start and
stay responsive, but as soon as you click the mouse to bring up the twm menu
then the whole system locks up.  It seems therefore that there's something about
the first program to run under X which kills it, rather than the launching of X
itself.

Comment 17 Simon Andrews 2008-05-22 15:03:04 UTC
Just to narrow this down to X I tried running from runlevel 1 and the crash
still happened, so I don't think it can be any of the accessory services, it
must be X itself.

Comment 18 Alec Leamas 2008-05-22 18:28:33 UTC
Sorry to say, but I was wrong thinking the bug was gone. I was fooled by the
fact that the first init3/init5 cycle after a reboot succeeds (this might very
well have been the situation since the beginning). However, a new init3/init5
sequence still exhibits the bug. :-(

As I stated earlier, I have a running kernel and virtual terminals after the
bug. If anyone wants to propose something meaningful to do, e.g. with gdb, I'm
willing to try. For the moment, I have no idea what to look for, though.

Comment 19 Simon Andrews 2008-05-29 10:36:19 UTC
I've just tried the newly released proprietary nvidia drivers (the Livna bundle)
on this machine and X works as it should again.  I'd still like to be able to
get this to work with the free drivers, but for now at least I've got a workaround.

Comment 20 Alec Leamas 2008-05-29 17:20:49 UTC
I have tried installing the livna drivers, and the bug is still there. Have
tried 1.5's new autoconfiguration, limiting the xorg.conf to a single "driver:
nvidia" option, but it's still the same.

Using the nvidia package I can also provoke a bug just by starting glxgears -
this tosses the machine into a completely unresponsive state roughly a second
after the glxgears window is started.

Had a glance at dmesg and /var/log/gdm/:0.log, but everything looks fine there.

Using nvidia-settings to disable-all state gives same result.

This is just so strange...

Comment 21 Alec Leamas 2008-06-03 09:10:38 UTC
Upgraded another box with nvidia graphics to FC9. This box does not exhibit this
bug. So it's either something with the sw configuration on my and possibly
Simon's computer, or the specific hw in use.

My failing card is an nVidia NV43 [GeForce 6600GT] according to hwbrowser.

Comment 22 Simon Andrews 2008-06-03 09:49:41 UTC
I too doubt that this is a very widespread issue otherwise there would have been
a lot more shouting about it.

My card is an nVidia NV43GL Quadro FX 550 (rev 2a).

Other things which might be relevant:

x86_64

4GB memory (may be relevant since I'm getting MTRR errors)

Comment 23 Alec Leamas 2008-06-03 10:06:29 UTC
I have 2GB of RAM and no MTRR errors.

Comment 24 Alec Leamas 2008-06-07 11:15:56 UTC
Updated to today's state hoping things would improve. They don't.

Digging a little more. There is a pulseaudio related process gconf-helper which
is dead (zombie) when the login screen hangs. This process is up & running when
the login greeter works.
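
A zombie like this can be spotted without ps by reading the state field in /proc. The following is a minimal sketch, assuming a Linux /proc where each /proc/<pid>/stat line starts with "pid (command) state"; the helper names are made up for illustration, and it only lists zombies, it says nothing about why the process died.

```python
import os


def parse_stat(stat_text):
    """Split a /proc/<pid>/stat line into (pid, command, state)."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the first "(" and the last ")".
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    command = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, command, state


def find_zombies(proc_root="/proc"):
    """Return (pid, command) for every process currently in state Z."""
    if not os.path.isdir(proc_root):
        return []  # no procfs available (for example, not running on Linux)
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "Z":
            zombies.append((pid, command))
    return zombies


if __name__ == "__main__":
    for pid, command in find_zombies():
        print(pid, command)
```

Running this from a virtual terminal or ssh session while the greeter is hung would show whether gconf-helper is the only zombie or one of several.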

I have a bridged network configuration. After the upgrade to F-9 this refuses to
restart cleanly: basically, there are no network connections when going from rc3
to rc5. This might very well be the root cause, and another issue which needs to
be resolved. Stay tuned.

Comment 25 Alec Leamas 2008-06-07 15:47:06 UTC
Found a workaround for the network restart issue, see bug 450389. With this
fix, init 3 / init 5 now works as expected.

Comment 26 Bug Zapper 2009-06-10 00:48:35 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 27 Bug Zapper 2009-07-14 17:40:42 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

