Description of problem:
There seems to be a generic problem with X crashing when it is restarted. I noticed this because gdm was crashing, and there are gdm bug reports which seem to be related to this, but it seems to be a more generic problem.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xorg-1.4.99.901-29.20080415.fc9.x86_64
xorg-x11-drv-nv-2.1.8-1.fc9.x86_64
gdm-2.22.0-1.fc9.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Start an X server
2. Stop the X server
3. Start another one.

Actual results:
Screen freezes. Computer is non-responsive. Can't switch to virtual terminal, can't kill X with Ctrl+Alt+Backspace

Expected results:
X starts again.

Additional info:
I first noticed this as gdm crashing on startup. If I started in runlevel 3 and started gdm it worked. If I started in rl3 and did telinit 5 it worked. If however I did telinit 5, telinit 3, telinit 5 then the machine would freeze the second time. Other similar bugs have been reported (BZ#446307, BZ#446706). I also found that if I took out rhgb then gdm wouldn't crash when starting. I also found that if I started in runlevel 3 then I could do startx and it would work, but if I logged out of that session and did startx again it would freeze. The only errors I've seen have been in the gdm logs, the X11 logs don't (to my eyes) show anything unusual, but I'll attach them all. From what I've seen the reports which show similar symptoms to this have all been nvidia hardware (as is my machine). I've tried wiping out my xorg.conf and reverting to the original gdm custom.conf file but with no improvement.
Created attachment 305358 [details] Xorg log file
Created attachment 305359 [details] GDM greeter log
Created attachment 305360 [details] X log from gdm
Finishes with:
error setting MTRR (base = 0xd8000000, size = 0x08000000, type = 1) Invalid argument (22)
..which is the only error I've been able to find. If it's any help then /proc/mtrr says:
reg00: base=0x00000000 ( 0MB), size=65536MB: write-back, count=1
reg01: base=0xd7f00000 (3455MB), size= 1MB: uncachable, count=1
reg02: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
reg03: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
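For anyone comparing their own /proc/mtrr against the failing request, here is a minimal Python sketch that lists which existing registers intersect the range X asked for. The register values are copied from the comment above. The function names (parse_mtrr, overlapping) are made up for illustration. Whether such an overlap is actually what makes the kernel return "Invalid argument" is an assumption; nothing in the attached logs confirms it.

```python
import re

# Line layout as quoted from /proc/mtrr in the comment above.
MTRR_LINE = re.compile(
    r"reg(?P<reg>\d+): base=0x(?P<base>[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(?P<size>\d+)MB: (?P<type>[\w-]+), count=\d+"
)
MB = 1024 * 1024


def parse_mtrr(text):
    """Return a list of (reg, base, size_in_bytes, type) tuples."""
    return [
        (int(m["reg"]), int(m["base"], 16), int(m["size"]) * MB, m["type"])
        for m in MTRR_LINE.finditer(text)
    ]


def overlapping(regions, base, size):
    """Return the registers whose range intersects [base, base + size)."""
    end = base + size
    return [r for r in regions if r[1] < end and base < r[1] + r[2]]


# Register contents as reported in this bug.
PROC_MTRR = """\
reg00: base=0x00000000 (   0MB), size=65536MB: write-back, count=1
reg01: base=0xd7f00000 (3455MB), size=   1MB: uncachable, count=1
reg02: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
reg03: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
"""

# Range taken from the "error setting MTRR" line in the X log.
hits = overlapping(parse_mtrr(PROC_MTRR), 0xD8000000, 0x08000000)
for reg, base, size, mtype in hits:
    print(f"reg{reg:02d}: base={base:#010x} size={size // MB}MB {mtype}")
```

With the values above, the requested 128MB range coincides exactly with reg02 and also falls inside reg00, and neither register has the type X requested (type 1, which is write-combining in the kernel's MTRR numbering).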
Another potentially useful bit of info is that when the machine freezes it seems it's not just X which dies. It's not responsive to pings, and if I have an active ssh session open it hangs. If I'm running top the last active process I see is init (when I do init 5 to cause the hang).
Created attachment 305462 [details] Log file from 2 consecutive runs of startx
This log shows 2 runs of startx, the first of which succeeds and the second of which hangs. It was captured from an external connection, so hopefully it should be complete. I've also found that if I switch back to VC1 after issuing startx I can start and stop it as many times as I like, so long as I don't allow the display to switch to the X session. As soon as I switch to VC7 the machine immediately hangs.
This might be related to the fact that even in FC8 I haven't been able to get the nvidia 3D drivers to work - see http://www.nvnews.net/vbulletin/showthread.php?t=104229. The crashes reported in that thread have identical, unpleasant symptoms. Cross-referencing: Possibly related to bug 446346. I think the severity of this bug should be increased - it's actually a crash requiring a reboot, which IMHO should be at least "high".
Although there may be some relation to the F8 problem, my machine was running F8 just fine before I upgraded, and had been doing so with both the nv and nvidia drivers. I've tried some other drivers for my machine with mixed results. Using the framebuffer caused the same crashes as with nv. Using the nouveau driver I can get X to be stable as long as I only use 800x600 resolution. Above that I get the same freezes as with nv. I tried reducing the resolution with nv but it still froze.
*** Bug 446346 has been marked as a duplicate of this bug. ***
This bug should probably be moved out of the nv driver section, since although it's only been reported against (I think) Nvidia hardware on x86_64 I can reproduce the crash using either the framebuffer or vesa drivers as well.
After updating (right now!) this seems to be fixed for me. I can do an init 3/init 5 cycle without any problems.
I've just done an update and my machine still crashes on the second launch of an X server in a session. I didn't have any xorg packages in the update though. Can you post a list of the packages which were updated in order to fix this on your system, please?
Certainly, if I only had any idea how to get the list. I guess it's possible to make an rpm query like "list all packages including some kind of date" and sort it - but I don't know how to do it. Any hints out there?
tail /var/log/yum.log ..list everything since the last time you know it failed.
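If the log is long, a small filter can pull out just the entries since a given date. The following is a rough Python sketch, assuming each yum.log line looks like "Mon DD HH:MM:SS Action: package" (an assumption about the log layout, and it relies on English month names). The sample lines and timestamps are invented for illustration; only the mesa-libGL package name comes from this thread. The function name updates_since is hypothetical. As an alternative, `rpm -qa --last` lists installed packages ordered by install time.

```python
from datetime import datetime


def updates_since(log_text, since, year):
    """Return (timestamp, action, package) for entries at or after `since`.

    yum.log timestamps carry no year, so the caller supplies one.
    """
    results = []
    for line in log_text.splitlines():
        # Expected fields: month, day, time, "Action:", package
        parts = line.split(None, 4)
        if len(parts) != 5 or not parts[3].endswith(":"):
            continue
        try:
            stamp = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # not a line in the assumed format
        if stamp >= since:
            results.append((stamp, parts[3].rstrip(":"), parts[4]))
    return results


# Hypothetical sample; on a real system read /var/log/yum.log instead.
SAMPLE_LOG = """\
May 18 09:12:44 Updated: example-old-1.0-1.fc9.x86_64
May 21 20:03:10 Updated: mesa-libGL-7.1-0.29.fc9.i386
May 21 20:03:15 Installed: example-new-2.0-1.fc9.x86_64
"""

recent = updates_since(SAMPLE_LOG, datetime(2008, 5, 20), 2008)
for stamp, action, package in recent:
    print(stamp.isoformat(), action, package)
```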
Created attachment 306358 [details] Update log What's updated since last crash. Besides this, I have also messed around with the startup priorities to resolve a hald issue, see bug 443602
Looking down the list I don't see anything significantly different to what I have. You don't seem to have any updates I don't have. Also, the only thing on that list which seems related to X is the mesa-libGL-7.1-0.29.fc9.i386, and I have mesa-libGL installed already. I also tried resetting the priorities as per your other bug even though my HAL was OK - that didn't have any effect. Truly curious...
I thought I'd made some progress with this, but now I'm not sure. I noticed that on the second launch of X it didn't die straight away. The hatched screen with the X cursor appeared and it was only after the window manager launched that it died. I'd also forgotten that this machine has a soundcard (which is never used). When I turned the speakers on I found that the X session died as soon as the startup sound started to play (so that the sound got stuck in a loop). I've tried disabling sound altogether but it still crashes on the second launch. I've also tried just running twm. On the second launch of X this will start and stay responsive, but as soon as you click the mouse to bring up the twm menu the whole system locks up. It seems therefore that there's something about the first program to run under X which kills it, rather than the launching of X itself.
Just to narrow this down to X I tried running from runlevel 1 and the crash still happened, so I don't think it can be any of the accessory services, it must be X itself.
Sorry to say, but I was wrong in thinking the bug was gone. I was fooled by the fact that the first init3/init5 cycle after a reboot succeeds (this might very well have been the situation since the beginning). However, a new init3/init5 sequence still exhibits the bug. :-( As I stated earlier, I have a running kernel and virtual terminals after the bug. If anyone wants to propose something meaningful to do, e.g. with gdb, I'm willing to try. For the moment, I have no idea what to look for, though.
I've just tried the newly released proprietary nvidia drivers (the Livna bundle) on this machine and X works as it should again. I'd still like to be able to get this to work with the free drivers, but for now at least I've got a workaround.
I have tried installing the livna drivers, and the bug is still there. Have tried 1.5's new autoconfiguration, limiting the xorg.conf to a single "driver: nvidia" option, but it's still the same. Using the nvidia package I can also provoke a bug just by starting glxgears - this tosses the machine into a completely unresponsive state roughly a second after the glxgears window is started. Had a glance at dmesg and /var/log/gdm/:0.log, but everything looks fine there. Using nvidia-settings to disable-all state gives the same result. This is just so strange...
Upgraded another box with nvidia graphics to FC9. This box does not exhibit this bug. So it's either something with the sw configuration on my and possibly Simon's computer, or the specific hw in use. My failing card is an nVidia NV43 [GeForce 6600GT] according to hwbrowser.
I too doubt that this is a very widespread issue, otherwise there would have been a lot more shouting about it. My card is an nVidia NV43GL Quadro FX 550 (rev 2a). Other things which might be relevant: x86_64, 4GB memory (may be relevant since I'm getting MTRR errors).
I have 2G of ram and no MTRR errors.
Updated to today's state hoping things would improve. They don't. Digging a little more. There is a pulseaudio related process gconf-helper which is dead (zombie) when the login screen hangs. This process is up & running when the login greeter works. I have a bridged network configuration. After upgrade to F-9 this refuses to restart cleanly, basically there are no network connections when going from rc3 to rc5. This might very well be the root cause, and another issue which needs to be resolved. Stay tuned.
Found a workaround for the network restart issue, see bug 450389. With this fix, init 3/init 5 now works as expected.
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.