Red Hat Bugzilla – Bug 505716
Anaconda graphic install fails and, post install, the X server is unusable, nVidia NV43 [GeForce 6600 GT]
Last modified: 2010-01-04 09:53:03 EST
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.10) Gecko/2009042708 Fedora/3.0.10-1.fc10 Firefox/3.0.10
Graphics card: nVidia Corporation NV43 [GeForce 6600 GT] rev 162
Twin LCD monitors attached via DVI connectors
Anaconda: Attempting an installation from DVD (using either one or both monitors), the graphic install fails to start.
It gracefully falls back to text mode but only performs a bare bones installation after that - there are no further options to select after the next step.
/tmp/X.log reports... Fatal server error: Detected GPU lockup
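For anyone checking their own log for the same failure, a minimal sketch follows. The function name x_log_fatal is hypothetical, and it assumes the reason is logged on the same line as "Fatal server error" or on the line after it.

```shell
# Hypothetical helper (not part of anaconda or Xorg): print any fatal
# server error recorded in an X log, plus the line that follows it.
# Assumes a grep that supports -A (GNU and BSD grep both do).
x_log_fatal() {
    grep -n -A 1 'Fatal server error' "$1"
}
```

During the install this could be run from the installer's shell console as: x_log_fatal /tmp/X.log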
Attachments to follow:
Retrying the install... At the first anaconda prompt screen, I added the command line argument xdriver=vesa (additional arguments, for personal preference only, were also used: resolution=1280x1024 vga=791) and the installation proceeded as expected.
Attachments to follow:
Post install, the X server successfully starts up using the vesa driver. Stopping X (init 3), changing the driver to nouveau in xorg.conf, and restarting X (init 5) results in a continual loop between a blank screen and the LCD monitors shutting down (no signal displayed). I couldn't break out of the loop by switching to consoles, and finally used Ctrl-Alt-Del to kill it.
Attachments to follow, collected after reboot:
Steps to Reproduce:
1.1 Insert DVD
1.2 Proceed to install
1.3 X server fails to start, gracefully falls back to (very) limited text mode installation where installation can proceed.
2.1 Restart installation from DVD
2.2 Add xdriver=vesa to kernel boot line (first installer screen and TAB key)
2.3 Installation proceeds as expected.
3.1 Boot from newly installed OS. System starts up in X using vesa driver
3.2 Kill X, change driver listed in /etc/X11/xorg.conf from vesa to nouveau, then restart X
3.3 Display loops between blank screen and 'no signal' on LCD monitors. System eventually responds to Ctrl-Alt-Del key combination
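Step 3.2 can be sketched roughly as below. This is only an illustration: the function name set_xorg_driver is hypothetical, and it assumes the Device section holds a single line of the form Driver "vesa" with the keyword capitalised exactly that way.

```shell
# Hypothetical helper: swap the Driver entry in an xorg.conf-style file.
# Usage: set_xorg_driver FILE NEW_DRIVER
# Keeps a copy of the original as FILE.bak so the change can be undone.
set_xorg_driver() {
    conf="$1"
    driver="$2"
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the quoted value on lines that start with "Driver".
    sed -e "s/^\([[:space:]]*Driver[[:space:]]*\)\"[^\"]*\"/\1\"$driver\"/" \
        "$conf.bak" > "$conf"
}
```

As root, the sequence would be: init 3, then set_xorg_driver /etc/X11/xorg.conf nouveau, then init 5. Copying xorg.conf.bak back over xorg.conf restores the vesa setup.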
I'm rating this as Severity: High because...
1. Anaconda (the installer) no longer prompts for input after falling back to text mode, other than, possibly, disk selection/formatting. From there it installs only a bare-bones (non-X) system. No other option is given. It's too much work to then complete the expected installation.
2. Nouveau driver running under X is unusable.
Created attachment 347700 [details]
Anaconda.log - Install using nouveau driver (3_anaconda.log)
Created attachment 347701 [details]
dmesg output - Install using nouveau driver (3_dmesg)
Created attachment 347702 [details]
X.log output - Install using nouveau driver (3_tmp_X.log)
Created attachment 347703 [details]
Anaconda.log - Install using vesa driver (4_vesa_anaconda.log)
Created attachment 347705 [details]
dmesg output - Install using vesa driver (4_vesa_dmesg)
Created attachment 347706 [details]
X.log output - Install using vesa driver (4_vesa_X.log)
Created attachment 347707 [details]
messages log from post install (7_messages)
Created attachment 347708 [details]
xorg.conf file used post install (7_xorg.conf)
Created attachment 347709 [details]
Xorg.log from post install (7_Xorg.0.log)
Hmm, how very odd. It appears "something" is nuking a part of VRAM the driver uses to set up the graphics engines. Either that, or the aperture is simply not working for some unknown reason.
I've hacked up a build of xorg-x11-drv-nouveau to dump some registers as it starts up. Can you install the appropriate package from http://koji.fedoraproject.org/koji/taskinfo?taskID=1422336 and post the Xorg.0.log from running it?
There's a slight chance this may actually fix your issue accidentally at the same time.
Installed and ran - with the same result. The log is attached as Xorg.0.log-register-dump
I had installed the closed source NVIDIA drivers from rpmfusion. I removed them from the system using yum and forced a reinstall of the mesa-libGL rpm as per http://fedorasolved.org/video-solutions/remove-nvidia-installer . Hopefully that was enough to get a clean result, if not I'll do it again as per new instructions, or as a virgin install on a spare drive.
The attached log (Xorg.0.log-register-dump) was done with the previous config file "xorg.conf file used post install (7_xorg.conf)".
Created attachment 348394 [details]
Xorg.log output - using nouveau driver after removing closed source drivers (Xorg.0.log-register-dump)
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages. For packages from the updates-testing repository you can use the command
yum upgrade --enablerepo='*-updates-testing'
Alternatively, you can also try to test whether this bug is reproducible with the upcoming Fedora 12 distribution by downloading LiveMedia of F12 Beta available at http://alt.fedoraproject.org/pub/alt/nightly-composes/ . By using that you get all the latest packages without need to install anything on your computer. For more information on using LiveMedia take a look at https://fedoraproject.org/wiki/FedoraLiveCD .
If you still experience this problem on the up-to-date system, please let us know in a comment on this bug, or tell us whether the upgraded system works for you.
If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.
[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]
Glenn, if you update your system per Comment #14 is this still an issue for you?
Fedora Bugzappers volunteer triage team
I haven't been able to get to do an update yet, but I'd prefer to do an install from scratch. This bug really burns on a fresh, clean install so I need to do that to be sure. Sorry for the delay, I'll endeavour to get to it in the next week or so, if you're able to hold off closing till then. Sorry for the tardiness and time wasting. It wasn't meant to happen. :(
Um, it was working - sort of!
It started the install and successfully switched over to the quality graphics and everything appeared to be working okay, however after determining dependencies the graphics dropped out and I was left with a blank screen. I let it continue to see what drive noises I could hear and the machine powered down of its own accord.
I restarted the machine and smelt smoke, but it was too late to power down: two bright flashes were the response. What appear to be the power supply chips have given up the ghost, and the card was way too hot to handle directly. Massive overheating of the card. The fan still turns freely. I don't know what caused it, but I suspect a huge current draw for the whole card to be as hot as it was. So, no more bug reports from that card.
We've had two other successful installs of F12, in this household, using more recent nvidia cards in two different machines and had no problems.
Make of it what you will but I guess you can consider this particular bug closed now. I have no way to replicate any of it.
Looking closely at the old setup, I suspect that in manipulating the hard drive cables I pushed the video card against the neighbouring capture card and jammed the fan. There was minimal clearance and it wouldn't have taken much pressure to close the gap. It's the simplest explanation, and the most probable.
Glenn, first my sympathy for your loss. :) I understand that it will no longer be possible for you to replicate this issue, and as there are no (obvious) dupes, I will go ahead and set this bug closed. Thank you, very much, for all your efforts to assist us with making Fedora as good a distribution as we can.
In general, no bit of computer hardware should ever be able to fail that spectacularly due to nothing but software error. When something like what you describe happens, there is almost certainly a physical cause (as you later seem to have deduced). Offering a card which allows software control of the fan speed but does _not_ throttle the GPU so it doesn't overheat if the fan stops running would be a really terrible idea, and I've never heard of such a card. (Frankly, even a card which will cheerfully melt itself if the fan happens to fail for some physical reason is still a terrible design; CPUs have been protected against this kind of problem for years now, and it's not like fan failure is a particularly uncommon occurrence.) So I concur with Chris in sympathizing with you over your traumatic loss of hardware :), but I'm almost sure this must have been caused by something like what you describe. And also, your graphics card manufacturer deserves a solid kicking :)