Bug 505716

Summary: Anaconda graphic install fails and, post install, the X server is unusable, nVidia NV43 [GeForce 6600 GT]
Product: [Fedora] Fedora
Component: xorg-x11-drv-nouveau
Version: 11
Hardware: x86_64
OS: Linux
Status: CLOSED INSUFFICIENT_DATA
Severity: high
Priority: low
Reporter: Glenn McKechnie <graybeard>
Assignee: Ben Skeggs <bskeggs>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: airlied, ajax, awilliam, bskeggs, campbecg, mcepl
Doc Type: Bug Fix
Last Closed: 2009-12-26 17:53:59 UTC
Attachments (flags: none on all):
Anaconda.log - Install using nouveau driver (3_anaconda.log)
dmesg output - Install using nouveau driver (3_dmesg)
X.log output - Install using nouveau driver (3_tmp_X.log)
Anaconda.log - Install using vesa driver (4_vesa_anaconda.log)
dmesg output - Install using vesa driver (4_vesa_dmesg)
X.log output - Install using vesa driver (4_vesa_X.log)
messages log from post install (7_messages)
xorg.conf file used post install (7_xorg.conf)
Xorg.log from post install (7_Xorg.0.log)
Xorg.log output - using nouveau driver after removing closed source drivers (Xorg.0.log-register-dump)

Description Glenn McKechnie 2009-06-13 04:06:55 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.10) Gecko/2009042708 Fedora/3.0.10-1.fc10 Firefox/3.0.10

Graphics card: nVidia Corporation NV43 [GeForce 6600 GT] rev 162
Twin LCD monitors attached via DVI connectors 

Anaconda: Attempting an installation from DVD (using either one or both monitors), the graphic install fails to start.
It gracefully falls back to text mode but only performs a bare bones installation after that - there are no further options to select after the next step.
/tmp/X.log reports... Fatal server error: Detected GPU lockup
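
A minimal sketch of how the fatal error can be pulled out of an X log, assuming GNU grep (for the -A option) and assuming the log puts the message on the line after "Fatal server error". The function name find_x_fatal is hypothetical and not part of any Fedora tool:

```shell
#!/bin/sh
# Hypothetical helper: print each "Fatal server error" line from an X log
# together with the line that follows it (where the message is assumed to be).
# Usage: find_x_fatal /tmp/X.log
find_x_fatal() {
    log="$1"
    # -A1 (GNU grep) prints one line of trailing context after each match
    grep -A1 'Fatal server error' "$log"
}
```

Against the installer log above this would be run as `find_x_fatal /tmp/X.log`; the function returns non-zero if the log contains no fatal error.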

Attachments to follow:
3_anaconda.log
3_dmesg
3_tmp_X.log

Retrying the install... At the first anaconda prompt screen, using the command line argument xdriver=vesa (additional arguments, for personal preference only, were also used: resolution=1280x1024 vga=791), the installation proceeded as expected.

Attachments to follow:
4_vesa_anaconda.log
4_vesa_dmesg
4_vesa_X.log

Post install, the X server successfully starts up using the vesa driver. Stopping X (init 3), changing the driver to nouveau in xorg.conf, and restarting X (init 5) results in a continual loop between a blank screen and the LCD monitors shutting down ('no signal' displayed). I couldn't break out of the loop by switching to consoles and finally used Ctrl-Alt-Del to kill it.

Attachments to follow, collected after reboot:
7_messages
7_Xorg.0.log
7_xorg.conf

Reproducible: Always

Steps to Reproduce:
1.1. Insert DVD
1.2. Proceed to install
1.3. X server fails to start, gracefully falls back to (very) limited text mode installation where installation can proceed.

2.1 restart installation from DVD
2.2 add xdriver=vesa to kernel boot line (first installer screen and TAB key)
2.3 installation proceeds as expected.

3.1 Boot from newly installed OS. System starts up in X using vesa driver
3.2 Kill X, change driver listed in /etc/X11/xorg.conf from vesa to nouveau, then restart X
3.3 Display loops between blank screen and 'no signal' on LCD monitors. System eventually responds to Ctrl-Alt-Del key combination
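
Step 3.2 can be sketched as a small shell helper. This is only an illustration under assumptions: the function name switch_x_driver is hypothetical, and it assumes the driver is named on a line of the form Driver "vesa" in the config file, as is conventional for xorg.conf. A backup copy is kept so the vesa setup can be restored if X becomes unusable:

```shell
#!/bin/sh
# Hypothetical helper: rewrite the Driver line of an xorg.conf-style file.
# Usage: switch_x_driver FILE OLD_DRIVER NEW_DRIVER
switch_x_driver() {
    conf="$1"; old="$2"; new="$3"
    # Keep a backup next to the original so the change can be undone
    cp -p "$conf" "$conf.bak" || return 1
    # Replace only lines of the form:  Driver "OLD_DRIVER"
    sed "s/^\([[:space:]]*Driver[[:space:]]*\)\"$old\"/\1\"$new\"/" \
        "$conf.bak" > "$conf"
}
```

On the installed system this would be run as root between stopping and restarting X, for example `init 3`, then `switch_x_driver /etc/X11/xorg.conf vesa nouveau`, then `init 5`. Copying /etc/X11/xorg.conf.bak back over the original reverts the change.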
 
Actual Results:  
see above

Expected Results:  
Working display

I'm rating this as Severity: High because...
1. Anaconda (the installer) no longer prompts for input after falling back to text mode, other than disk selection/formatting. From there it installs a barebones (non-X) system only. No other option is given. It's too much work to then complete the expected installation.
2. The nouveau driver running under X is unusable.

Comment 1 Glenn McKechnie 2009-06-13 04:08:59 UTC
Created attachment 347700 [details]
Anaconda.log - Install using nouveau driver (3_anaconda.log)

Comment 2 Glenn McKechnie 2009-06-13 04:09:51 UTC
Created attachment 347701 [details]
dmesg output - Install using nouveau driver (3_dmesg)

Comment 3 Glenn McKechnie 2009-06-13 04:10:47 UTC
Created attachment 347702 [details]
X.log output - Install using nouveau driver (3_tmp_X.log)

Comment 4 Glenn McKechnie 2009-06-13 04:11:43 UTC
Created attachment 347703 [details]
Anaconda.log - Install using vesa driver (4_vesa_anaconda.log)

Comment 5 Glenn McKechnie 2009-06-13 04:12:54 UTC
Created attachment 347705 [details]
dmesg output - Install using vesa driver (4_vesa_dmesg)

Comment 6 Glenn McKechnie 2009-06-13 04:14:08 UTC
Created attachment 347706 [details]
X.log output - Install using vesa driver (4_vesa_X.log)

Comment 7 Glenn McKechnie 2009-06-13 04:15:27 UTC
Created attachment 347707 [details]
messages log from post install (7_messages)

Comment 8 Glenn McKechnie 2009-06-13 04:16:20 UTC
Created attachment 347708 [details]
xorg.conf file used post install (7_xorg.conf)

Comment 9 Glenn McKechnie 2009-06-13 04:17:50 UTC
Created attachment 347709 [details]
Xorg.log from post install (7_Xorg.0.log)

Comment 10 Ben Skeggs 2009-06-13 04:57:53 UTC
Hmm, how very odd. It appears "something" is nuking a part of VRAM the driver uses to set up the graphics engines... either that, or the aperture is simply not working for some unknown reason.

Comment 11 Ben Skeggs 2009-06-18 05:47:00 UTC
I've hacked up a build of xorg-x11-drv-nouveau to dump some registers as it starts up.  Can you install the appropriate package from http://koji.fedoraproject.org/koji/taskinfo?taskID=1422336 and post the Xorg.0.log from running it?

There's a slight chance this may actually fix your issue accidentally at the same time.

Comment 12 Glenn McKechnie 2009-06-18 08:50:25 UTC
Installed and ran - with the same result. The log is attached as Xorg.0.log-register-dump

I had installed the closed source NVIDIA drivers from rpmfusion. I removed them from the system using yum and forced a reinstall of the mesa-libGL rpm as per http://fedorasolved.org/video-solutions/remove-nvidia-installer . Hopefully that was enough to get a clean result, if not I'll do it again as per new instructions, or as a virgin install on a spare drive.

The attached log (Xorg.0.log-register-dump) was done with the previous config file "xorg.conf file used post install (7_xorg.conf)".

Installed rpms:
xorg-x11-drv-nouveau-debuginfo-0.0.12-39.1.20090528git0c17b87.fc11.x86_64
xorg-x11-drv-nouveau-0.0.12-39.1.20090528git0c17b87.fc11.x86_64

Comment 13 Glenn McKechnie 2009-06-18 08:53:31 UTC
Created attachment 348394 [details]
 Xorg.log output - using nouveau driver after removing closed source drivers (Xorg.0.log-register-dump)

Comment 14 Matěj Cepl 2009-11-05 18:31:39 UTC
Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages. For packages from the updates-testing repository you can use the command

yum upgrade --enablerepo='*-updates-testing'

Alternatively, you can also try to test whether this bug is reproducible with the upcoming Fedora 12 distribution by downloading LiveMedia of F12 Beta available at http://alt.fedoraproject.org/pub/alt/nightly-composes/ . By using that you get all the latest packages without need to install anything on your computer. For more information on using LiveMedia take a look at https://fedoraproject.org/wiki/FedoraLiveCD .

Please, if you experience this problem on an up-to-date system, let us know in a comment on this bug, or tell us whether the upgraded system works for you.

If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you.

[This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.]

Comment 15 Chris Campbell 2009-12-06 15:21:01 UTC
Glenn, if you update your system per Comment #14 is this still an issue for you?

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 16 Glenn McKechnie 2009-12-07 11:10:27 UTC
I haven't been able to get to do an update yet, but I'd prefer to do an install from scratch. This bug really burns on a fresh, clean install so I need to do that to be sure. Sorry for the delay, I'll endeavour to get to it in the next week or so, if you're able to hold off closing till then. Sorry for the tardiness and time wasting. It wasn't meant to happen. :(

Comment 17 Glenn McKechnie 2009-12-26 11:08:56 UTC
Um, it was working - sort of!

It started the install and successfully switched over to the quality graphics, and everything appeared to be working okay. However, after determining dependencies the graphics dropped out and I was left with a blank screen. I let it continue to see what drive noises I could hear, and the machine powered down of its own accord.
I restarted the machine and smelt smoke, but it was too late to power down: two bright flashes were the response. What appear to be the power supply chips have given up the ghost, and the card was way too hot to handle directly. Massive overheating of the card. The fan still turns freely; I don't know what caused it, but I suspect a huge current draw for the whole card to be as hot as it was. So, no more bug reports from that card.
 
We've had two other successful installs of F12, in this household, using more recent nvidia cards in two different machines and had no problems.

Make of it what you will but I guess you can consider this particular bug closed now. I have no way to replicate any of it.

Comment 18 Glenn McKechnie 2009-12-26 11:24:19 UTC
Looking closely at the old setup, I suspect that in manipulating the hard drive cables I pushed the video card against the neighbouring capture card and jammed the fan. There was minimal clearance and it wouldn't have taken much pressure to close the gap. It's the simplest explanation, and the most probable.

Comment 19 Chris Campbell 2009-12-26 17:53:59 UTC
Glenn, first my sympathy for your loss. :) I understand that it will no longer be possible for you to replicate this issue, and as there are no (obvious) dupes, I will go ahead and set this bug closed. Thank you very much for all your efforts to assist us with making Fedora as good a distribution as we can.

Comment 20 Adam Williamson 2010-01-04 14:53:03 UTC
In general, no bit of computer hardware should ever be able to fail that spectacularly due to nothing but software error. When something like what you describe happens, there is almost certainly a physical cause (as you later seem to have deduced). Offering a card which allows software control of the fan speed but does _not_ throttle the GPU so it doesn't overheat if the fan stops running would be a really terrible idea, and I've never heard of such a card. (Frankly, even a card which will cheerfully melt itself if the fan happens to fail for some physical reason is still a terrible design; CPUs have been protected against this kind of problem for years now, and it's not like fan failure is a particularly uncommon occurrence.) So I concur with Chris in sympathizing for your traumatic loss of hardware :), but I'm almost sure this must have been caused by something like what you describe. And also, your graphics card manufacturer deserves a solid kicking :)