Bug 531147

Summary: X.org with ATI Radeon locks up during Fedora installation
Product: [Fedora] Fedora
Reporter: Jonathan Larmour <jifl-bugzilla>
Component: xorg-x11-drv-ati
Assignee: Jérôme Glisse <jglisse>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: rawhide
CC: awilliam, fdc, mcepl, xgl-maint
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-10-31 02:12:34 UTC

Attachments:
x.org logfile from installer
lspci -v output

Description Jonathan Larmour 2009-10-27 01:08:10 UTC
Description of problem:

This problem is with a new Dell Studio laptop when installing the Fedora 12 x86_64 Beta. I haven't got as far as running a completed installation, but I see no reason to doubt it would suffer similarly. X crashes pretty reliably during the timezone selection phase of installation unless I pass "nomodeset" on the kernel args for the installer. I originally reported this in a comment to bug #517625 but was informed that that bug was now only for cases where adding pcie_aspm=off to the kernel command line was a successful workaround - in my case it did not help.

Version-Release number of selected component (if applicable):

Fedora 12 Beta

How reproducible:

At the point mentioned below, easily within a few seconds.

Steps to Reproduce:

The lockup occurs just by accepting defaults until you get to the Timezone selection page. All I have to do is move the mouse over and over the graphical map and then it locks up in a few seconds (it doesn't lock up if the pointer is left outside the map).

Actual results:

Once locked up, the mouse pointer can still move, but nothing else works, including ctrl-alt-f1, ctrl-alt-del, alt-sysrq-anything. Sometimes the screen goes completely blank instead (and locks up). Sometimes there are some dots at the very top left of the screen. Unfortunately the lock-up means I can't switch back to the console to look at X.log :-(. Repeating exactly the same steps with "nomodeset" on the kernel cmdline gives no lockup.

Expected results:

No freeze!

Additional info:

Here's the relevant lspci -v which I've had to transcribe, not cut'n'paste:
01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3650 (prog-if 00 [VGA controller])
 Subsystem: Dell Device 02a0
 Flags: bus master, fast devsel, latency 0, IRQ 16
 Memory at d0000000 (32-bit, prefetchable) [size=256M]
 I/O ports at 2000 [size=256]
 Memory at fc000000 (32-bit, non-prefetchable) [size=64K]
 [virtual] Expansion ROM at fc020000 [disabled] [size=128K]
 Capabilities: [50] Power Management version 3
 Capabilities: [58] Express Legacy Endpoint, MSI 00
 Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
 Capabilities: [100] Vendor Specific Information <?>
 Kernel driver in use: radeon

Transcribing /tmp/X.log is too hard :), but I did notice:
(EE) AIGLX error: dlopen of /usr/lib64/dri/r600_dri.so failed
(/usr/lib64/dri/r600_dri.so: cannot open shared object file: No such file or
(EE) AIGLX: Reverting to software rendering

which may be a factor, albeit unlikely. I can copy out anything else specific if you ask.

Unfortunately I cannot properly install FC12 Beta as I'm going to have to swap out this laptop due to an unrelated fault. But hopefully this may allow someone to reproduce the problem, without even having to get as far as installing FC12 beta (no hard disk changes are required for example). I can hopefully provide more data (e.g. X.log) next week if my laptop issue is resolved, but FC12 schedules may not be able to wait for that. If downloading an i386 or Live image would be a real benefit I could do that, but bandwidth is a bit of an issue for me.

I have only set the severity as medium as there is a workaround, but it is going to lead to a poor user experience if such a workaround is needed. Needless to say, crashing during installation is not going to win new followers, and later yum updates won't help with installation.

Incidentally I tried running the installer with the extra kernel arg radeon.modeset=0 as also suggested in bug #517625 (instead of nomodeset), but this resulted in /sbin/loader SEGVing on boot before anaconda has even had a chance to properly get going:

running install...
running /sbin/loader
loader received SIGSEGV! Backtrace:
install exited abnormally [1/1]

Should I submit this as a separate bug? Is this meant to work?

Comment 1 Adam Williamson 2009-10-27 01:42:55 UTC
This may be 528593, or at least one of the possibly multiple problems currently stacked up in that bug.

The message about DRI is not the cause of this problem; you can ignore it.

It would help very much if you could get kernel / X.org logs from after a failure. To do this, boot without 'nomodeset' and wait for it to fail. Take a note of the time when it fails. Then reboot, adding the parameter '3' to the kernel command line. This will boot you to a text console. Log in, and copy /var/log/Xorg.0.log and /var/log/messages to your home directory. Now you can reboot again with 'nomodeset' to get a working desktop, and attach the two log files to this report, along with a note of the time when the crash happened (so we know where to look in /var/log/messages).
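
Once the noted crash time and a copy of /var/log/messages are both to hand, the relevant lines can be narrowed down mechanically. What follows is only a rough Python sketch, not a Fedora or Bugzappers tool. The function name lines_near, the two-minute window, and the assumption that each line starts with a traditional "Oct 27 01:05:12" style syslog timestamp (which carries no year, so one is passed in) are all illustrative choices.

```python
from datetime import datetime, timedelta


def lines_near(log_lines, crash_time, window_minutes=2, year=2009):
    """Return the log lines stamped within window_minutes of crash_time.

    Assumes the classic syslog prefix "Mon DD HH:MM:SS" in the first
    15 characters; the year is not in the log, so it is supplied here.
    """
    window = timedelta(minutes=window_minutes)
    selected = []
    for line in log_lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # no leading timestamp: skip the line
        if abs(stamp - crash_time) <= window:
            selected.append(line)
    return selected
```

Passing an open file object for the copied messages file, together with a datetime built from the time that was written down, would return just the lines around the hang.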


Fedora Bugzappers volunteer triage team

Comment 2 Jonathan Larmour 2009-10-27 04:45:11 UTC
Created attachment 366200 [details]
x.org logfile from installer

By copying the logs in a loop to a usb key I've been able to get them. However the syslog just ends with:
<7>SELinux: initialized (dev sdb, type vfat), uses genfs_contexts
which is from me mounting the usb key, prior to the lock-up.
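
The comment does not say how the copy loop was written. The Python sketch below shows one possible way to do it, assuming a Python interpreter is available in the environment. The function name snapshot_logs, its parameters, and any mount point passed to it are hypothetical. The point of the fsync call is that a copy still sitting in the page cache would be lost when the machine locks up hard.

```python
import os
import shutil
import time


def snapshot_logs(sources, dest_dir, rounds=1, interval=1.0):
    """Repeatedly copy each existing source file into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    for i in range(rounds):
        for src in sources:
            if not os.path.exists(src):
                continue  # the log may not have been created yet
            dst = os.path.join(dest_dir, os.path.basename(src))
            with open(src, "rb") as fin, open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
                fout.flush()
                # Force the data out to the device before the next round
                os.fsync(fout.fileno())
        if i + 1 < rounds:
            time.sleep(interval)
```

A call such as snapshot_logs(["/tmp/X.log"], usb_mount_point, rounds=600) would keep overwriting the copy with the latest contents, so whatever was written just before the freeze is what remains on the key.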

The X logfile is attached, but to me it doesn't appear to contain any evidence - maybe X froze so completely that nothing was logged. It may have to wait until next week when I can put a proper install on the laptop and get remote network access to get the logs as well as run GDB. Please leave the bug open until then.

There's a good chance this is the same as bug #528593 (which I hadn't found in my search before, thanks), and if so, having it crash installation on half of new Dell laptops would make it quite a nasty problem IMHO justifying being an F12 blocker, but we'll see.

Comment 3 Adam Williamson 2009-10-28 01:33:11 UTC
Yeah, this could wind up as a blocker, but it's still too vague for me to be comfortable: I don't know if everyone on 528593 (and you) actually has the same problem.

Jerome, Matej, Francois - can you guys look at this and 528593 as a priority? Can you think of any further triage we can do to clarify which reporters have the same problem as each other and which have a different problem? Can anyone reproduce this on any of their hardware? Thanks.

Comment 4 Jérôme Glisse 2009-10-28 13:18:46 UTC
pcie_aspm=off might just hide the problem, so all these R600/R700 lockups might share the same root cause. Jonathan, please attach a full lspci -v; my investigation seems to show that only Intel motherboards (ICH8, ICH9 family) are affected by this. I am running more tests and will report soon if I think we can merge all the R600/R700 lockup bugs into one.
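
For triaging other reports against this hypothesis, a saved lspci output can be screened for the suspected combination. The Python sketch below is an assumption-laden illustration, not part of the actual investigation. The function name has_ich_and_radeon is made up, and the check relies on lspci printing "ICH8" or "ICH9" in the chipset device names and "Radeon" on the VGA controller line. It does not try to tell R600/R700 parts from other Radeon families.

```python
import re


def has_ich_and_radeon(lspci_text):
    """Return True if the text names an ICH8/ICH9 device and a Radeon VGA controller."""
    # Assumed naming: chipset devices mention "ICH8..." or "ICH9..."
    has_ich = re.search(r"\bICH[89]", lspci_text) is not None
    # Assumed naming: the GPU appears on a "VGA compatible controller" line
    has_radeon = re.search(r"VGA compatible controller:.*Radeon", lspci_text) is not None
    return has_ich and has_radeon
```

A True result only means the report is worth comparing with the other lockup bugs; it does not confirm the same fault.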

Comment 5 Jonathan Larmour 2009-10-29 00:15:52 UTC
Created attachment 366518 [details]
lspci -v output

Hi Jerome,

Indeed you're right - this has an ICH9 board. I've attached the full lspci -v anyway just in case it's still useful. Let me know if I can provide anything else (while for the moment still only using the installer image, not a full installation - hopefully rectified tomorrow or soon after).

Comment 6 Jérôme Glisse 2009-10-29 13:57:19 UTC
Can you try whether the following ISO works:

Comment 7 Jonathan Larmour 2009-10-31 02:12:34 UTC
Thanks Jerome. I've tried the ISO from your comment #6 and have found no problems. In particular I tried the time zone selector, which behaved properly. Strictly it isn't the same context (anaconda) but even so...

I think since there's a reasonable possibility this may have been a dup of bug #528593 I'll clear this one from your lists and mark it closed - I can presumably reopen it later if it proves to still be present when there is an updated installation ISO.

Comment 8 Adam Williamson 2009-10-31 05:29:32 UTC
thanks, jon. if you can run it for a while and make sure it doesn't hang that'd be helpful. we think this is potentially part of a wider issue with r600+ICH combination which doesn't seem to be fixed that. thanks.

Fedora Bugzappers volunteer triage team