Bug 125974

Summary: Kernel 2.6.6-1.427 & 1.435 causes a system lockup under X Windows.
Product: [Fedora] Fedora
Component: kernel
Version: 2
Hardware: i686
OS: Linux
Status: CLOSED NEXTRELEASE
Severity: medium
Priority: medium
Reporter: David A. Cafaro <dac>
Assignee: Dave Jones <davej>
CC: mharris, pfrields
Doc Type: Bug Fix
Last Closed: 2005-04-16 05:53:57 UTC
Attachments: Log of failed startup (flags: none)

Description David A. Cafaro 2004-06-14 18:33:06 UTC
Description of problem:
I recently updated my laptop from the 2.6.5-1.358 kernel to the
2.6.6-1.427 kernel on Fedora Core 2.  When rebooting into run level 5
with the new kernel, all seemed fine until the system got to the
point where it says it is setting the hostname.  At this point the
system froze solid and the HD light stayed on continuously.  There was
no response from the keyboard or mouse, and a hard reset was necessary.

Upon the second reboot attempt, I started the system in run level 3. 
The new kernel seemed to boot fine and I was presented the login
prompt.  I logged in as root and tested several things to make sure
the system was stable.  After determining everything was fine, I tried
running startx.  X Windows never came up.  Eventually I was left with a
blank screen and a continuously lit hard drive light.  Checking the
logs, I found X Windows had died at this point:

(II) RADEON(0): [agp] Ring read ptr mapped at 0xf5d6d000
(II) RADEON(0): [agp] vertex/indirect buffers handle = 0xec102000
(II) RADEON(0): [agp] Vertex/indirect buffers mapped at 0xf5b6d000
(II) RADEON(0): [agp] GART texture map handle = 0xec302000
(II) RADEON(0): [agp] GART Texture map mapped at 0xf568d000
(II) RADEON(0): [drm] register handle = 0xe8100000
(II) RADEON(0): [dri] Visual configs initialized

There were no logs from the boot into run level 5.  All that I saw was
that the GUI said it was setting the hostname, and nothing more.  I'll
try to find out more details of exactly when it locked up during boot.

Previous to this I had no problem booting with the older kernel into
run level 5.  X Windows ran fine with hardware acceleration enabled.



Version-Release number of selected component (if applicable):
kernel-2.6.6-1.427


How reproducible:
Always


Steps to Reproduce:
1.  Boot with new kernel in run level 5


Actual Results:  The system locks up with the hard drive light
continuously on.


Expected Results:  The system boots to a GUI login screen


Additional info:

Sharp MM20 laptop, Efficeon 1 GHz processor, ATI Radeon graphics,
Synaptics touchpad.  1024x768x24 hardware-accelerated X Windows.

Comment 1 David A. Cafaro 2004-06-15 13:06:30 UTC
The new kernel 2.6.6-1.435 has the same problem.  The system still locks
up at the "setting hostname" point in the graphical boot process.  I'm
still trying to get more info, but it doesn't seem to be booting far
enough to produce any log messages besides the X Windows log message
posted above.

Comment 2 David A. Cafaro 2004-06-23 23:33:21 UTC
OK, I narrowed it down to this:

setting

Option "AGPMode" "4"
or 
Option "AGPMode" "2"

in the xorg.conf file causes the lockup.  With this line commented out,
the system boots and AGP DRI works, but only at AGP 1x.  The
Transmeta bus should support 4x according to the architecture documents,
and under the 2.6.5 kernel 4x mode worked fine.

Looking at the release notes for the kernels, there was an update to
the frontside bus support for the Efficeon processor between the 2.6.5
kernels and the 2.6.6 and later kernels.  I'm going to look into this
more.  It also looks like I need to report the bug to the kernel.org bug
list, as a self-compiled 2.6.7 kernel has the same issue.
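
The check described in this comment (is an AGPMode option active or
commented out?) can be scripted.  What follows is a minimal sketch in
Python, not part of the original report.  The function name
active_agp_modes and the Device section in SAMPLE are hypothetical and
are not taken from the reporter's configuration.

```python
import re

# Matches an active (uncommented) line such as:  Option "AGPMode" "4"
AGP_OPTION = re.compile(r'^\s*Option\s+"AGPMode"\s+"(\d+)"', re.IGNORECASE)


def active_agp_modes(config_text):
    """Return the AGPMode values set on uncommented lines of an X config."""
    modes = []
    for line in config_text.splitlines():
        # Lines starting with '#' are comments and have no effect.
        if line.lstrip().startswith("#"):
            continue
        match = AGP_OPTION.match(line)
        if match:
            modes.append(int(match.group(1)))
    return modes


# Hypothetical Device section, for illustration only.
SAMPLE = '''
Section "Device"
    Identifier "Videocard0"
    Driver     "radeon"
#   Option "AGPMode" "4"
    Option "AGPMode" "2"
EndSection
'''

print(active_agp_modes(SAMPLE))  # prints [2]
```

An empty result means no AGPMode option is active, which on this
laptop corresponds to the working 1x default.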

Comment 3 Mike A. Harris 2004-06-24 16:26:35 UTC
If the "AGPMode" setting is used in the X config file, it must match
the AGP mode set in the computer's CMOS settings.  If there is a
discrepancy between the AGP mode the BIOS has set, and the one the
X server is configured to use, the result is undefined behaviour,
and a system lockup should be expected.

In general, users should set the AGP mode in the BIOS, and never use
the X server "AGPMode" setting, as it does not do the same thing.

Hope this helps.

Comment 4 David A. Cafaro 2004-06-25 01:50:36 UTC
On this system there are no settings in the BIOS to adjust the
AGP mode.  It's a laptop and the graphics chip is integrated into
the system.  I would normally assume that the AGP mode would default to
4x, but that doesn't seem to be the case.  In the older 2.6.5
kernels the AGPMode setting could be used to set the system to 4x
mode.  Unfortunately this seems to be broken in the 2.6.6 and later
kernels.  The default is 1x, and it can't be changed... yet.

I'm trying to find a good benchmark to actually test whether the system
is running in AGP 1x or 4x.  I also still need to look again at the
changes to the northbridge driver between the 2.6.5 and 2.6.6 kernels.

I wish it was something as simple as a BIOS setting, but it looks like
this may have been designed as a software switch (why doesn't that
surprise me).  I'll know better after some performance tests.  Thanks
for the info!
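
As an alternative to a benchmark, the AGP rate currently programmed
can be read from the AGP capability that `lspci -vv` prints for the
graphics device (usually run as root to see capabilities).  What
follows is a minimal sketch in Python under that assumption, not part
of the original report.  The function name negotiated_agp_rate is
hypothetical, and the SAMPLE text is illustrative only; it was not
captured from the Sharp MM20.

```python
import re


def negotiated_agp_rate(lspci_text):
    """Return the Rate= value from the first AGP 'Command:' line, or None.

    In the AGP capability, the 'Status:' line lists the rates the device
    advertises, while the 'Command:' line shows the rate actually set.
    """
    in_agp_cap = False
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Capabilities:"):
            # Track whether we are inside the AGP capability block.
            in_agp_cap = "AGP version" in stripped
            continue
        if in_agp_cap and stripped.startswith("Command:"):
            match = re.search(r"Rate=(\S+)", stripped)
            return match.group(1) if match else None
    return None


# Illustrative excerpt in the style of `lspci -vv` output (assumed values).
SAMPLE = """\
\tCapabilities: [58] AGP version 2.0
\t\tStatus: RQ=48 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
\t\tCommand: RQ=1 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x1
"""

print(negotiated_agp_rate(SAMPLE))  # prints x1
```

To try it on a real system, the text could come from
subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout.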



Comment 5 Dave Jones 2004-07-04 13:39:12 UTC
There's a lot of magic at work here. There are a whole load of chipset
/ graphic board combinations that just do not work at all when put
into higher modes (despite the hardware advertising that they are
capable of such).  The binary drivers for NVIDIA/ATI cards for example
have dozens of workarounds for known issues with timings that must be
tweaked etc when certain combinations are detected. These workarounds
are of course undocumented, so the free drivers don't have them.

This means using a mode greater than 1x is pretty hit and miss.
It's the same situation with AGP Fast Writes.
It works fine for some folks, not for others.


Comment 6 Dave Jones 2004-11-27 20:21:22 UTC
Mass update for old bugs:

Is this still a problem with the 2.6.9-based update kernel?


Comment 7 David A. Cafaro 2004-11-27 21:10:11 UTC
I don't know yet; I will have to check.  I'm not running the 2.6.9-based
update kernel due to bugs in ACPI, which at this point is more important to me
than AGP 4x.  I'll still try to give 2.6.9 a test just to see if the AGP
issue is fixed.  If so, I'll post it here.

Comment 8 David A. Cafaro 2004-11-27 21:35:50 UTC
Created attachment 107505 [details]
Log of failed startup

AGP mode 2/4 still causes the system to lock up using the 2.6.9-1.6_FC2 kernel.
I've included the logfile from the failed startup.

Comment 9 Dave Jones 2005-04-16 05:53:57 UTC
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat.  The Fedora Legacy project will be producing further kernel
updates for security problems only.

If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.

Thank you.