Bug 457971 - kernels "rc1" for 2.6.27 crash in drm_ati_pcigart_init (NULL pointer)
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Airlie
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-08-05 20:29 UTC by Michal Jaegermann
Modified: 2008-08-08 21:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-08-08 21:44:31 UTC
Type: ---
Embargoed:


Attachments
full dmesg for 2.6.27-0.215.rc1.git4.fc10.x86_64 showing the crash in question (41.49 KB, text/plain)
2008-08-05 20:29 UTC, Michal Jaegermann

Description Michal Jaegermann 2008-08-05 20:29:03 UTC
Created attachment 313488
full dmesg for 2.6.27-0.215.rc1.git4.fc10.x86_64 showing the crash in question

Description of problem:

Attempts to start the X server with 2.6.27-rc1 kernels invariably end
with the kernel dereferencing a NULL pointer, followed
by a blank screen.  The last rawhide kernel with a working X is
2.6.27-0.186.rc0.git15.

Later kernels boot without any issues to runlevel 3 (console).
Only an attempt to start X kills the video, although the machine is
still accessible over the network.

Version-Release number of selected component (if applicable):
kernel-2.6.27-0.215.rc1.git4.fc10.x86_64 and earlier "rc1"

How reproducible:
always

Additional information:
With 2.6.27-0.211.rc1.git3.fc10.x86_64 there was also:

[drm] Detected VRAM RAM=65536K, accessible=65536K, BAR=65536K
[drm] Can't use agp base @0xe0000000lx, won't fit

2.6.27-0.215.rc1.git4.fc10.x86_64 gives only

[drm] Initialized drm 1.1.0 20060810
pci 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[drm] Initialized radeon 1.29.0 20080528 on minor 0
[drm] Setting GART location based on new memory map

so this appears to be ok, but later in the dmesg output one can see

BUG kmalloc-32 (Tainted: G      D  ): Padding overwritten. 0xffff880015ac1fd8-0xffff880015ac1fff

and similar messages, although these may be the result of a corrupted stack.

Comment 1 Michal Jaegermann 2008-08-06 01:32:24 UTC
With older "rc1" kernels, an attempt to start the X server killed both
X and the video.  Doing the same after booting
kernel-2.6.27-0.226.rc1.git5.fc10.i686 either locks up the machine
completely, without any traces in the logs, or reboots it.

With a remote login active before X is tried, the following shows up
on the remote screen before my machine dies:

Message from syslogd@dyna0 at Aug  5 19:22:56 ...
 kernel:Oops: 0002 [1] SMP DEBUG_PAGEALLOC

Message from syslogd@dyna0 at Aug  5 19:22:56 ...
 kernel:Code: e8 48 fd ff ff 31 db eb 3f 48 c1 e8 20 83 ca 0c 25 ff 00 00 00 c1 e0 04 09 c2 eb 0f 48 c1 e8 20 c1 ea 08 c1 e0 18 09 c2 83 ca 0c <41> 89 54 9d 00 48 ff c3 48 3b 5d d0 0f 82 0d ff ff ff 0f 09 66

Message from syslogd@dyna0 at Aug  5 19:22:56 ...
 kernel:CR2: 0000000000000000

Message from syslogd@dyna0 at Aug  5 19:22:56 ...
 kernel:general protection fault: 0000 [2] SMP DEBUG_PAGEALLOC

Message from syslogd@dyna0 at Aug  5 19:22:56 ...
 kernel:Code: 02 00 00 49 8d 94 24 c0 02 00 00 48 8b 03 eb 1e 4c 39 6b 10 75 12 48 89 df e8 fe ff e5 e0 48 89 df e8 8b c1 db e0 eb 0b 48 89 c3 <48> 8b 00 48 39 d3 75 dd 4c 89 f7 e8 2d e7 ff e0 5b 41 5c 41 5d

Message from syslogd@dyna0 at Aug  5 19:23:03 ...
 kernel:Oops: 0000 [3] SMP DEBUG_PAGEALLOC

Message from syslogd@dyna0 at Aug  5 19:23:03 ...
 kernel:Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 48 85 c0 74 0c 48 39 da 74 07 90 ff 88 98 00 00 00 <49> 8b 3e e8 4e a6 de e0 48 8b 43 40 31 d2 48 c7 40 68 00 00 00

Message from syslogd@dyna0 at Aug  5 19:23:03 ...
 kernel:CR2: 0000000000000000
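
For readers trying to interpret the messages above, here is a minimal sketch of how the "Oops:" and "Code:" lines could be decoded. It assumes that each "Oops: XXXX" line reports an x86 page-fault error code (bit 0 = protection violation, bit 1 = write, bit 2 = user mode) and that the byte in angle brackets on a "Code:" line marks the start of the faulting instruction. The function names decode_oops and faulting_byte are hypothetical helpers written for this illustration, not part of any kernel tool. The "general protection fault" line is deliberately not decoded, since its code is not a page-fault error code.

```python
import re

# Low bits of the x86 page-fault error code:
# (mask, meaning when the bit is set, meaning when it is clear)
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
]


def decode_oops(line):
    """Decode the hex error code from an 'Oops: XXXX' line (hypothetical helper).

    Returns a list of three descriptions, or None if the line has no
    'Oops:' field (for example a general protection fault line).
    """
    match = re.search(r"Oops:\s*([0-9a-fA-F]{4})", line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return [set_text if code & mask else clear_text
            for mask, set_text, clear_text in PF_BITS]


def faulting_byte(code_line):
    """Return (offset, value) of the <xx> byte on a 'Code:' line, or None."""
    if "Code:" not in code_line:
        return None
    tokens = code_line.split("Code:", 1)[1].split()
    for offset, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            return offset, int(token[1:-1], 16)
    return None


if __name__ == "__main__":
    print(decode_oops("kernel:Oops: 0002 [1] SMP DEBUG_PAGEALLOC"))
    print(faulting_byte("kernel:Code: 83 ca 0c <41> 89 54 9d 00"))
```

Under these assumptions, "Oops: 0002" together with "CR2: 0000000000000000" reads as a kernel-mode write to a not-present page at address 0, which is consistent with the NULL pointer dereference named in the bug title.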

Comment 2 Michal Jaegermann 2008-08-07 21:27:10 UTC
Kernel 2.6.27-0.237.rc2.fc10.x86_64 allows me to start X again.
This takes a long time, with a lot of screen blinking, regardless
of whether it is 'startx' from a console or a login from gdm, but
eventually it gets there.

I have not tried enough times yet to be sure that this is reliable,
but so far so good.  Is this a difference between "rc1" and "rc2"?

