Bug 457598 - kernel BUG at drivers/gpu/drm/drm_fops.c:144! invalid opcode: 0000
kernel BUG at drivers/gpu/drm/drm_fops.c:144! invalid opcode: 0000
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
rawhide
All Linux
low Severity low
: ---
: ---
Assigned To: Dave Airlie
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2008-08-01 14:33 EDT by Tom London
Modified: 2008-10-28 16:00 EDT (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-10-28 16:00:27 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
oopsing 2.6.27-0.211.rc1.git3.fc10.x86_64 after an attempt to start X server (21.50 KB, text/plain)
2008-08-02 21:06 EDT, Michal Jaegermann
no flags Details

  None (edit)
Description Tom London 2008-08-01 14:33:42 EDT
Description of problem:

kernel BUG at drivers/gpu/drm/drm_fops.c:144!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: i915 drm ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core
ib_addr iscsi_tcp libiscsi scsi_transport_iscsi coretemp nf_conntrack_ipv4
ipt_REJECT iptable_filter ip_tables nf_conntrack_netbios_ns nf_conntrack_ipv6
xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter
ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq fuse loop dm_multipath
kvm_intel kvm sr_mod cdrom ata_piix ppdev ata_generic arc4 ecb thinkpad_acpi
iTCO_wdt hwmon i2c_i801 firewire_ohci snd_hda_intel iTCO_vendor_support
firewire_core i2c_core sdhci_pci crypto_blkcipher snd_seq_dummy pata_acpi sdhci
crc_itu_t mmc_core snd_seq_oss snd_seq_midi_event yenta_socket snd_seq
rsrc_nonstatic snd_seq_device snd_pcm_oss snd_mixer_oss iwl3945 ac battery
parport_pc snd_pcm rfkill video bay parport snd_timer output hci_usb
snd_page_alloc mac80211 snd_hwdep bluetooth cfg80211 e1000e snd soundcore usblp
dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci ext3 jbd mbcache uhci_hcd
ohci_hcd ehci
_hcd
Pid: 2903, comm: Xorg Not tainted (2.6.27-0.208.rc1.git2.fc10.i686 #1)
Pid: 2903, comm: Xorg Not tainted (2.6.27-0.208.rc1.git2.fc10.i686 #1) EIP:
0060:[<f96204de>] EFLAGS: 00213206 CPU: 1
EIP is at drm_open+0x440/0x48a [drm]
EAX: f4923bd8 EBX: 00000000 ECX: f3b7007c EDX: f3a00000
ESI: f3a00028 EDI: f3a00000 EBP: f3bfce70 ESP: f3bfce4c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process Xorg (pid: 2903, ti=f3bfc000 task=f43b8000 task.ti=f3bfc000)
Stack: f3a00000 ebd30800 f3b7007c 00000000 f3a00028 ebd2e014 ffffffed f962f6d0 
      ebd30800 f3bfce88 f9620043 f3b7007c 00000000 f3bb9af0 f3bb9af0 f3bfceac 
      c049e30a ebd30800 f3b7007c 00000000 00000000 ebd30800 00000000 f3b7007c 
Call Trace:
[<f9620043>] ? drm_stub_open+0xa9/0x104 [drm]
[<c049e30a>] ? chrdev_open+0x12e/0x145
[<c049a548>] ? __dentry_open+0x15e/0x256
[<c049a6d1>] ? nameidata_to_filp+0x24/0x38
[<c049e1dc>] ? chrdev_open+0x0/0x145
[<c04a4996>] ? do_filp_open+0x32b/0x62b
[<c0449e4d>] ? trace_hardirqs_off_caller+0x15/0x91
[<c0408633>] ? sched_clock+0x8/0xb
[<c049a2a6>] ? get_unused_fd_flags+0xb3/0xbd
[<c051bb45>] ? _raw_spin_unlock+0x74/0x78
[<c049a2a6>] ? get_unused_fd_flags+0xb3/0xbd
[<c049a2f5>] ? do_sys_open+0x45/0xbb
[<c049a3b7>] ? sys_open+0x23/0x2b
[<c0403cbe>] ? syscall_call+0x7/0xb
======================
======================Code: ea 23 06 c7 8b 75 dc 31 d2 83 c6 28 89 f0 e8 a9 13
06 c7 8b 55 dc 8b 82 30 01 00 00 85 c0 74 0f 8b 4d e4 3b 81 28 01 00 00 74 16
<0f> 0b eb fe 8b 7d e4 8b 55 dc 8b 87 28 01 00 00 89 82 30 01 00 
EIP: [<f96204de>] drm_open+0x440/0x48a [drm] SS:ESP 0068:f3bfce4c
---[ end trace 366010d8e7d8bd66 ]---

Version-Release number of selected component (if applicable):
kernel-2.6.27-0.208.rc1.git2.fc10.i686

How reproducible:
Not sure.... Seems like every graphical boot

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 1 Tom London 2008-08-01 15:58:01 EDT
Not sure if this is related:  Got this on a subsequent reboot:

Aug  1 12:35:10 localhost gdm-simple-slave[2916]: DEBUG(+): GdmServer: Opening
logfile for server /var/log/gdm/:0.log
Aug  1 12:35:10 localhost gdm-simple-slave[2915]: DEBUG(+): GdmServer: Started X
server process 2916 - waiting for READY
Aug  1 12:35:10 localhost gdm-simple-slave[2915]: DEBUG(+): GdmSimpleSlave:
Started X server
Aug  1 12:35:11 localhost acpid: client connected from 2916[0:0]
Aug  1 12:35:13 localhost kernel: [drm] Initialized drm 1.1.0 20060810
Aug  1 12:35:13 localhost kernel: pci 0000:00:02.0: power state changed by ACPI
to D0
Aug  1 12:35:13 localhost kernel: pci 0000:00:02.0: PCI INT A -> GSI 16 (level,
low) -> IRQ 16
Aug  1 12:35:13 localhost kernel: [drm] Initialized i915 1.6.0 20080730 on minor 0
Aug  1 12:35:14 localhost kernel:
Aug  1 12:35:14 localhost kernel: =============================================
Aug  1 12:35:14 localhost kernel: [ INFO: possible recursive locking detected ]
Aug  1 12:35:14 localhost kernel: 2.6.27-0.208.rc1.git2.fc10.i686 #1
Aug  1 12:35:14 localhost kernel: ---------------------------------------------
Aug  1 12:35:14 localhost kernel: Xorg/2916 is trying to acquire lock:
Aug  1 12:35:14 localhost kernel: (&dev->struct_mutex){--..}, at: [<f9621efd>]
drm_release+0x343/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel:
Aug  1 12:35:14 localhost kernel: but task is already holding lock:
Aug  1 12:35:14 localhost kernel: (&dev->struct_mutex){--..}, at: [<f9621e6c>]
drm_release+0x2b2/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel:
Aug  1 12:35:14 localhost kernel: other info that might help us debug this:
Aug  1 12:35:14 localhost kernel: 1 lock held by Xorg/2916:
Aug  1 12:35:14 localhost kernel: #0:  (&dev->struct_mutex){--..}, at:
[<f9621e6c>] drm_release+0x2b2/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel:
Aug  1 12:35:14 localhost kernel: stack backtrace:
Aug  1 12:35:14 localhost kernel: Pid: 2916, comm: Xorg Not tainted
2.6.27-0.208.rc1.git2.fc10.i686 #1
Aug  1 12:35:14 localhost kernel: [<c06805d9>] ? printk+0x14/0x1b
Aug  1 12:35:14 localhost kernel: [<c044b7e8>] __lock_acquire+0x6be/0x97d
Aug  1 12:35:14 localhost kernel: [<c0449e4d>] ? trace_hardirqs_off_caller+0x15/0x91
Aug  1 12:35:14 localhost kernel: [<c04087b6>] ? native_sched_clock+0xb3/0xd3
Aug  1 12:35:14 localhost kernel: [<c044bb11>] lock_acquire+0x6a/0x90
Aug  1 12:35:14 localhost kernel: [<f9621efd>] ? drm_release+0x343/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel: [<c06815ca>] __mutex_lock_common+0xbc/0x2ed
Aug  1 12:35:14 localhost kernel: [<f9621efd>] ? drm_release+0x343/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel: [<c06818a2>] mutex_lock_nested+0x33/0x3b
Aug  1 12:35:14 localhost kernel: [<f9621efd>] ? drm_release+0x343/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel: [<f9621efd>] drm_release+0x343/0x3e0 [drm]
Aug  1 12:35:14 localhost kernel: [<c049cb91>] __fput+0xb8/0x148
Aug  1 12:35:14 localhost kernel: [<c049cc3d>] fput+0x1c/0x1e
Aug  1 12:35:14 localhost kernel: [<c049a137>] filp_close+0x55/0x5f
Aug  1 12:35:14 localhost kernel: [<c049a1b9>] sys_close+0x78/0xb2
Aug  1 12:35:14 localhost kernel: [<c0403cbe>] syscall_call+0x7/0xb
Aug  1 12:35:14 localhost kernel: =======================
Aug  1 12:35:36 localhost init: tty4 main process (2844) killed by TERM signal
Aug  1 12:35:36 localhost init: tty5 main process (2845) killed by TERM signal
Aug  1 12:35:36 localhost init: tty2 main process (2846) killed by TERM signal
Aug  1 12:35:36 localhost init: tty3 main process (2847) killed by TERM signal
Aug  1 12:35:36 localhost init: tty1 main process (2848) killed by TERM signal
Aug  1 12:35:36 localhost init: tty6 main process (2849) killed by TERM signal
[root@localhost ~]# 

The behavior is "only black screen" on boot to runlevel 5.
Comment 2 Tom London 2008-08-01 19:51:55 EDT
Interestingly, I booted up in runlevel 3 and then did a "startx". Graphical
stuff (gnome) came up without issue.....

No BUG and no recursive lock problem.
Comment 3 Michal Jaegermann 2008-08-02 21:06:44 EDT
Created attachment 313268 [details]
oopsing 2.6.27-0.211.rc1.git3.fc10.x86_64 after an attempt to start X server

With 2.6.27-0.211.rc1.git3.fc10.x86_64 an attempt to start X server
stops for me, regardless if 'startx' was used or if I went to level 5,
after
.....
finished PLL1
restore dac
(==) RADEON(0): Backing store disabled
(II) RADEON(0): [DRI] installation complete

was printed in logs.  With 2.6.27-0.208.rc1.git2.fc10.x86_64 this dies
earlier. Xorg.log consistently ends up with:
....
drmOpenDevice: open result is 11, (OK)
drmOpenByBusid: drmOpenMinor returns 11
drmOpenByBusid: drmGetBusid reports pci:0000:01:00.0

If 2.6.27-0.186.rc0.git15.fc10.x86_64 kernel is used then
starting X is without issues.  This is how drm shows up in dmesg
then:
....
[drm] Initialized drm 1.1.0 20060810
pci 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[drm] Initialized radeon 1.29.0 20080528 on minor 0
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] Num pipes: 2
[drm] writeback test succeeded in 1 usecs

as opposed to "[drm] Can't use agp base @0xe0000000lx, won't fit"
followed by a series of bombs.
Comment 4 Tom London 2008-08-04 09:57:32 EDT
Continue to get this with 0.215 on gdm startup.  Result is "black screen of death" (screen black and unresponsive.  Only apparent recourse is to power cycle).

Kerneloops'ed here: http://www.kerneloops.org/submitresult.php?number=47821

Aug  4 06:46:51 localhost kernel: [drm] Initialized drm 1.1.0 20060810
Aug  4 06:46:52 localhost kernel: pci 0000:00:02.0: power state changed by ACPI to D0
Aug  4 06:46:52 localhost kernel: pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Aug  4 06:46:52 localhost kernel: [drm] Initialized i915 1.6.0 20080730 on minor 0
Aug  4 06:46:53 localhost kernel: ------------[ cut here ]------------
Aug  4 06:46:53 localhost kernel: kernel BUG at drivers/gpu/drm/drm_fops.c:144!
Aug  4 06:46:53 localhost kernel: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Aug  4 06:46:53 localhost kernel: Modules linked in: i915 drm ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi coretemp nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables nf_conntrack_netbios_ns nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq fuse loop dm_multipath kvm_intel kvm sr_mod cdrom ppdev thinkpad_acpi hwmon firewire_ohci yenta_socket snd_hda_intel firewire_core rsrc_nonstatic ata_piix sdhci_pci i2c_i801 sdhci i2c_core snd_seq_dummy ata_generic iTCO_wdt iTCO_vendor_support mmc_core crc_itu_t arc4 snd_seq_oss snd_seq_midi_event pata_acpi snd_seq ecb crypto_blkcipher battery snd_seq_device ac video bay snd_pcm_oss parport_pc output snd_mixer_oss parport snd_pcm snd_timer snd_page_alloc iwl3945 rfkill snd_hwdep snd mac80211 soundcore e1000e cfg80211 dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Aug  4 06:46:53 localhost kernel:
Aug  4 06:46:53 localhost kernel: Pid: 2822, comm: Xorg Not tainted (2.6.27-0.215.rc1.git4.fc10.i686 #1)
Aug  4 06:46:53 localhost kernel: EIP: 0060:[<f96074d5>] EFLAGS: 00213287 CPU: 1
Aug  4 06:46:53 localhost kernel: EIP is at drm_open+0x440/0x48b [drm]
Aug  4 06:46:53 localhost kernel: EAX: f39b0de8 EBX: 00000000 ECX: f39b0fcc EDX: f7c5b090
Aug  4 06:46:53 localhost kernel: ESI: f7c5b0b8 EDI: f7c5b090 EBP: f38f0e70 ESP: f38f0e4c
Aug  4 06:46:53 localhost kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug  4 06:46:53 localhost kernel: Process Xorg (pid: 2822, ti=f38f0000 task=f394df80 task.ti=f38f0000)
Aug  4 06:46:53 localhost kernel: Stack: f7c5b090 ec2acb00 f39b0fcc 00000000 f7c5b0b8 f391f4c4 ffffffed f96166f4 
Aug  4 06:46:53 localhost kernel:       ec2acb00 f38f0e88 f960703a f39b0fcc 00000000 f38e8e00 f38e8e00 f38f0eac 
Aug  4 06:46:53 localhost kernel:       c049e3ee ec2acb00 f39b0fcc 00000000 00000000 ec2acb00 00000000 f39b0fcc 
Aug  4 06:46:53 localhost kernel: Call Trace:
Aug  4 06:46:53 localhost kernel: [<f960703a>] ? drm_stub_open+0xa9/0x104 [drm]
Aug  4 06:46:53 localhost kernel: [<c049e3ee>] ? chrdev_open+0x12e/0x145
Aug  4 06:46:53 localhost kernel: [<c049a62f>] ? __dentry_open+0x15e/0x256
Aug  4 06:46:53 localhost kernel: [<c049a7b8>] ? nameidata_to_filp+0x24/0x38
Aug  4 06:46:53 localhost kernel: [<c049e2c0>] ? chrdev_open+0x0/0x145
Aug  4 06:46:53 localhost kernel: [<c04a4a71>] ? do_filp_open+0x32b/0x62b
Aug  4 06:46:53 localhost kernel: [<c04087b6>] ? native_sched_clock+0xb3/0xd3
Aug  4 06:46:53 localhost kernel: [<c0408633>] ? sched_clock+0x8/0xb
Aug  4 06:46:53 localhost kernel: [<c044942d>] ? lock_release_holdtime+0x1a/0x115
Aug  4 06:46:53 localhost kernel: [<c051bdd5>] ? _raw_spin_unlock+0x74/0x78
Aug  4 06:46:53 localhost kernel: [<c0682bc3>] ? _spin_unlock+0x22/0x25
Aug  4 06:46:53 localhost kernel: [<c049a3ee>] ? do_sys_open+0x47/0xbc
Aug  4 06:46:53 localhost kernel: [<c049a4af>] ? sys_open+0x23/0x2b
Aug  4 06:46:53 localhost kernel: [<c0403cbe>] ? syscall_call+0x7/0xb
Aug  4 06:46:53 localhost kernel: =======================
Aug  4 06:46:53 localhost kernel: Code: f3 b6 07 c7 8b 75 dc 31 d2 83 c6 28 89 f0 e8 b2 a6 07 c7 8b 55 dc 8b 82 30 01 00 00 85 c0 74 0f 8b 4d e4 3b 81 28 01 00 00 74 16 <0f> 0b eb fe 8b 7d e4 8b 55 dc 8b 87 28 01 00 00 89 82 30 01 00 
Aug  4 06:46:53 localhost kernel: EIP: [<f96074d5>] drm_open+0x440/0x48b [drm] SS:ESP 0068:f38f0e4c
Aug  4 06:46:53 localhost kernel: ---[ end trace f4c7a645fe5ae58a ]---
Aug  4 06:47:21 localhost gdm-simple-slave[2821]: DEBUG(+): GdmSignalHandler: handling signal 15
Comment 5 Tom London 2008-08-05 16:32:41 EDT
kernel-2.6.27-0.226.rc1.git5.fc10.i686 booted up twice in a row to runlevel 5 without issue.

* Tue Aug 05 2008 Dave Airlie <airlied@redhat.com> 
- more drm regressions squashed

Could this have fixed this?
Comment 6 Tom London 2008-08-06 09:23:52 EDT
Sigh, I was premature.

Still happens with 0.226:

http://www.kerneloops.org/submitresult.php?number=48600

Does boot up OK sometimes, though.....
Comment 7 Tom London 2008-08-07 10:09:30 EDT
Continuing a trend....

This still crashes with the "black screen of death" with kernel-2.6.27-0.237.rc2.fc10.i686....... Requires power cycle to recover.

Appears to happen about 1 in 2 or 3 boots.

http://www.kerneloops.org/submitresult.php?number=49006

[Should this be a blocker?]

kernel BUG at drivers/gpu/drm/drm_fops.c:144!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: i915 drm ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi coretemp nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables nf_conntrack_netbios_ns nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq fuse loop dm_multipath kvm_intel kvm thinkpad_acpi hwmon serio_raw i2c_i801 i2c_core sr_mod cdrom ata_piix ata_generic pata_acpi iTCO_wdt iTCO_vendor_support sdhci_pci sdhci mmc_core firewire_ohci firewire_core crc_itu_t yenta_socket rsrc_nonstatic arc4 ecb crypto_blkcipher iwl3945 rfkill mac80211 cfg80211 e1000e snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd soundcore bay video output battery ac ppdev parport_pc parport dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
Pid: 3295, comm: Xorg Not tainted (2.6.27-0.237.rc2.fc10.i686 #1)
Pid: 3295, comm: Xorg Not tainted (2.6.27-0.237.rc2.fc10.i686 #1) EIP: 0060:[<f960a4cd>] EFLAGS: 00213283 CPU: 1
EIP is at drm_open+0x440/0x48b [drm]
EAX: f3199a28 EBX: 00000000 ECX: f327838c EDX: f7da40c0
ESI: f7da40e8 EDI: f7da40c0 EBP: f3190e70 ESP: f3190e4c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process Xorg (pid: 3295, ti=f3190000 task=f67097e0 task.ti=f3190000)
Stack: f7da40c0 ebf95c00 f327838c 00000000 f7da40e8 f32193d4 ffffffed f96196f0 
      ebf95c00 f3190e88 f960a032 f327838c 00000000 f303ccb0 f303ccb0 f3190eac 
      c049e3e6 ebf95c00 f327838c 00000000 00000000 ebf95c00 00000000 f327838c 
Call Trace:
[<f960a032>] ? drm_stub_open+0xa9/0x104 [drm]
[<c049e3e6>] ? chrdev_open+0x12e/0x145
[<c049a627>] ? __dentry_open+0x15e/0x256
[<c049a7b0>] ? nameidata_to_filp+0x24/0x38
[<c049e2b8>] ? chrdev_open+0x0/0x145
[<c04a4a69>] ? do_filp_open+0x32b/0x62b
[<c04087b6>] ? native_sched_clock+0xb3/0xd3
[<c0408633>] ? sched_clock+0x8/0xb
[<c0449425>] ? lock_release_holdtime+0x1a/0x115
[<c051bdd5>] ? _raw_spin_unlock+0x74/0x78
[<c0682c4b>] ? _spin_unlock+0x22/0x25
[<c049a3e6>] ? do_sys_open+0x47/0xbc
[<c049a4a7>] ? sys_open+0x23/0x2b
[<c0403cbe>] ? syscall_call+0x7/0xb
======================
======================Code: 83 87 07 c7 8b 75 dc 31 d2 83 c6 28 89 f0 e8 3a 77 07 c7 8b 55 dc 8b 82 30 01 00 00 85 c0 74 0f 8b 4d e4 3b 81 28 01 00 00 74 16 <0f> 0b eb fe 8b 7d e4 8b 55 dc 8b 87 28 01 00 00 89 82 30 01 00 
EIP: [<f960a4cd>] drm_open+0x440/0x48b [drm] SS:ESP 0068:f3190e4c
---[ end trace b9500a62c179b94e ]---

Here are the last few lines of /var/log/Xorg.0.log when this crashed:
(II) intel(0): [drm] dma control initialized, using IRQ 16
(II) intel(0): RandR 1.2 enabled, ignore the following RandR disabled message.
(II) intel(0): Selecting standard 18 bit TMDS pixel format.
(II) intel(0): DPMS enabled
(II) intel(0): Set up textured video
(II) intel(0): Set up overlay video
(II) intel(0): direct rendering: Enabled
(--) RandR disabled
(II) Initializing built-in extension MIT-SHM
(II) Initializing built-in extension XInputExtension
(II) Initializing built-in extension XTEST
(II) Initializing built-in extension XKEYBOARD
(II) Initializing built-in extension XINERAMA
(II) Initializing built-in extension XFIXES
(II) Initializing built-in extension RENDER
(II) Initializing built-in extension RANDR
(II) Initializing built-in extension COMPOSITE
(II) Initializing built-in extension DAMAGE
(II) Initializing built-in extension XEVIE
(II) AIGLX: Screen 0 is not DRI2 capable
drmOpenDevice: node name is /dev/dri/card0

And here are the next lines from an Xorg.0.log from boot that worked:

455,1776d454
< drmOpenDevice: open result is 13, (OK)
< drmOpenByBusid: Searching for BusID pci:0000:00:02.0
< drmOpenDevice: node name is /dev/dri/card0
< drmOpenDevice: open result is 13, (OK)
< drmOpenByBusid: drmOpenMinor returns 13
< drmOpenByBusid: drmGetBusid reports pci:0000:00:02.0
< (II) AIGLX: enabled GLX_MESA_copy_sub_buffer
< (II) AIGLX: enabled GLX_SGI_swap_control and GLX_MESA_swap_control
< (II) AIGLX: Loaded and initialized /usr/lib/dri/i915_dri.so
< (II) GLX: Initialized DRI GL provider for screen 0
< (II) intel(0): Setting screen physical size to 246 x 185
< (II) config/hal: Adding input device Macintosh mouse button emulation
< (II) LoadModule: "evdev"
Comment 8 Ronald Warsow 2008-08-07 17:06:15 EDT
I don't see any traces in my logs, but it also happens to me that my X is dead, independantly if I switch to runlevel 5 (normal boot or startx from runlevel 3).
I can ssh into the box, but the box is really slow, I can count the echoed keystroke I keyed in.

So I set up a serial console, booted with "console=tty0 console=tty1 console=ttyS0,115200n8" and maybe it sounds stupid, but I was presented a graphical login screen as usual.

kernel is 2.6.27-0.237.rc2.fc10.x86_64
Comment 9 Ronald Warsow 2008-08-07 17:11:10 EDT
sorry I forgot: it's an  RS690 box
Comment 10 Tom London 2008-08-08 10:02:46 EDT
Still fails with kernel-2.6.27-0.241.rc2.git1.fc10.i686, but the line number is +1.

Same "black screen".


kernel BUG at drivers/gpu/drm/drm_fops.c:145!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: i915 drm ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi coretemp nf_conntrack_ipv4 ipt_REJECT iptable_filter ip_tables nf_conntrack_netbios_ns nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq fuse loop dm_multipath kvm_intel kvm thinkpad_acpi hwmon serio_raw i2c_i801 i2c_core sr_mod cdrom ata_piix ata_generic pata_acpi iTCO_wdt iTCO_vendor_support sdhci_pci sdhci mmc_core firewire_ohci firewire_core crc_itu_t yenta_socket rsrc_nonstatic hci_usb bluetooth arc4 ecb crypto_blkcipher iwl3945 rfkill mac80211 cfg80211 e1000e snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd soundcore bay video output battery ac ppdev parport_pc parport dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci ext3 jbd mbcache uhci_hcd ohci_hcd 
ehci_hcd
Pid: 3346, comm: Xorg Not tainted (2.6.27-0.241.rc2.git1.fc10.i686 #1)
Pid: 3346, comm: Xorg Not tainted (2.6.27-0.241.rc2.git1.fc10.i686 #1) EIP: 0060:[<f961b4d5>] EFLAGS: 00213206 CPU: 1
EIP is at drm_open+0x448/0x493 [drm]
EAX: f3011a28 EBX: f29340e8 ECX: f291869c EDX: f29340c0
ESI: 00000000 EDI: f29340e8 EBP: f28d4e70 ESP: f28d4e4c
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process Xorg (pid: 3346, ti=f28d4000 task=f301df80 task.ti=f28d4000)
Stack: f1a20700 f291869c f29340c0 f4b30430 00000000 f31c4b54 ffffffed f962a6f8 
      f1a20700 f28d4e88 f961b032 f291869c 00000000 f30f47e0 f30f47e0 f28d4eac 
      c049e812 f1a20700 f291869c 00000000 00000000 f1a20700 00000000 f291869c 
Call Trace:
[<f961b032>] ? drm_stub_open+0xa9/0x104 [drm]
[<c049e812>] ? chrdev_open+0x12e/0x145
[<c049aa53>] ? __dentry_open+0x15e/0x256
[<c049abdc>] ? nameidata_to_filp+0x24/0x38
[<c049e6e4>] ? chrdev_open+0x0/0x145
[<c04a4e95>] ? do_filp_open+0x32b/0x62b
[<c04087b6>] ? native_sched_clock+0xb3/0xd3
[<c0408633>] ? sched_clock+0x8/0xb
[<c04494a5>] ? lock_release_holdtime+0x1a/0x115
[<c051c205>] ? _raw_spin_unlock+0x74/0x78
[<c068307b>] ? _spin_unlock+0x22/0x25
[<c049a812>] ? do_sys_open+0x47/0xbc
[<c049a8d3>] ? sys_open+0x23/0x2b
[<c0403cbe>] ? syscall_call+0x7/0xb
======================
======================Code: 83 c3 28 89 d8 e8 6b 6b 06 c7 8b 45 e8 83 78 04 01 75 2e 8b 55 e4 8b 82 30 01 00 00 85 c0 74 0f 8b 4d e0 3b 81 28 01 00 00 74 16 <0f> 0b eb fe 8b 55 e0 8b 4d e4 8b 82 28 01 00 00 89 81 30 01 00 
EIP: [<f961b4d5>] drm_open+0x448/0x493 [drm] SS:ESP 0068:f28d4e4c
---[ end trace d23eb05a588c97fd ]---
Comment 11 Dave Airlie 2008-08-10 04:53:07 EDT
can you try upgrade libdrm/libdrm-devel to 2.4.0-0.17 when it lands?

http://kojipkgs.fedoraproject.org/packages/libdrm/2.4.0/0.17.fc10/

If this works or doesn't work let me know, I'm going to fix the kernel to not BUG_ON but I want to make sure I know what is happening.
Comment 12 Tom London 2008-08-10 14:19:04 EDT
OK. Installed and rebooted a couple of times; both time booted OK.

I noticed that I thought are new messages in /var/log/messages (sorry if I have that wrong).

Not sure this is relevant, but I don't remember seeing the multiple open attempts for /dev/dri/cardX (X>0).

Also, don't recall the logging of 'drmGetBusid:'.

I'll try rebooting lots of times and report back if I get any failures.

[root@localhost ~]# grep drm /var/log/Xorg.0.log
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: node name is /dev/dri/card0
drmOpenByBusid: Searching for BusID pci:0000:00:02.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card1
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card2
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card3
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card4
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card5
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card6
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card7
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card8
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card9
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card10
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card11
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card12
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card13
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card14
drmOpenByBusid: drmOpenMinor returns -1
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 12, (OK)
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 12, (OK)
drmGetBusid returned ''
(II) [drm] loaded kernel module for "i915" driver.
(II) [drm] DRM interface version 1.3
(II) [drm] DRM open master succeeded.
(II) intel(0): [drm] Using the DRM lock SAREA also for drawables.
(II) intel(0): [drm] framebuffer mapped by ddx driver
(II) intel(0): [drm] added 1 reserved context for kernel
(II) intel(0): [drm] installed DRM signal handler
(II) intel(0): [drm] Registers = 0x2dc20000
(II) intel(0): [drm] ring buffer = 0xd0000000
(II) intel(0): [drm] mapped front buffer at 0xd1000000, handle = 0xd1000000
(II) intel(0): [drm] mapped back buffer at 0xd5000000, handle = 0xd5000000
(II) intel(0): [drm] mapped depth buffer at 0xd6000000, handle = 0xd6000000
(II) intel(0): [drm] mapped classic textures at 0xd7000000, handle = 0xd7000000
(II) intel(0): [drm] Initialized kernel agp heap manager, 33554432
(II) intel(0): [drm] dma control initialized, using IRQ 16
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 13, (OK)
drmOpenByBusid: Searching for BusID pci:0000:00:02.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 13, (OK)
drmOpenByBusid: drmOpenMinor returns 13
drmOpenByBusid: drmGetBusid reports pci:0000:00:02.0
[root@localhost ~]#
Comment 13 Tom London 2008-08-10 17:40:06 EDT
I've rebooted the system 11 times after installing the new libdrm, and it succeeded each time.

I'll continue to stress this, but so far it appears to be working.
Comment 14 Tom London 2008-08-23 18:19:08 EDT
I haven't seen this since 8 August: kernel-2.6.27-0.241.rc2.git1.fc10.i686 and libdrm/libdrm-devel 2.4.0-0.17.

Close?  (Or is this tracking some upstream merge?)
Comment 15 Christopher D. Stover 2008-10-28 16:00:27 EDT
I'm glad to see the problem finally got fixed for you Tom. =)

Note You need to log in before you can comment on or make changes to this bug.