Bug 488633

Summary: intel driver has locking errors (4GB memory, 32bit)
Product: [Fedora] Fedora
Reporter: darrell pfeifer <darrellpf>
Component: xorg-x11-drv-intel
Assignee: Kristian Høgsberg <krh>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 11
CC: ajax, ascii79, awilliam, mcepl, smcmackin, wtogami, xgl-maint
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-06-15 21:27:15 UTC
Attachments: xorg.log (flags: none)

Description darrell pfeifer 2009-03-04 23:54:03 UTC
Description of problem:

The newest intel driver produces locking errors that leave it non-functional. The graphical display starts up, but when GDM starts the result is a blank/black screen.


Version-Release number of selected component (if applicable):

2.6.29-0.196.rc6.git7.fc11 (PAE and nomodeset)

Last/currently working kernel is 2.6.29-0.159.rc6.git3.fc11.i686.PAE


Additional info:

Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)

The xorg.log doesn't show any errors.

/var/log/messages contains:

Mar  4 14:51:06 localhost kernel: [drm] Initialized drm 1.1.0 20060810
Mar  4 14:51:06 localhost kernel: pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar  4 14:51:06 localhost kernel: mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining
Mar  4 14:51:06 localhost kernel: [drm] MTRR allocation failed
Mar  4 14:51:06 localhost kernel: .  Graphics performance may suffer.
Mar  4 14:51:06 localhost kernel: [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

// snip

Mar  4 14:51:31 localhost kernel: mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: =======================================================
Mar  4 14:51:31 localhost kernel: [ INFO: possible circular locking dependency detected ]
Mar  4 14:51:31 localhost kernel: 2.6.29-0.196.rc6.git7.fc11.i686.PAE #1
Mar  4 14:51:31 localhost kernel: -------------------------------------------------------
Mar  4 14:51:31 localhost kernel: Xorg/2868 is trying to acquire lock:
Mar  4 14:51:31 localhost kernel: (&mm->mmap_sem){----}, at: [<c0499756>] might_fault+0x48/0x85
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: but task is already holding lock:
Mar  4 14:51:31 localhost kernel: (&dev->struct_mutex){--..}, at: [<f86b2d0e>] i915_gem_pwrite_ioctl+0x160/0x347 [i915]
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: which lock already depends on the new lock.
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: the existing dependency chain (in reverse order) is:
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: -> #1 (&dev->struct_mutex){--..}:
Mar  4 14:51:31 localhost kernel:       [<c0458cee>] __lock_acquire+0x96a/0xac8
Mar  4 14:51:31 localhost kernel:       [<c0458ea7>] lock_acquire+0x5b/0x81
Mar  4 14:51:31 localhost kernel:       [<c070081d>] __mutex_lock_common+0xdd/0x338
Mar  4 14:51:31 localhost kernel:       [<c0700b1f>] mutex_lock_nested+0x33/0x3b
Mar  4 14:51:31 localhost kernel:       [<f85bc6fd>] drm_gem_mmap+0x36/0xfe [drm]
Mar  4 14:51:31 localhost kernel:       [<c04a0a4e>] mmap_region+0x269/0x3fb
Mar  4 14:51:31 localhost kernel:       [<c04a0e35>] do_mmap_pgoff+0x255/0x2a5
Mar  4 14:51:31 localhost kernel:       [<c040c804>] sys_mmap2+0x5f/0x80
Mar  4 14:51:31 localhost kernel:       [<c0409676>] syscall_call+0x7/0xb
Mar  4 14:51:31 localhost kernel:       [<ffffffff>] 0xffffffff
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: -> #0 (&mm->mmap_sem){----}:
Mar  4 14:51:31 localhost kernel:       [<c0458bbb>] __lock_acquire+0x837/0xac8
Mar  4 14:51:31 localhost kernel:       [<c0458ea7>] lock_acquire+0x5b/0x81
Mar  4 14:51:31 localhost kernel:       [<c0499773>] might_fault+0x65/0x85
Mar  4 14:51:31 localhost kernel:       [<f86b2e0c>] i915_gem_pwrite_ioctl+0x25e/0x347 [i915]
Mar  4 14:51:31 localhost kernel:       [<f85bb854>] drm_ioctl+0x1b7/0x236 [drm]
Mar  4 14:51:31 localhost kernel:       [<c04be466>] vfs_ioctl+0x5c/0x76
Mar  4 14:51:31 localhost kernel:       [<c04bea1a>] do_vfs_ioctl+0x48b/0x4c9
Mar  4 14:51:31 localhost kernel:       [<c04bea9e>] sys_ioctl+0x46/0x66
Mar  4 14:51:31 localhost kernel:       [<c0409676>] syscall_call+0x7/0xb
Mar  4 14:51:31 localhost kernel:       [<ffffffff>] 0xffffffff
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: other info that might help us debug this:
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: 1 lock held by Xorg/2868:
Mar  4 14:51:31 localhost kernel: #0:  (&dev->struct_mutex){--..}, at: [<f86b2d0e>] i915_gem_pwrite_ioctl+0x160/0x347 [i915]
Mar  4 14:51:31 localhost kernel:
Mar  4 14:51:31 localhost kernel: stack backtrace:
Mar  4 14:51:31 localhost kernel: Pid: 2868, comm: Xorg Not tainted 2.6.29-0.196.rc6.git7.fc11.i686.PAE #1
Mar  4 14:51:31 localhost kernel: Call Trace:
Mar  4 14:51:31 localhost kernel: [<c06ff692>] ? printk+0x14/0x1a
Mar  4 14:51:31 localhost kernel: [<c045816f>] print_circular_bug_tail+0x5d/0x68
Mar  4 14:51:31 localhost kernel: [<c0458bbb>] __lock_acquire+0x837/0xac8
Mar  4 14:51:31 localhost kernel: [<c0458ea7>] lock_acquire+0x5b/0x81
Mar  4 14:51:31 localhost kernel: [<c0499756>] ? might_fault+0x48/0x85
Mar  4 14:51:31 localhost kernel: [<c0499773>] might_fault+0x65/0x85
Mar  4 14:51:31 localhost kernel: [<c0499756>] ? might_fault+0x48/0x85
Mar  4 14:51:31 localhost kernel: [<f86b2e0c>] i915_gem_pwrite_ioctl+0x25e/0x347 [i915]
Mar  4 14:51:31 localhost kernel: [<f85bb854>] drm_ioctl+0x1b7/0x236 [drm]
Mar  4 14:51:31 localhost kernel: [<f85bb854>] ? drm_ioctl+0x1b7/0x236 [drm]
Mar  4 14:51:31 localhost kernel: [<f86b2bae>] ? i915_gem_pwrite_ioctl+0x0/0x347 [i915]
Mar  4 14:51:31 localhost kernel: [<c04be466>] vfs_ioctl+0x5c/0x76
Mar  4 14:51:31 localhost kernel: [<c04bea1a>] do_vfs_ioctl+0x48b/0x4c9
Mar  4 14:51:31 localhost kernel: [<c04b3a13>] ? fsnotify_modify+0x54/0x5f
Mar  4 14:51:31 localhost kernel: [<c0475e98>] ? audit_syscall_entry+0x16b/0x191
Mar  4 14:51:31 localhost kernel: [<c04bea9e>] sys_ioctl+0x46/0x66
Mar  4 14:51:31 localhost kernel: [<c04bea9e>] ? sys_ioctl+0x46/0x66
Mar  4 14:51:31 localhost kernel: [<c0409676>] syscall_call+0x7/0xb
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmSignalHandler: handling signal 10
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmSignalHandler: Found 2 callbacks
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x8053b60
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x804d020
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): Got callback for signal 10
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): Got USR1 signal
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmSignalHandler: Done handling signals
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmServer: Got USR1 from X server - emitting READY
Mar  4 14:51:34 localhost gdm-simple-slave[2864]: DEBUG(+): GdmSlave: Server is ready - opening display :0
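
Note on reading the report above: it records two acquisition orders. Chain entry #1 shows dev->struct_mutex being taken in drm_gem_mmap on the mmap path, and the report states that this lock "already depends on" mm->mmap_sem. Entry #0 shows i915_gem_pwrite_ioctl holding dev->struct_mutex while might_fault annotates a possible mm->mmap_sem acquisition. The following is a minimal Python sketch of that kind of lock-order check. It is only an illustration of the idea, not how the kernel's lock validator is implemented, and the class and method names are hypothetical.

```python
from collections import defaultdict


class LockOrderGraph:
    """Records 'B was taken while A was held' edges and flags inversions."""

    def __init__(self):
        self.edges = defaultdict(set)  # held lock -> locks acquired under it

    def _reachable(self, start, target):
        # Depth-first search along recorded ordering edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, wanted):
        """Record the new ordering; return True if it closes a cycle."""
        closes_cycle = self._reachable(wanted, held)
        self.edges[held].add(wanted)
        return closes_cycle


graph = LockOrderGraph()
# Chain entry #1: mmap path, struct_mutex taken under mmap_sem.
mmap_path = graph.acquire("mm->mmap_sem", "dev->struct_mutex")
# Chain entry #0: pwrite ioctl, mmap_sem possibly taken under struct_mutex.
pwrite_path = graph.acquire("dev->struct_mutex", "mm->mmap_sem")
print(mmap_path, pwrite_path)
```

The second call closes the cycle, which corresponds to the "possible circular locking dependency" wording in the log. A flagged ordering is a potential deadlock, not proof that one occurred.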

Comment 1 Matěj Cepl 2009-03-05 09:22:06 UTC
Thanks for the bug report.  We have reviewed the information you have provided above, and there is some additional information we require that will be helpful in our diagnosis of this issue.

Please attach your X server config file (/etc/X11/xorg.conf, if available) and X server log file (/var/log/Xorg.*.log) to the bug report as individual uncompressed file attachments using the bugzilla file attachment link below.

Could you please also try to run without any /etc/X11/xorg.conf (if you have one) whatsoever and let X11 autodetect your display and video card? Attach to this bug /var/log/Xorg.0.log from this attempt as well, please.

We will review this issue again once you've had a chance to attach this information.

Thanks in advance.

Comment 2 Kristian Høgsberg 2009-03-05 14:30:32 UTC
Please try this with the latest kernel. When VT-switching back to an X server, we now use an API that would deadlock in older kernels. I've added a requires to the intel driver so this shouldn't happen again.

Comment 3 darrell pfeifer 2009-03-05 16:35:56 UTC
Is there another way to do the requires? The 207 kernel is installed but won't run until intel-2.6.0-13 is installed, and it looks like the driver can't be installed unless that kernel is running. I can do the install at runlevel 3, but I'm not sure that will work for most users.

uname -r
2.6.29-0.159.rc6.git3.fc11.i686.PAE

yum list kern\*
Loaded plugins: dellsysidplugin2, refresh-packagekit
Installed Packages
kernel-PAE.i686                     2.6.29-0.159.rc6.git3.fc11               installed
kernel-PAE.i686                     2.6.29-0.203.rc7.fc11                    installed
kernel-PAE.i686                     2.6.29-0.207.rc7.fc11                    installed
kernel-PAE-devel.i686               2.6.29-0.159.rc6.git3.fc11               installed
kernel-PAE-devel.i686               2.6.29-0.203.rc7.fc11                    installed
kernel-PAE-devel.i686               2.6.29-0.207.rc7.fc11                    installed
kernel-firmware.noarch              2.6.29-0.207.rc7.fc11                    installed
kernel-headers.i586                 2.6.29-0.207.rc7.fc11                    installed
kerneloops.i586                     0.12-3.fc11                              installed

rpm -Uvh xorg-x11-drv-intel-2.6.0-13.fc11.i586.rpm
error: Failed dependencies:
	kernel < 2.6.29-0.207.rc7.fc11 conflicts with xorg-x11-drv-intel-2.6.0-13.fc11.i586
	kernel-PAE < 2.6.29-0.207.rc7.fc11 conflicts with xorg-x11-drv-intel-2.6.0-13.fc11.i586
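
A note on the error above: RPM evaluates a Conflicts entry against every installed package, not against the running kernel, so the older kernel-PAE builds in the yum listing (159 and 203) would match the "< 2.6.29-0.207.rc7.fc11" condition even after booting 207. The sketch below illustrates this with a simplified version comparison in Python. It is an approximation written for this example (it ignores epochs and special characters such as "~"), not RPM's actual implementation, and the function names are hypothetical.

```python
import re


def segments(text):
    # Split into maximal runs of digits or letters; separators are dropped.
    return re.findall(r"\d+|[A-Za-z]+", text)


def compare_part(a, b):
    """Simplified segment-wise comparison: returns -1, 0, or 1."""
    sa, sb = segments(a), segments(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            if int(x) != int(y):
                return -1 if int(x) < int(y) else 1
        elif x.isdigit() != y.isdigit():
            # A numeric segment is treated as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    # All shared segments equal: the string with more segments is newer.
    return (len(sa) > len(sb)) - (len(sa) < len(sb))


def compare_version_release(a, b):
    # Compare the version first, then the release after the first hyphen.
    ver_a, rel_a = a.split("-", 1)
    ver_b, rel_b = b.split("-", 1)
    return compare_part(ver_a, ver_b) or compare_part(rel_a, rel_b)


# kernel-PAE builds taken from the yum listing above.
installed_kernel_pae = [
    "2.6.29-0.159.rc6.git3.fc11",
    "2.6.29-0.203.rc7.fc11",
    "2.6.29-0.207.rc7.fc11",
]
threshold = "2.6.29-0.207.rc7.fc11"

conflicting = [
    v for v in installed_kernel_pae
    if compare_version_release(v, threshold) < 0
]
print(conflicting)
```

Under these assumptions the two older builds are the ones that trigger the conflict, which would be consistent with the install failing regardless of which kernel is booted.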

Comment 4 darrell pfeifer 2009-03-05 17:04:36 UTC
Booted at runlevel 3 with the 207 kernel. The rpm install of the intel driver still fails with the same message.

Comment 5 darrell pfeifer 2009-03-05 23:31:22 UTC
Created attachment 334235
xorg.log

Comment 6 darrell pfeifer 2009-03-05 23:34:03 UTC
Booted with kernel 207 and xorg-x11-drv-intel-2.6.0-14.fc11. Still get a black screen when gdm runs. Can't switch to a vt.

Mar  5 15:13:31 localhost kernel: [drm] Initialized drm 1.1.0 20060810
Mar  5 15:13:31 localhost kernel: pci 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar  5 15:13:31 localhost kernel: mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining
Mar  5 15:13:31 localhost kernel: [drm] MTRR allocation failed
Mar  5 15:13:31 localhost kernel: .  Graphics performance may suffer.
Mar  5 15:13:31 localhost kernel: [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
--------------------------------------------

Mar  5 15:13:55 localhost kernel: mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: =======================================================
Mar  5 15:13:55 localhost kernel: [ INFO: possible circular locking dependency detected ]
Mar  5 15:13:55 localhost kernel: 2.6.29-0.207.rc7.fc11.i686.PAE #1
Mar  5 15:13:55 localhost kernel: -------------------------------------------------------
Mar  5 15:13:55 localhost kernel: Xorg/2871 is trying to acquire lock:
Mar  5 15:13:55 localhost kernel: (&mm->mmap_sem){----}, at: [<c04997fe>] might_fault+0x48/0x85
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: but task is already holding lock:
Mar  5 15:13:55 localhost kernel: (&dev->struct_mutex){--..}, at: [<f86b2d0f>] i915_gem_pwrite_ioctl+0x160/0x347 [i915]
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: which lock already depends on the new lock.
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: the existing dependency chain (in reverse order) is:
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: -> #1 (&dev->struct_mutex){--..}:
Mar  5 15:13:55 localhost kernel:       [<c0458db0>] __lock_acquire+0x96a/0xac8
Mar  5 15:13:55 localhost kernel:       [<c0458f69>] lock_acquire+0x5b/0x81
Mar  5 15:13:55 localhost kernel:       [<c07008a4>] __mutex_lock_common+0xdd/0x338
Mar  5 15:13:55 localhost kernel:       [<c0700ba6>] mutex_lock_nested+0x33/0x3b
Mar  5 15:13:55 localhost kernel:       [<f85bc6c5>] drm_gem_mmap+0x36/0xfe [drm]
Mar  5 15:13:55 localhost kernel:       [<c04a0af6>] mmap_region+0x269/0x3fb
Mar  5 15:13:55 localhost kernel:       [<c04a0edd>] do_mmap_pgoff+0x255/0x2a5
Mar  5 15:13:55 localhost kernel:       [<c040c804>] sys_mmap2+0x5f/0x80
Mar  5 15:13:55 localhost kernel:       [<c0409676>] syscall_call+0x7/0xb
Mar  5 15:13:55 localhost kernel:       [<ffffffff>] 0xffffffff
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: -> #0 (&mm->mmap_sem){----}:
Mar  5 15:13:55 localhost kernel:       [<c0458c7d>] __lock_acquire+0x837/0xac8
Mar  5 15:13:55 localhost kernel:       [<c0458f69>] lock_acquire+0x5b/0x81
Mar  5 15:13:55 localhost kernel:       [<c049981b>] might_fault+0x65/0x85
Mar  5 15:13:55 localhost kernel:       [<f86b2e0d>] i915_gem_pwrite_ioctl+0x25e/0x347 [i915]
Mar  5 15:13:55 localhost kernel:       [<f85bb81c>] drm_ioctl+0x1b7/0x236 [drm]
Mar  5 15:13:55 localhost kernel:       [<c04be50e>] vfs_ioctl+0x5c/0x76
Mar  5 15:13:55 localhost kernel:       [<c04beac2>] do_vfs_ioctl+0x48b/0x4c9
Mar  5 15:13:55 localhost kernel:       [<c04beb46>] sys_ioctl+0x46/0x66
Mar  5 15:13:55 localhost kernel:       [<c0409676>] syscall_call+0x7/0xb
Mar  5 15:13:55 localhost kernel:       [<ffffffff>] 0xffffffff
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: other info that might help us debug this:
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: 1 lock held by Xorg/2871:
Mar  5 15:13:55 localhost kernel: #0:  (&dev->struct_mutex){--..}, at: [<f86b2d0f>] i915_gem_pwrite_ioctl+0x160/0x347 [i915]
Mar  5 15:13:55 localhost kernel:
Mar  5 15:13:55 localhost kernel: stack backtrace:
Mar  5 15:13:55 localhost kernel: Pid: 2871, comm: Xorg Not tainted 2.6.29-0.207.rc7.fc11.i686.PAE #1
Mar  5 15:13:55 localhost kernel: Call Trace:
Mar  5 15:13:55 localhost kernel: [<c06ff719>] ? printk+0x14/0x1b
Mar  5 15:13:55 localhost kernel: [<c0458231>] print_circular_bug_tail+0x5d/0x68
Mar  5 15:13:55 localhost kernel: [<c0458c7d>] __lock_acquire+0x837/0xac8
Mar  5 15:13:55 localhost kernel: [<c0458f69>] lock_acquire+0x5b/0x81
Mar  5 15:13:55 localhost kernel: [<c04997fe>] ? might_fault+0x48/0x85
Mar  5 15:13:55 localhost kernel: [<c049981b>] might_fault+0x65/0x85
Mar  5 15:13:55 localhost kernel: [<c04997fe>] ? might_fault+0x48/0x85
Mar  5 15:13:55 localhost kernel: [<f86b2e0d>] i915_gem_pwrite_ioctl+0x25e/0x347 [i915]
Mar  5 15:13:55 localhost kernel: [<f85bb81c>] drm_ioctl+0x1b7/0x236 [drm]
Mar  5 15:13:55 localhost kernel: [<f85bb81c>] ? drm_ioctl+0x1b7/0x236 [drm]
Mar  5 15:13:55 localhost kernel: [<f86b2baf>] ? i915_gem_pwrite_ioctl+0x0/0x347 [i915]
Mar  5 15:13:55 localhost kernel: [<c04be50e>] vfs_ioctl+0x5c/0x76
Mar  5 15:13:55 localhost kernel: [<c04beac2>] do_vfs_ioctl+0x48b/0x4c9
Mar  5 15:13:55 localhost kernel: [<c04b3abf>] ? fsnotify_modify+0x54/0x5f
Mar  5 15:13:55 localhost kernel: [<c0475f3a>] ? audit_syscall_entry+0x16b/0x191
Mar  5 15:13:55 localhost kernel: [<c04beb46>] sys_ioctl+0x46/0x66
Mar  5 15:13:55 localhost kernel: [<c04beb46>] ? sys_ioctl+0x46/0x66
Mar  5 15:13:55 localhost kernel: [<c0409676>] syscall_call+0x7/0xb
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmSignalHandler: handling signal 10
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmSignalHandler: Found 2 callbacks
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x8053b60
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmSignalHandler: running 10 handler: 0x804d020
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): Got callback for signal 10
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): Got USR1 signal
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmSignalHandler: Done handling signals
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmServer: Got USR1 from X server - emitting READY
Mar  5 15:13:58 localhost gdm-simple-slave[2870]: DEBUG(+): GdmSlave: Server is ready - opening display :0
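
For comparing reports like the two pasted in this bug, it can help to reduce each one to the lock being requested and the lock already held. The following Python sketch does that for the line format seen above. The regular expression is derived only from the lines in this report and may not cover other kernel versions; the function name is hypothetical.

```python
import re

# Matches e.g. "(&mm->mmap_sem){----}, at: [<c04997fe>] might_fault+0x48/0x85"
LOCK_LINE = re.compile(r"\((&[^)]+)\)\{[^}]*\}, at: \[<[0-9a-f]+>\] (\w+)\+0x")


def summarize_lockdep(lines):
    """Return {'wanted': (lock, function), 'held': (lock, function)}."""
    summary = {}
    expecting = None  # which role the next lock line fills
    for line in lines:
        if "is trying to acquire lock:" in line:
            expecting = "wanted"
        elif "is already holding lock:" in line:
            expecting = "held"
        elif expecting:
            match = LOCK_LINE.search(line)
            if match:
                summary[expecting] = match.groups()
                expecting = None
    return summary


# Lines copied from the report above.
sample = [
    "Mar  5 15:13:55 localhost kernel: Xorg/2871 is trying to acquire lock:",
    "Mar  5 15:13:55 localhost kernel: (&mm->mmap_sem){----}, at: [<c04997fe>] might_fault+0x48/0x85",
    "Mar  5 15:13:55 localhost kernel:",
    "Mar  5 15:13:55 localhost kernel: but task is already holding lock:",
    "Mar  5 15:13:55 localhost kernel: (&dev->struct_mutex){--..}, at: [<f86b2d0f>] i915_gem_pwrite_ioctl+0x160/0x347 [i915]",
]
result = summarize_lockdep(sample)
print(result)
```

Because addresses are discarded, the same summary comes out of the 196 and 207 kernel reports if their lock and function names match.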

Comment 7 darrell pfeifer 2009-04-01 05:26:50 UTC
My machine has 4 GB of memory installed.

Booting with mem=3G is a workaround that allowed me to run a newer kernel such as 2.6.29-21.fc11.i686.PAE.

Comment 8 darrell pfeifer 2009-04-01 06:53:33 UTC
With mem=3G the locking error still appears, but the system functions; at 4G it just hung at gdm.

Comment 9 darrell pfeifer 2009-04-07 16:52:27 UTC
Installed 2.6.99.902-2.fc11. It doesn't solve the 4 GB problem. mem=3G continues to work.

Comment 10 Adam Williamson 2009-04-16 23:40:02 UTC
That same set of locking errors has been mentioned in several bug reports with quite a lot of different symptoms; I'm not sure it's actually what's *causing* any of the problems, exactly. Just for information.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 11 Warren Togami 2009-04-17 01:30:43 UTC
Why is this set to MODIFIED? It doesn't seem to be fixed.

Comment 12 Adam Williamson 2009-04-17 16:30:14 UTC
Kristian set it to MODIFIED when he posted comment #2. I'll set it back to ASSIGNED, since people have obviously continued to see these errors since then.


Comment 13 Shannon McMackin 2009-05-20 01:44:05 UTC
With the latest PAE kernels, I can add mem=4096M as a kernel boot parameter and the machine will boot and function, but it only sees 3 GB of RAM. If I boot the PAE kernel to runlevel 3 without adding the parameter, I can see all 4 GB.

Comment 14 Bug Zapper 2009-06-09 11:52:55 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Adam Williamson 2009-06-15 21:27:15 UTC

*** This bug has been marked as a duplicate of bug 481687 ***