Bug 185108

Summary: x11 can't start or resume an Intel 915GM/GMS/910GML after pm-hibernate
Product: [Fedora] Fedora Reporter: Paul Dickson <paul>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhide CC: jensk.maps, ncunning, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2007-05-14 23:02:37 UTC Type: ---
Attachments:
Description                                            Flags
/var/log/messages during pm-hibernate and resuming     none
JPEG image of dmesg with results from resuming         none
This patch fixes the problem                           none
This is the .config used in comment #10                none

Description Paul Dickson 2006-03-10 18:14:55 UTC
Description of problem:
After using pm-hibernate, X11 can't restart (or even start) on my Dell Inspiron
6000 notebook.  pm-suspend works though.

Version-Release number of selected component (if applicable):
pm-utils-0.13-1
kernel-2.6.15-1.2032_FC5
xorg-x11-server-Xorg-1.0.1-8

How reproducible:
Always

Steps to Reproduce:
1. Go to the VT1 console and log in as root.
2. init 3
3. pm-hibernate
4. resume system (everything appears normal at this point)
5. init 5  

Actual results:
X11 struggles to start GDM.  At one point it successfully displays the busy mouse
cursor.  But eventually the screen goes blank with the backlight turned on.

After this point nothing is displayed on the screen.  I can still switch to the
VT1 console and use it blindly.  I can even do pm-suspend and pm-hibernate, but
still without the screen (I get a display while resuming, but nothing afterwards).

Expected results:
GDM login screen to be displayed.

Additional info:
00:02.0 VGA compatible controller: Intel Corporation Mobile 915GM/GMS/910GML
Express Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 915GM/GMS/910GML Express
Graphics Controller (rev 03)

I had a similar problem with pm-suspend a few weeks ago so I gradually tested my
system with pm-suspend using the above test case.  When that worked, I tested it
from the console while still at init 5.  And after success with that, I tested
again from within X11.  With pm-suspend, all of these tests were successful.

On the other hand, pm-hibernate seemed to work while using the console.  But
after resuming, I could not get a display from X11, nor could I get the console
to display.

This might be just X11 not being able to completely initialize/reinitialize the
display after resuming.

Comment 1 Paul Dickson 2006-03-10 22:22:39 UTC
Created attachment 125965 [details]
/var/log/messages during pm-hibernate and resuming

Comment 2 Paul Dickson 2006-03-10 22:36:20 UTC
After rebooting from the above pm-hibernate, pm-suspend started acting the same
as pm-hibernate.  I have spent the last 4 hours attempting to characterize the
problem.

Xorg would start using 100% of the CPU.  /proc/interrupts would add an entry for
i915@pci:0000:00:02.0.  This entry does not appear when X works normally.

Attaching gdb to Xorg would give this backtrace:

#0  0x0052c552 in I830CheckModeSupport () from /usr/lib/xorg/modules/drivers/i810_drv.so
#1  0x003f8df2 in XAAGetPixmapIndex () from /usr/lib/xorg/modules/libxaa.so
#2  0x00c770fb in xf86ForceHWCursor () from /usr/lib/xorg/modules/libramdac.so
#3  0x080bfc1c in xf86InitFBManagerArea ()
#4  0x080cc014 in xf86XVScreenInit ()
#5  0x080b9a03 in xf86Wakeup ()
#6  0x0808c9c9 in WakeupHandler ()
#7  0x081a73d9 in WaitForSomething ()
#8  0x080887dd in Dispatch ()
#9  0x080701d7 in main ()

I played around by reverting to:

udev-084-11
xorg-x11-server-Xorg-1.0.1-7
xorg-x11-xkbdata-1.0.1-6

I was able to get pm-suspend to resume correctly ONCE.  But narrowing the rpm
set down to the problematic package proved impossible.  I could not reproduce
the success even with the same packages that had worked.

I believe the only working setup I had was with xorg-x11-server-Xorg-1.0.1-7 and
xorg-x11-xkbdata-1.0.1-6 reverted (using the current udev).

Current packages for the above are:
 udev                    i386       084-13           development       631 k
 xorg-x11-server-Xorg    i386       1.0.1-8          development       3.3 M
 xorg-x11-xkbdata        noarch     1.0.1-7          development       329 k


Comment 3 Paul Dickson 2006-03-11 21:19:48 UTC
It seems I wrote incorrectly: i915@pci:0000:00:02.0 does appear when X is
behaving correctly.

Comment 4 Paul Dickson 2006-04-01 16:18:03 UTC
kernel versions:
2064 Works completely for pm-suspend.

2102 Does not work.  Screen stays dark for about a minute, then the backlight
turns on with garbage across the top 50 pixels.

2106 Apparently restarts, but has a journal I/O error.  "ls -l" lists files, but
there's garbage for some of the stats for random files.  Other commands won't run
from the terminal window (eg df resulted in a bus error).

Comment 5 Paul Dickson 2006-04-01 17:10:56 UTC
pm-hibernate also works in 2064.

Comment 6 Phil Knirsch 2006-05-10 14:06:55 UTC
This looks much more like a kernel bug. Reassigning to correct component.

Read ya, Phil

Comment 7 Paul Dickson 2006-05-12 19:59:21 UTC
Still doesn't work in 2200 and possibly 2196 (I did not check the version before
first attempting a pm-suspend).  2064 still works.
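
The build results reported so far can be summarized with a small sketch that narrows down the regression window. It assumes a single regression point among the builds listed in comments 4 to 7 and leaves out 2196 because that result was uncertain. The function name is hypothetical.

```python
# Results per Fedora kernel build number, as reported in comments 4 to 7.
# Build 2196 is omitted because its result was reported as uncertain.
results = {2064: "good", 2102: "bad", 2106: "bad", 2200: "bad"}


def regression_window(results):
    """Return (last_good, first_bad), assuming a single regression point."""
    goods = [build for build, state in results.items() if state == "good"]
    bads = [build for build, state in results.items() if state == "bad"]
    last_good = max(goods)
    # Only bad builds newer than the last good one bound the window.
    first_bad = min(build for build in bads if build > last_good)
    return last_good, first_bad


print(regression_window(results))
```

Under those assumptions the regression was introduced somewhere after build 2064 and no later than build 2102.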

Comment 8 Paul Dickson 2006-05-27 22:01:44 UTC
I want to try a git bisect to see where the problem started.  Is the .config
used for 2064 the concatenation of the config-generic and config-i686 files?  I
got these from:

  http://cvs.fedora.redhat.com/viewcvs/rpms/kernel/devel/configs/

2064 falls between 2.6.16-rc6 and 2.6.16.  But the rpms I've used (still in my
yum development packages) place the upper bound at 2.6.16-git15 (pre 2.6.17-rc1).


Comment 9 Paul Dickson 2006-05-28 19:56:35 UTC
Created attachment 130160 [details]
JPEG image of dmesg with results from resuming

Comment 10 Paul Dickson 2006-05-28 20:00:08 UTC
It takes me 56 to 58 minutes to generate a kernel...

2.6.16 is good
2.6.17-rc1 is bad

first bisect: good 
  Bisecting: 2118 revisions left to test after this
  [a3ea9b584ed2acdeae817f0dc91a5880e0828a05] Merge
    master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6

second bisect: good
  Bisecting: 1045 revisions left to test after this
  [f4d1749e9570d3984800c371c6e06eb35b9718b1] powerpc: add hvc backend for rtas

third bisect: would not boot (unrelated to my problem) selected bad
  Bisecting: 528 revisions left to test after this
  [256414dee44eaa0983f5ab1a71877de23c4e9ce7] Merge branch 'upstream-linus' of 
    master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev

fourth bisect: root volume becomes RO on resume, unknown if this is related.
I initially thought it was, but the screen unblanked and I was able to peek around
a bit.  See the JPEG attached just previously for the problem announced in dmesg.
  Bisecting: 264 revisions left to test after this
  [50fc9999ec27ad66ce6db31ebb03759f77962bc1] Docs update: missing files and 
    descriptions for filesystems/00-INDEX

These are the results so far with 18 hours of work in the last 22 hours.

I'm going to attempt to continue with the assumption that the 4th bisect is bad...
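
As a rough check on the time budget, the sketch below estimates how many more builds the bisect needs from the "revisions left" counts above. It assumes each test roughly halves the remaining range and each build takes about 57 minutes (the midpoint of the 56 to 58 minutes mentioned above). The function names are hypothetical, and git's own step estimate may differ by one.

```python
def estimated_steps(revisions_left: int) -> int:
    """Rough number of further tests if each test halves the remaining range."""
    # bit_length() equals ceil(log2(n + 1)) for positive integers.
    return revisions_left.bit_length()


def estimated_hours(revisions_left: int, minutes_per_build: float = 57.0) -> float:
    """Rough build time in hours; 57 minutes per build is an assumption."""
    return estimated_steps(revisions_left) * minutes_per_build / 60.0


# "Revisions left" counts reported after each bisect step above.
for left in (2118, 1045, 528, 264):
    print(left, estimated_steps(left), round(estimated_hours(left), 1))
```

Skipped or non-booting revisions, like the third bisect, would add to this estimate.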

Comment 11 Paul Dickson 2006-05-28 20:22:07 UTC
Some additional clarification of #10:

The third bisect started having trouble while running the initrd.  I believe
compiling had problems with unresolved symbols in ext3 and reiserfs.  The
result was that the root FS was not found.

On the fourth bisect, my initial thought was that it was identical to my problem,
but after a few seconds the screen did unblank.  I did not actually attempt any
reads of the hard drive (what I used might have all resided in RAM already).  I
could not shut down without using the power switch though.  So I'm proceeding with
the assumption that this is the problem, just that it wasn't masked by a blank
screen.

Comment 12 Paul Dickson 2006-05-29 00:16:26 UTC
Created attachment 130163 [details]
This patch fixes the problem

Mark Lord sent this patch to LKML.  It fixes the problem for me.

There is still a BUG() message that is reported in comment #9.

Comment 13 Paul Dickson 2006-05-30 13:10:29 UTC
The above patch is in 2.6.17-rc5-git5.

kernel-2.6.16-1.2230_FC6 claims to contain 2.6.17-rc5-git5.

This kernel does not resume from a suspend.  It's as though the patch was
reverted.  Is the patch really in this version?

Comment 14 Paul Dickson 2006-06-02 10:16:13 UTC
Created attachment 130393 [details]
This is the .config used in comment #10

Comment 15 Jeremy Katz 2006-09-21 17:25:35 UTC
Is this better with newer kernels?   I know I've hibernated since then...

Comment 16 Nigel Cunningham 2007-05-14 23:02:37 UTC
No feedback on this for nearly 8 months, so insufficient data to be able to do
anything. Assuming the issue has been addressed by newer kernels. Please reopen
this bug if that's not the case.