Bug 629291 - last few F13 kernels fail to suspend on W510 Thinkpad w/nVidia graphics
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 13
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-09-01 14:16 UTC by Clark Williams
Modified: 2010-09-07 21:45 UTC (History)

Last Closed: 2010-09-07 00:27:43 UTC

Attachments
output of lspci on W510 Thinkpad (16.40 KB, text/plain)
2010-09-01 14:16 UTC, Clark Williams
output of lspci -vv on IBM Thinkpad T41p (10.32 KB, text/plain)
2010-09-01 20:04 UTC, Holger Arnold
Output of lspci -vv on HP EliteBook 6930p (32.39 KB, text/plain)
2010-09-02 10:00 UTC, Stephen J. Gowdy
dmesg output after pm_trace=1 (59.38 KB, text/plain)
2010-09-02 12:43 UTC, Stephen J. Gowdy

Description Clark Williams 2010-09-01 14:16:05 UTC
Created attachment 442420
output of lspci on W510 Thinkpad

Description of problem:

When attempting to suspend to RAM, the laptop hangs while suspending. The screen stays on the blue screen with the small Fedora laptop, and the small crescent-moon light on the Thinkpad (which indicates suspend state) blinks continually. There is no response from the keyboard, so the only option is to power cycle.

The last kernel that properly suspends/resumes on this laptop is:

     kernel-2.6.33.5-112.fc13.x86_64

Version-Release number of selected component (if applicable):

     kernel-2.6.34.6-47.fc13.x86_64

How reproducible:

always

Steps to Reproduce:
1. Boot into latest F13 kernel
2. close lid or initiate suspend through a power-manager application
3. system starts to suspend but never completes
  
Actual results:

system hang while suspending

Expected results:

transition to suspend state

Comment 1 Holger Arnold 2010-09-01 20:04:57 UTC
Created attachment 442487
output of lspci -vv on IBM Thinkpad T41p

The problem is not specific to nVidia; it also occurs on an IBM Thinkpad T41p with ATI Mobility FireGL T2 (RV350) graphics and when using an i686 kernel.

When suspending, the screen goes black and the sleep LED keeps blinking.  Suspend to disk is also affected: the system freezes before anything is written to disk.  The system cannot be rebooted using sysrq.

Suspend worked without problems using kernels up to 2.6.33.8-149.fc13.i686 and stopped working after the update to 2.6.34.6-47.fc13.i686.

Comment 2 Holger Arnold 2010-09-01 20:20:39 UTC
Why does this bug have low priority (which literally means "not very important")?  Should breaking suspend on a laptop not get at least average importance (medium priority)?

Comment 3 Clark Williams 2010-09-01 20:43:21 UTC
It was low only because I didn't actually set it when I created the bug. Changed to "high".

Comment 4 Chuck Ebbert 2010-09-01 23:48:54 UTC
Can you try debugging what causes it to hang?

  https://fedoraproject.org/wiki/Common_kernel_problems#Suspend.2FResume_failure

Comment 5 Stephen J. Gowdy 2010-09-02 09:55:21 UTC
Same is true for HP EliteBook 6930p with Radeon;

01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3400 Series

LCD screen blanks but all LEDs stay on. There is a sleep LED on this laptop. Last known good kernel was;

kernel-2.6.33.8-149.fc13.x86_64

first bad was;

kernel-2.6.34.6-47.fc13.x86_64

Using standard video driver;

[root@antonia ~]# lsmod|grep radeon
radeon                711359  3 
ttm                    54819  1 radeon
drm_kms_helper         24738  1 radeon
drm                   176953  5 radeon,ttm,drm_kms_helper
i2c_algo_bit            5045  1 radeon
i2c_core               25709  4 radeon,drm_kms_helper,drm,i2c_algo_bit

I'll try the pm_trace option after I post this.

I also turned on VMX in the BIOS so will also make sure that somehow didn't cause this too.

Comment 6 Stephen J. Gowdy 2010-09-02 10:00:11 UTC
Created attachment 442591
Output of lspci -vv on HP EliteBook 6930p

Comment 7 Stephen J. Gowdy 2010-09-02 12:43:31 UTC
Created attachment 442610
dmesg output after pm_trace=1

(In reply to comment #5)

> I'll try the pm_trace option after I post this.

So after the failed suspend and a reboot, dmesg didn't have the suggested information in it... I've attached the output in case I missed something.

Is there some command that needs to be run to produce this output?

Comment 8 Stephen J. Gowdy 2010-09-02 13:39:15 UTC
(In reply to comment #5)

> I also turned on VMX in the BIOS so will also make sure that somehow didn't
> cause this too.

Okay, turning this off again didn't make any difference.

Comment 9 Chuck Ebbert 2010-09-02 17:14:45 UTC
(In reply to comment #5)
> Same is true for HP EliteBook 6930p with Radeon;
> 

Just because you have the same symptoms doesn't mean it's the same bug. Radeon machines may possibly be fixed in 2.6.34.6-49.

Comment 10 Stephen J. Gowdy 2010-09-02 17:51:55 UTC
(In reply to comment #9)

> Just because you have the same symptoms doesn't mean it's the same bug.

Of course, but until there is evidence to the contrary it is a fair guess.

> Radeon machines may possibly be fixed in 2.6.34.6-49.

I don't see that kernel as available yet.

Comment 11 Clark Williams 2010-09-02 18:08:03 UTC
Booted 2.6.34.6-47.fc13.x86_64 on my laptop and did:

# echo 1 >/sys/power/pm_trace
# pm-suspend

System hung while suspending (as expected), power cycled and rebooted into the
same kernel. 

# grep "hash matches" /var/log/dmesg
 hash matches drivers/base/power/main.c:520

I cloned the fedora kernel package and this turns out to be the TRACE_DEVICE
macro invocation in device_resume(). Now I need to figure out how to turn on
the pm_dev_dbg prints to see if we got any further than the dpm_wait() call.

Any advice on that?
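
For anyone repeating the pm_trace steps above, the grep can be wrapped in a small helper. This is only a sketch: the function name pm_trace_matches is made up for illustration, and it assumes the previous boot's messages are saved in /var/log/dmesg, as on the system in this comment. It also reports the case where no "hash matches" line is present at all.

```shell
# Print the pm_trace "hash matches" lines from a saved boot log.
# $1: log file to search (defaults to /var/log/dmesg, as used above)
# pm_trace_matches is a hypothetical helper name, not an existing tool.
pm_trace_matches() {
    log="${1:-/var/log/dmesg}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep exits non-zero when nothing matches; say so explicitly.
    grep -F "hash matches" "$log" || {
        echo "no pm_trace match in $log" >&2
        return 1
    }
}
```

Typical use would be to run "echo 1 >/sys/power/pm_trace" and "pm-suspend" as root, power cycle after the hang, and then call pm_trace_matches with no argument on the next boot.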

Comment 12 hcgpalm 2010-09-02 19:13:59 UTC
Same problem on my IBM ThinkPad Z60m (2.6.34.6-47.fc13.i686.PAE #1 SMP).

# echo 1 >/sys/power/pm_trace
# pm-suspend
[Hangs with blinking moon, power cycle]
# grep -C2 "hash matches" /var/log/dmesg
No TPM chip found, activating TPM-bypass!
  Magic number: 10:966:947
tty tty4: hash matches
rtc_cmos 00:06: setting system clock to 2010-09-02 18:54:21 UTC (1283453661)
Initalizing network drop monitor service

Comment 13 Clark Williams 2010-09-02 19:22:46 UTC
I cloned the -tip tree (2.6.35-rc3-tip+), built it (using the latest fedora config) and booted it. Has the same behavior as the latest F13 kernel when you run pm-suspend (or suspend from xfce4-power-manager).

Comment 14 Holger Arnold 2010-09-02 21:58:49 UTC
On my system, suspending with pm_trace enabled produces no output of the form "hash matches ..." in /var/log/dmesg.  What else can be done to debug this problem?

Comment 15 hcgpalm 2010-09-03 01:04:33 UTC
FWIW;
# echo freezer > /sys/power/pm_test; echo mem > /sys/power/state
works, but
# echo devices > /sys/power/pm_test; echo mem > /sys/power/state
hangs.

It also hangs when trying to suspend from a minimum configuration (i.e. booting with init=/bin/bash).

Also, to clarify, I suppose the Magic Number stuff I posted above is bogus as the RTC still appears to have the correct time after the reboot as can be observed in the line after the "hash matches" stuff(?)
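
The two commands at the top of this comment use the kernel's pm_test facility, whose documented levels are freezer, devices, platform, processors and core. A rough sketch of running a single level is below. The function name pm_test_cycle and its optional directory argument are assumptions added here so the helper can be tried against a scratch directory; on a real machine it has to run as root on a kernel that exposes /sys/power/pm_test.

```shell
# Run one suspend test cycle at the given pm_test level.
# $1: level (freezer, devices, platform, processors or core)
# $2: sysfs power directory (defaults to /sys/power; root needed there)
# pm_test_cycle is a hypothetical helper name used for illustration.
pm_test_cycle() {
    level="$1"
    power_dir="${2:-/sys/power}"
    case "$level" in
        freezer|devices|platform|processors|core) ;;
        *) echo "unknown pm_test level: $level" >&2; return 2 ;;
    esac
    # Select the test level, then start a suspend-to-RAM attempt.
    echo "$level" > "$power_dir/pm_test" || return 1
    echo mem > "$power_dir/state" || return 1
    # Go back to normal suspend behaviour once the test returns.
    echo none > "$power_dir/pm_test" || return 1
    echo "$level: ok"
}
```

Stepping through the levels in the order listed narrows down the failing stage: here "freezer" returns and "devices" hangs, so the "ok" line would never be printed for "devices".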

Comment 16 Sebastian Vahl 2010-09-03 07:52:58 UTC
(In reply to comment #14)
> On my system, suspending with pm_trace enabled produces no output of the form
> "hash matches ..." in /var/log/dmesg.  What else can be done to debug this
> problem?

Having the same problem here. The system doesn't power off and no "hash matches" line appears in dmesg. Former 2.6.33 kernels suspend fine.
I'm using a R420 radeon card on F13-x86_64 on this system: http://www.smolts.org/client/show/pub_a3894d9e-1e2c-4506-94f6-679d3ee2bb7c

Comment 17 Chuck Ebbert 2010-09-03 09:33:06 UTC
(In reply to comment #10)
> 
> > Radeon machines may possibly be fixed in 2.6.34.6-49.
> 
> I don't see that kernel as available yet.

http://koji.fedoraproject.org/koji/buildinfo?buildID=193317

Comment 18 hcgpalm 2010-09-03 09:50:30 UTC
I can confirm that 2.6.34.6-49 indeed fixes the problem on the TP Z60m. Thanks!

Comment 19 Stephen J. Gowdy 2010-09-03 10:22:43 UTC
Yip!  2.6.34.6-49 does also fix it for the HP EliteBook 6930p. Thanks for the pointer!

Comment 20 Clark Williams 2010-09-03 15:14:58 UTC
As expected, 2.6.34.6-49 did *not* affect my problem, but I just noticed an update available for xorg-x11-drv-nouveau, so I'll try that next.

Comment 21 Clark Williams 2010-09-03 19:05:43 UTC
No joy with the new nouveau driver (makes sense, the indications I've seen show that the suspend failure happens before we actually get to suspending devices). 

I've run into a snag with my git bisect of the problem; I've got an encrypted root and one of the bisection point kernels fails to boot because it throws a traceback after reading my passphrase. I'm going to try and manually find a new bisection point.

Comment 22 Mathias Védrines 2010-09-04 11:08:44 UTC
2.6.34.6-49 kernel also fixed the problem for my Dell Studio 1737 (radeon). Thanks.

Comment 23 Chuck Ebbert 2010-09-04 16:42:27 UTC
(In reply to comment #21)
> No joy with the new nouveau driver (makes sense, the indications I've seen show
> that the suspend failure happens before we actually get to suspending devices). 
> 

Can you try kernel -52 from koji? And before trying to suspend/resume run this command as root:

  echo 0 > /sys/power/pm_async
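
A slightly more defensive form of the command above, as a sketch only: it checks that the pm_async knob exists and reports the previous value before writing 0. The function name disable_pm_async and the directory argument are invented for illustration; the default path is the one given in this comment.

```shell
# Disable asynchronous device suspend/resume and report the old value.
# $1: sysfs power directory (defaults to /sys/power; root needed there)
# disable_pm_async is a hypothetical helper name used for illustration.
disable_pm_async() {
    power_dir="${1:-/sys/power}"
    knob="$power_dir/pm_async"
    if [ ! -w "$knob" ]; then
        echo "pm_async not available at $knob" >&2
        return 1
    fi
    old=$(cat "$knob")
    echo 0 > "$knob" || return 1
    echo "pm_async: $old -> 0"
}
```

The setting does not persist across reboots, so it has to be applied again before each test on a freshly booted kernel.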

Comment 24 Jonathan Yu 2010-09-04 22:44:49 UTC
(In reply to comment #23)
> (In reply to comment #21)
> > No joy with the new nouveau driver (makes sense, the indications I've seen show
> > that the suspend failure happens before we actually get to suspending devices). 
> > 
> 
> Can you try kernel -52 from koji? And before trying to suspend/resume run this
> command as root:
> 
>   echo 0 > /sys/power/pm_async

I had the same problem where suspending would freeze on the blue screen on my T500.  I installed that kernel and ran this command, and suspend worked.  Thanks!

Comment 25 Clark Williams 2010-09-05 13:45:53 UTC
(In reply to comment #23)
> (In reply to comment #21)
> > No joy with the new nouveau driver (makes sense, the indications I've seen show
> > that the suspend failure happens before we actually get to suspending devices). 
> > 
> 
> Can you try kernel -52 from koji? And before trying to suspend/resume run this
> command as root:
> 
>   echo 0 > /sys/power/pm_async

Chuck, 

Booted -52 and tried the above with no luck. I then rebooted and turned on /sys/power/pm_trace and saw the same message in the dmesg file:

$ grep 'hash match' kernel-52.dmesg 
  hash matches drivers/base/power/main.c:520

Comment 26 Peter Bloomfield 2010-09-05 19:32:36 UTC
(In reply to comment #17)
> (In reply to comment #10)
> > 
> > > Radeon machines may possibly be fixed in 2.6.34.6-49.
> > 
> > I don't see that kernel as available yet.
> 
> http://koji.fedoraproject.org/koji/buildinfo?buildID=193317

I confirm that my ThinkPad T43 (ATI Technologies Inc M22 [Mobility Radeon X300]) suspends with -49.

After the first suspend/resume,  the NetworkManager icon showed only "networking disabled" (or some such), grayed out, although System:Administration:Services showed both NetworkManager and network as enabled and running.  I restarted NetworkManager,  and promptly got a wifi connection.  After subsequent suspend/resumes, wifi has come up promptly with no need to mess with NetworkManager.

Comment 27 Chuck Ebbert 2010-09-07 00:27:43 UTC
We are closing all of the reports of suspend / hibernate failure against 2.6.34.6-47 as fixed by 2.6.34.6-54.

If you are the original reporter and kernel -54 does not fix it, re-open the bug.

If you are *not* the original reporter and that kernel fails for you, open a new bug. We need to keep separate the reports of failure on different kinds of hardware.

Comment 28 Peter Bloomfield 2010-09-07 01:22:59 UTC
On my radeon h/w (ATI Technologies Inc M22 [Mobility Radeon X300]), -54 suspends successfully and, so far, I've seen no NetworkManager issues.

Comment 29 Clark Williams 2010-09-07 21:45:35 UTC
Looks like -54 did the trick. I have to unload the xhci_hcd driver to successfully suspend, but after adding SUSPEND_MODULES=xhci_hcd to a file in /etc/pm/config.d, I'm back to being able to suspend and resume. 

Many thanks!
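
For reference, the workaround described in this comment could be written out as below. This is a sketch under assumptions: the comment does not name the file, so "unload_modules" is a made-up file name, and write_suspend_modules is a made-up helper. The only parts taken from the comment are the SUSPEND_MODULES variable, the xhci_hcd module and the /etc/pm/config.d directory.

```shell
# Write a pm-utils config fragment listing modules to unload before suspend.
# $1: config directory (e.g. /etc/pm/config.d; root needed there)
# $2...: module names to unload
# The helper name and the "unload_modules" file name are hypothetical.
write_suspend_modules() {
    conf_dir="$1"
    shift
    [ "$#" -gt 0 ] || { echo "no modules given" >&2; return 2; }
    # "$*" joins the remaining arguments with single spaces.
    printf 'SUSPEND_MODULES="%s"\n' "$*" > "$conf_dir/unload_modules"
}
```

Calling write_suspend_modules /etc/pm/config.d xhci_hcd as root would produce a one-line file containing SUSPEND_MODULES="xhci_hcd".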

