Bug 989373

Summary: suspend to RAM broken with 3.10 kernel update on T500
Product: [Fedora] Fedora
Reporter: Stefan Assmann <sassmann>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: high
Version: 19
CC: 04mvs89, alick9188, balay, chris, christof, gansalmon, itamar, jonathan, kernel-maint, klinux, kparal, lionghostshop, madhu.chinakonda, mikhail.zabaluev, nils.smeds, ondra.pelech, pereira.vitor.manuel, peterbloomfield, samuel-rhbugs, smconvey, zaitcev
Hardware: x86_64
OS: Linux
Fixed In Version: kernel-3.10.10-100.fc18
Doc Type: Bug Fix
Last Closed: 2013-09-01 22:58:05 UTC
Type: Bug
Attachments:
- dmesg from clean boot (flags: none)
- screenshot from console with tail -f /var/log/messages (flags: none)

Description Stefan Assmann 2013-07-29 07:36:14 UTC
Created attachment 779658 [details]
dmesg from clean boot

Description of problem:
With the update to kernel 3.10 suspend to RAM is broken on my Lenovo T500. Screen wakes up but system freezes immediately. No relevant messages are printed.

After going back to 3.9.9, suspend works again.

Version-Release number of selected component (if applicable):
kernel-3.10.3-300.fc19.x86_64

How reproducible:
always

Steps to Reproduce:
- close lid
or
- echo mem > /sys/power/state
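
To exercise the second step repeatedly without touching the lid, something like the following could be used. It is only a sketch: it assumes rtcwake from util-linux is installed, that the RTC alarm is able to wake the machine, and that it runs as root. The function name and the DRY_RUN switch are made up for illustration.

```shell
# Sketch: run COUNT suspend-to-RAM cycles, waking via the RTC alarm each time.
# Assumption: rtcwake (util-linux) is available and we are root.
# DRY_RUN=1 (hypothetical switch) only prints the commands instead of suspending.
suspend_cycles() {
    count=$1        # number of suspend/resume cycles to attempt
    wake_after=$2   # seconds until the RTC alarm wakes the machine
    i=1
    while [ "$i" -le "$count" ]; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "cycle $i: rtcwake -m mem -s $wake_after"
        else
            echo "cycle $i: suspending for $wake_after seconds"
            rtcwake -m mem -s "$wake_after" || return 1
            sleep 10   # let the system settle after resume
        fi
        i=$((i + 1))
    done
}

# Example (prints only): DRY_RUN=1 suspend_cycles 3 30
```

If the machine freezes on resume, the loop simply never reaches the next cycle, so the last "cycle N" line shows how far it got.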

Actual results:
screen wakes up but the system is frozen. No more input is accepted via keyboard, not even sysrq works.

Expected results:
working suspend to RAM.

Comment 1 Stefan Assmann 2013-07-29 07:42:52 UTC
Created attachment 779660 [details]
screenshot from console with tail -f /var/log/messages

Comment 2 Satish Balay 2013-07-30 15:53:17 UTC
perhaps a dup of bug 989163

Comment 3 Josh Boyer 2013-07-30 16:14:36 UTC
Please test the 3.10.4-300 kernel in koji to see if the fix for i915 graphics solves your issue.

Comment 4 Kamil Páral 2013-07-30 17:52:42 UTC
(In reply to Stefan Assmann from comment #0)
> With the update to kernel 3.10 suspend to RAM is broken on my Lenovo T500.
> Screen wakes up but system freezes immediately. No relevant messages are
> printed.

Confirmed. Same problem with my T500.

(In reply to Josh Boyer from comment #3)
> Please test the 3.10.4-300 kernel in koji to see if the fix for i915
> graphics solves your issue.

No it doesn't, same behavior.

I have built my own kernel-3.11.0-0.rc3.git0.1.fc19.x86_64 (recompiled from fc20 version) and resume works again. Please push kernel 3.11 into F19.

Comment 5 Josh Boyer 2013-07-30 18:02:40 UTC
(In reply to Kamil Páral from comment #4)
> (In reply to Stefan Assmann from comment #0)
> > With the update to kernel 3.10 suspend to RAM is broken on my Lenovo T500.
> > Screen wakes up but system freezes immediately. No relevant messages are
> > printed.
> 
> Confirmed. Same problem with my T500.
> 
> (In reply to Josh Boyer from comment #3)
> > Please test the 3.10.4-300 kernel in koji to see if the fix for i915
> > graphics solves your issue.
> 
> No it doesn't, same behavior.

OK.

> I have built my own kernel-3.11.0-0.rc3.git0.1.fc19.x86_64 (recompiled from
> fc20 version) and resume works again. Please push kernel 3.11 into F19.

Um, no.  It isn't even released yet, and won't be for at least another month.

What would be nice is if someone could reverse bisect the issue from 3.10 to 3.11-rc3 to see which commit there fixes it.  It should probably get into 3.10.y stable.

Comment 6 Josh Boyer 2013-07-31 16:04:28 UTC
So to be clear, is the machine failing to go into suspend at all (e.g. hanging when suspend is executed and the screen is still on), or is the problem when it's trying to resume from a suspend?

Comment 7 Kamil Páral 2013-07-31 17:19:32 UTC
In my case it's the same as in the description: the actual suspend works fine, but when it wakes up and a lock screen is displayed, it freezes immediately (no mouse, no keyboard, no VT switch, no sysrq).

I tried different kernels, 3.10.0-1.fc20 is broken, and 3.11.0-0.rc0.git2.1.fc20 works fine.

Comment 8 Josh Boyer 2013-08-01 00:07:46 UTC
Hm.  There are 6005 commits between 3.10.0 and 3.11-git2.  We established on IRC that it's a bit difficult to bisect between those for some people.  Maybe we can poke at which commit _broke_ 3.10.

http://koji.fedoraproject.org/koji/packageinfo?packageID=8 will have all the kernels we built.  Could someone start going through the 3.10.0-0.rc0-gitXX builds and seeing which is the first that breaks suspend?

Comment 9 Josh Boyer 2013-08-01 12:20:15 UTC
*** Bug 991019 has been marked as a duplicate of this bug. ***

Comment 10 Josh Boyer 2013-08-01 13:55:41 UTC
*** Bug 989163 has been marked as a duplicate of this bug. ***

Comment 11 Kamil Páral 2013-08-01 19:10:17 UTC
(In reply to Josh Boyer from comment #8)
> http://koji.fedoraproject.org/koji/packageinfo?packageID=8 will have all the
> kernels we built.  Could someone start going through the 3.10.0-0.rc0-gitXX
> builds and seeing which is the first that breaks suspend?

OK.

... an hour of constant reboots later...

kernel-3.10.0-0.rc0.git2.1.fc20.x86_64 -- resumes fine

from kernel-3.10.0-0.rc0.git3.1.fc20.x86_64 to kernel-3.10.0-0.rc0.git13.1.fc20.x86_64 -- resumes into a console (I see a text cursor) and freezes

kernel-3.10.0-0.rc0.git15.1.fc20.x86_64 -- resumes into a graphical session and freezes

So, it seems that 3.10.0-0.rc0.git3.1 introduced the resume bug, and then something got improved (but not fully fixed) in 3.10.0-0.rc0.git15.1. Those two changesets might be worth looking at.
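
The manual search above can also be done as a binary search over the ordered list of koji builds, which keeps the number of reboots down. A minimal sketch, assuming the builds are listed oldest first, that every build after the first bad one is also bad, and that the last build in the list is bad. first_bad, nth and is_good are hypothetical helper names, and is_good simply asks the tester.

```shell
# nth N ARGS... : print the N-th of ARGS (1-based).
nth() {
    n=$1
    shift "$n"   # drops N itself plus the first N-1 items
    printf '%s\n' "$1"
}

# is_good BUILD : hypothetical check; here it just asks the tester.
is_good() {
    printf 'Boot %s, suspend and resume. Did resume work? [y/n] ' "$1" >&2
    read -r answer
    [ "$answer" = "y" ]
}

# first_bad BUILD... : binary search for the first build that fails to resume.
first_bad() {
    lo=1
    hi=$#
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if is_good "$(nth "$mid" "$@")"; then
            lo=$((mid + 1))   # first bad build is later
        else
            hi=$mid           # this build or an earlier one is the first bad
        fi
    done
    nth "$lo" "$@"
}

# Example:
#   first_bad 3.10.0-0.rc0.git2.1 3.10.0-0.rc0.git3.1 \
#             3.10.0-0.rc0.git13.1 3.10.0-0.rc0.git15.1
```

This only narrows things down to a Fedora build; finding the exact commit would still need a git bisect between the two snapshots.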

Comment 12 Josh Boyer 2013-08-03 00:47:13 UTC
*** Bug 991619 has been marked as a duplicate of this bug. ***

Comment 13 Nils Smeds 2013-08-08 09:36:19 UTC
I have a Lenovo W500 and have seen the same problem with suspend to memory.
The kernel-3.10.5-201.fc19.x86_64 from http://koji.fedoraproject.org/koji/buildinfo?buildID=454939 seems to have fixed the suspend to memory issue for me.

Still, I have problems with suspend to disk with this kernel. However, this problem is quite different: in this case the machine fails to shut off at the end of the hibernation process. There is not enough information at this stage to file a real bug report on this, though.

Comment 14 Satish Balay 2013-08-08 13:44:03 UTC
I had trouble with kernel-PAE-3.10.5-201.fc19.i686 on a Thinkpad T400

With 3.10.3 the machine froze after the first suspend/resume. With 3.10.5, however, it worked for a few suspend/resume cycles, but then eventually [around the 6th-8th attempt] it froze at resume [after restoring the X display].

I did try 3.11.0.rc4 kernels briefly - During my first try [after a few suspend resumes] - the machine wouldn't suspend anymore [with fn-f4 or 'suspend' menu link -but would suspend with pm-suspend command]. Then I noticed messages indicating systemd crashed. With my next attempt on trying out 3.11.0.rc4 - the machine froze in suspend mode. I couldn't resume it back from suspend. [However 3.11.0.rc.x86_64 kernels are working well on a different Thinkpad T420s - so this is disconcerting]

The T400 is back on kernel-PAE-3.9

Comment 15 Peter Bloomfield 2013-08-11 02:15:10 UTC
kernel-3.10.5-201.fc19.x86_64 still fails to resume on my ThinkPad W500.  3.9.9-302.fc19.x86_64 is the last kernel that works for me.

Comment 16 Peter Bloomfield 2013-08-12 22:09:14 UTC
Ditto kernel-3.10.6-200.fc19.x86_64.

Comment 17 Nils Smeds 2013-08-14 17:38:27 UTC
Kernel 3.10.5-201.fc19.x86_64 works fine for me for suspend to RAM, suspend to disk, and wake-up from suspend (the few times I have tested it after upgrading). I did see some strange behaviour where the system wouldn't shut off after hibernation, but that appears to have gone away after some repeated hibernations, so the reason for that is unclear. I thus _think_ I have a stable and good system now.

What I wanted to add to the mix here is the question of whether the issue is BIOS related. I am on a fairly old BIOS.

$ sudo biosdecode 
[sudo] password for nsmeds: 
# biosdecode 2.12
VPD present.
	BIOS Build ID: 6FET64WW 
	Box Serial Number: XXXXXXXX
	Motherboard Serial Number: XXXXXXXXXXX
	Machine Type/Model: 4061AD4
[...]

This is a Lenovo W500.

Cf. http://download.lenovo.com/ibmdl/pub/pc/pccbbs/mobiles/6fuj46uc.txt

Comment 18 Nils Smeds 2013-08-15 08:52:47 UTC
Update: I do still see a problem going into suspend to disk. Writing that I had no problem obviously made it appear....
Still, I have not seen any problem suspending to RAM (tried numerous times now) and then waking up.

When suspending to disk and having problems I get an infinite loop of (what I think says):

mei_me: unexpected enumeration response hbm
mei_me: wrong host start response

The machine gets unresponsive and needs a hard power off. Then it boots up (with minor disk fixes) but into a fresh boot - not back into the old boot.

The lines fly by on the screen so it is hard to read them correctly, but I think this is what it says. Letters such as e and c are hard to tell apart at this speed.

/Nils

Comment 19 Josh Boyer 2013-08-15 14:41:40 UTC
mei_me issues are being tracked in bug 917081.  However, these two bugs may be related.

Does everyone impacted by the failure to resume in this bug have mei_me messages in dmesg after a fresh boot?

Comment 20 Nils Smeds 2013-08-15 14:51:48 UTC
Hi Josh,

Just to be clear. I did once or twice experience that /var/log/messages filled up as described in bug 917081. But not with this kernel 3.10.5-201.fc19.x86_64.

The message I quoted above is only on the console (after the Xserver has shut down) and appears on suspend to disk and never makes it into /var/log/messages or dmesg.

Just to be clear - not sure if it makes a difference on making the connection to the bug you mention.

/Nils

Comment 21 Peter Bloomfield 2013-08-16 04:30:05 UTC
(In reply to Josh Boyer from comment #19)
> mei_me issues are being tracked in bug 917081.  However, these two bugs may
> be related.
> 
> Does everyone impacted by the failure to resume in this bug have mei_me
> messages in dmesg after a fresh boot?

With kernel 3.10.6-200.fc19.x86_64, only these:

$ dmesg | grep -i mei
[    9.633226] mei_me 0000:00:03.0: setting latency timer to 64
[    9.633256] mei_me 0000:00:03.0: irq 47 for MSI/MSI-X

This is on a Lenovo W500, but with a different BIOS from Nils:

$ sudo biosdecode 
# biosdecode 2.12
VPD present.
	BIOS Build ID: 6FET87WW 
	Box Serial Number: L3AGK8Z
	Motherboard Serial Number: VF29395F10T
	Machine Type/Model: 4061B13

This laptop appears to suspend successfully, and on resume the gnome-shell session is visible, but with screen brightness at its lowest level and with keyboard, track pad, and power button all unresponsive.  Same is true for all 3.10 kernels that I've tried.

Comment 22 Peter Bloomfield 2013-08-16 04:43:54 UTC
This may be irrelevant: I have disabled the discrete graphics chip, and use only the integrated Intel controller.

Comment 23 fednuc 2013-08-16 10:04:08 UTC
I've seen the same symptoms as many of those above on a Dell E4300 laptop, which has the same generation of components as the T500 (Core 2 Duo P9600/GMA 4500MHD graphics).

All 3.10.* kernels up to 3.10.6-200.fc19.x86_64 fail to resume from suspend, showing the Gnome Shell lock screen when trying to resume, but no response to VT Ctrl-Alt-F* switching, mouse or keyboard, and no lock screen animation. Suspend/resume worked perfectly in all previous fc19 kernels up to/including 3.9.*.

I'm fairly sure that the clock on the lock screen also shows the time at suspend, but will have to check.

Comment 24 Josh Boyer 2013-08-16 13:38:03 UTC
*** Bug 995651 has been marked as a duplicate of this bug. ***

Comment 25 Josh Boyer 2013-08-16 13:38:28 UTC
*** Bug 993607 has been marked as a duplicate of this bug. ***

Comment 26 Chris Schumann 2013-08-16 15:39:40 UTC
Same issue on ThinkPad T400.
# biosdecode 2.12
VPD present.
	BIOS Build ID: 7UET91WW 
	Box Serial Number: R8FNXK2
	Motherboard Serial Number: VQ0UR98R1L5
	Machine Type/Model: 2764CTO

Broken on all 3.10.x up to at least 3.10.6-200.fc19.x86_64.
My bug (995651) was closed as a dup of this.

Comment 27 Waclaw Sierek 2013-08-19 10:01:21 UTC
For reference: T500 with 3.10.7-200.fc19.x86_64 still hangs on resume.
Seeing a reference to bug 917081, I ran:
# rmmod mei_me

After that I can suspend and resume successfully.

Comment 28 Ondra Pelech 2013-08-19 10:28:28 UTC
# rmmod mei_me

solves the problem for me

Comment 29 fednuc 2013-08-19 10:58:06 UTC
Also still see resume failure on 3.10.7-200.fc19.x86_64 (Dell E4300).

rmmod mei_me also WFM to work around this issue.

Comment 30 Kleber Rocha 2013-08-19 22:30:47 UTC
rmmod mei_me solves the problem for me too, Lenovo ThinkPad T400.

Comment 31 Samuel Sieb 2013-08-19 23:29:59 UTC
Here's an openSUSE bug for the same issue:
https://bugzilla.novell.com/show_bug.cgi?id=822927

Comment 32 Satish Balay 2013-08-22 02:28:58 UTC
I'm with kernel-PAE-3.10.7-200.fc19.i686 on the T400 with mei_me module blacklisted. It worked fine for the past couple of days - but now - it crashed.

It went into suspend mode fine, but then I couldn't restore it from suspend. It just ignored key presses [usually the Fn key would restore from suspend].

[per comment 14 - I had the same crash previously with 3.11.0.rc4]

I had to hard reboot the machine. The last entry in /var/log/messages is a successful suspend.

[root@nemo ~]# last reboot |head -2
reboot   system boot  3.10.7-200.fc19. Wed Aug 21 21:13 - 21:25  (00:12)    
reboot   system boot  3.10.7-200.fc19. Mon Aug 19 10:16 - 21:25 (2+11:08)   
[root@nemo ~]# cat /etc/modprobe.d/blacklist-mei_me.conf 
blacklist mei_me
[root@nemo ~]# lsmod |grep mei_me
[root@nemo ~]# 


/var/log/messages:

>>>>>>>>>>>>
Aug 21 21:03:07 nemo NetworkManager[420]: <info> (wlan0): cleaning up...
Aug 21 21:03:07 nemo NetworkManager[420]: <info> (wlan0): taking down device.
Aug 21 21:03:08 nemo systemd[1]: Starting Sleep.
Aug 21 21:03:08 nemo systemd[1]: Reached target Sleep.
Aug 21 21:03:08 nemo systemd[1]: Starting Suspend...
Aug 21 21:03:08 nemo systemd-sleep[19592]: Suspending system...
Aug 21 21:13:23 nemo rsyslogd: [origin software="rsyslogd" swVersion="7.2.6" x-pid="376" x-info="http://www.rsyslog.com"] start
Aug 21 21:13:23 nemo kernel: [    0.000000] Initializing cgroup subsys cpuset
Aug 21 21:13:23 nemo kernel: [    0.000000] Initializing cgroup subsys cpu
<<<<<<<<<<<<<<

Comment 33 Josh Boyer 2013-08-23 21:30:27 UTC
Here's a scratch build with all known mei patches backported.  Please test and let us know if your issue is resolved (without the mei modules blacklisted).

http://koji.fedoraproject.org/koji/taskinfo?taskID=5847415

Comment 34 Kleber Rocha 2013-08-24 13:20:33 UTC
On a Lenovo T400 with the mei module loaded, resume fails on kernel 3.10.9-200.fc19.x86_64. With mei_me blacklisted it works fine.

Comment 35 Satish Balay 2013-08-24 18:44:01 UTC
(In reply to Kleber Rocha from comment #34)
> On Lenovo T400 with mei module on, resume fail, on kernel
> 3.10.9-200.fc19.x86_64. With mei_me blocklisted works fine.

Looks like what you've tried is not the scratch build kernel.


I just tried 3.10.9-200.2.fc19.i686.PAE from the scratch build on the T400 [without blacklisting mei_me], and it went through on the order of 30 suspend/resume cycles without a crash.

However, at one point it prompted about 'critical battery hibernating' and immediately hibernated [even though the battery is fully charged]. I will keep monitoring whether that occurs again [and whether the machine is stable].

Comment 36 Vitor 2013-08-25 03:40:02 UTC
I'm on a Lenovo T500 and was facing this problem as well since updating the kernel to the 3.10.* series (F18). Suspend to RAM works fine with all the 3.9.* kernels though, as reported above.

The "rmmod mei_me" workaround, or blacklisting the module, solved the freezes after resuming from suspend-to-RAM.
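
For anyone applying the workaround, a minimal sketch of both variants follows. The file name matches the one shown in comment 32; the function name and the MODPROBE_DIR override are made up here so the snippet can be tried outside /etc. This only works around the hang, it does not fix the driver.

```shell
# Sketch: write the blacklist entry for mei_me.
# MODPROBE_DIR is a hypothetical override; by default the file goes to
# /etc/modprobe.d, which requires root.
blacklist_mei_me() {
    dir=${MODPROBE_DIR:-/etc/modprobe.d}
    printf 'blacklist mei_me\n' > "$dir/blacklist-mei_me.conf"
}

# As root, on an affected machine:
#   blacklist_mei_me       # keep the module from being auto-loaded at boot
#   rmmod mei_me           # unload it from the running kernel
#   lsmod | grep mei_me    # should print nothing afterwards
```

Once a kernel with the mei fixes is installed, the blacklist file can be removed again.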

Comment 37 Waclaw Sierek 2013-08-25 13:41:11 UTC
T500 3.10.9-200.2.fc19.x86_64 - works for me. mei_me loaded, but resumes without issues so far.

Comment 38 Chris Schumann 2013-08-26 02:45:39 UTC
T400 3.10.9-200.fc19.x86_64 still hangs upon resume unless I rmmod mei_me.

Comment 39 Josh Boyer 2013-08-26 13:37:00 UTC
OK, one more scratch build to test, this time with just the 4 patches submitted for the 3.10.y stable kernel.  Thanks for testing, we appreciate it:

http://koji.fedoraproject.org/koji/taskinfo?taskID=5854464

Comment 40 Kamil Páral 2013-08-26 15:56:28 UTC
(In reply to Josh Boyer from comment #39)

Works well for me with my T500. Resume functional.

Comment 41 Waclaw Sierek 2013-08-26 17:17:25 UTC
T500 3.10.9-200.3.fc19.x86_64 - works for me

Comment 42 Peter Bloomfield 2013-08-26 22:56:08 UTC
(In reply to Josh Boyer from comment #39)

ditto

Works for me with a W500

Comment 43 fednuc 2013-08-27 09:57:45 UTC
3.10.9-200.3.fc19.x86_64 WFM, Dell E4300

Comment 44 Josh Boyer 2013-08-28 18:35:13 UTC
I've added the patches to Fedora git.  Should be fixed in the next update.  Thank you for testing.

Comment 45 Fedora Update System 2013-08-29 21:34:01 UTC
kernel-3.10.10-200.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.10.10-200.fc19

Comment 46 Waclaw Sierek 2013-08-30 06:54:30 UTC
T500 3.10.10-200.fc19.x86_64 - works for me

Comment 47 Fedora Update System 2013-08-30 14:20:38 UTC
kernel-3.10.10-100.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.10.10-100.fc18

Comment 48 Fedora Update System 2013-08-30 22:54:51 UTC
Package kernel-3.10.10-200.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.10.10-200.fc19'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-15545/kernel-3.10.10-200.fc19
then log in and leave karma (feedback).

Comment 49 Fedora Update System 2013-09-01 22:58:05 UTC
kernel-3.10.10-200.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 50 Fedora Update System 2013-09-01 23:06:09 UTC
kernel-3.10.10-100.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 51 Josh Boyer 2013-09-02 15:14:34 UTC
*** Bug 1003594 has been marked as a duplicate of this bug. ***

Comment 52 Chris Schumann 2013-09-03 18:12:56 UTC
3.10.10-200.fc19.x86_64 works on my T400. Thank you very much folks!