Bug 650525

Summary: Failed to resume after suspend (worked in F12)
Product: [Fedora] Fedora
Reporter: Venca <vabibiz>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 14
CC: dougsland, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mschmidt
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2012-08-16 18:43:09 UTC
Attachments (description: flags):
 - Dmesg output: none
 - Lsmod output: none
 - Lspci output: none
 - early TRACE_RESUME points: none
 - Dmesg output grabbed after passed "core" pm_test (also with pm_trace enabled).: none
 - Dmesg output grabbed after failed resume with pm_trace=1: none

Description Venca 2010-11-06 21:30:32 UTC
Created attachment 458377 [details]
Dmesg output

Description of problem:

The system suspends successfully and the suspend LED indicates the suspended state. After a resume event (pressing the power button) the laptop does not resume: the CAPS/NUM lock LEDs do not respond and the display does not come up.
The laptop is not accessible through the network either. It must be powered off and on to reboot the system.

Version-Release number of selected component (if applicable):

Linux nocouz 2.6.35.6-48.fc14.x86_64 #1 SMP Fri Oct 22 15:36:08 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux


How reproducible:

Easy. Install 64bit Fedora 14 from Live CD.

Steps to Reproduce:
1. Go to the main menu and choose Shutdown->Suspend.
2. Wait for the system to suspend.
3. Press the resume button/key.
  
Actual results:
System does not resume. No response. Has to be turned off/on to reboot it.

Expected results:
System should resume and it should be possible to work on resumed laptop.

Additional info:

This bug happens on three of my computers where I installed 64-bit Fedora 14 from the live CD. The same problem happened when I installed 64-bit Fedora 13, also from the live CD.

With 64-bit Fedora 12, suspend worked very well for about one year on all three computers (I did regular updates and no problem occurred).

I explored this problem extensively, following recommendations I found on the web. Even when I suspend from single-user mode, the system does not resume.

I do not use any proprietary drivers. I have NVIDIA and ATI hardware on the computers. 

For some hardware info about my Megabook S271 laptop, see the attached text files.

Comment 1 Venca 2010-11-06 21:31:21 UTC
Created attachment 458379 [details]
Lsmod output

Comment 2 Venca 2010-11-06 21:32:21 UTC
Created attachment 458380 [details]
Lspci output

Comment 3 Venca 2010-11-06 21:47:19 UTC
I forgot to mention that resume after hibernate works fine.

Comment 4 Venca 2010-11-09 16:52:05 UTC
I explored further and tested the latest F12 kernel, which was working in my previous Fedora installation.

With this kernel:

Linux nocouz 2.6.32.23-170.fc12.x86_64 #1 SMP Mon Sep 27 17:23:59 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

It also does not resume. It looks like the problem is not directly related to a specific kernel but to some other piece of software involved in suspend/resume.

Comment 5 Venca 2010-11-09 22:00:37 UTC
Just to make sure that I did not make a mistake, I reinstalled the system, and the problem is still there.

In the meantime I ran some more tests:
  - when I turn the CAPS LOCK LED on, then suspend and try to resume, the LED does not return to the on state
  - I tried almost all of the recommendations at http://fedoraproject.org/wiki/Common_kernel_problems#Suspend.2FResume_failure without any good results
  - pm_trace=1 does not show any "hash marks"

Finally I found that somebody else has a very similar problem here https://bugzilla.redhat.com/show_bug.cgi?id=628897

I am still trying and am ready to do more tests. Any guidance is welcome, however, as in the next few days I will run out of ideas and will need to get work done (which means switching back to F12, which works well).

Comment 6 Michal Schmidt 2011-02-11 10:12:08 UTC
[ Continuing the discussion we had on http://www.abclinuxu.cz/poradna/linux/show/327304 about this bug (in Czech)...]

So to reiterate some of the findings:
 - the bug is not specific to 64bit, it happens on 32bit too.
 - it is reproducible with the Rawhide kernel 2.6.38-0.rc4.git0.1.fc15.x86_64.
 - acpi_osi="!Windows 2009" acpi_osi="!Windows 2006 SP2" acpi_osi="!Windows 2006 SP1" acpi_osi="!Windows 2006.1" did not help.
 - pm_test "core" passed.
 - pm_trace did not modify the RTC at all => the hang at resume must happen very early.
 - the beep test using "echo 4 > /proc/sys/kernel/acpi_video_flags" did not produce any sound after resume, but this is inconclusive, because the PC speaker on the laptop is routed through the sound card which is almost certainly powered down at that point.
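
The pm_test and pm_trace checks in the list above boil down to a short sequence of sysfs writes. The following is a minimal sketch, not the exact commands used in this report: it assumes the standard /sys/power/pm_test, /sys/power/pm_trace and /sys/power/state files, and the helper names (plan_suspend_test, apply_plan) are hypothetical. Run as a script it only prints the planned writes; actually applying them needs root and will suspend the machine.

```python
from pathlib import Path

# Standard kernel power-management control files (assumed to be present).
PM_TEST = "/sys/power/pm_test"
PM_TRACE = "/sys/power/pm_trace"
PM_STATE = "/sys/power/state"

# Values accepted by pm_test; "none" disables test mode (a real suspend).
PM_TEST_LEVELS = ("none", "freezer", "devices", "platform", "processors", "core")


def plan_suspend_test(level="core", trace=False):
    """Return the ordered (path, value) writes for one suspend attempt."""
    if level not in PM_TEST_LEVELS:
        raise ValueError(f"unknown pm_test level: {level!r}")
    plan = [(PM_TEST, level)]
    if trace:
        # pm_trace stores progress stamps in the RTC, so the system
        # clock may be wrong after a traced resume attempt.
        plan.append((PM_TRACE, "1"))
    plan.append((PM_STATE, "mem"))  # request suspend to RAM
    return plan


def apply_plan(plan):
    """Perform the writes. Needs root; the last write suspends the machine."""
    for path, value in plan:
        Path(path).write_text(value)


if __name__ == "__main__":
    # Dry run: show the equivalent shell commands without touching sysfs.
    for path, value in plan_suspend_test("core", trace=True):
        print(f"echo {value} > {path}")
```

With the level set to "core" the kernel goes through the suspend sequence and comes straight back without entering the sleep state, which is why a passed "core" test says little about a hang that happens after the real sleep.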

Could you download this kernel I built and do the pm_trace test with it?:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2829714
(I have added extra TRACE_RESUME() calls to early points in resume code. Hopefully they will be hit at resume time.)

Comment 7 Michal Schmidt 2011-02-11 10:16:00 UTC
Created attachment 478205 [details]
early TRACE_RESUME points

For the record, this is the patch I added to the kernel build referenced above.

Comment 8 Venca 2011-02-11 11:09:25 UTC
Created attachment 478215 [details]
Dmesg output grabbed after passed "core" pm_test (also with pm_trace enabled).

I grabbed this log just before I did the real suspend. I expected to find some trace messages in it, but found none. At least this is proof that this kernel version is able to pass the "core" test.

Comment 9 Venca 2011-02-11 11:15:38 UTC
Created attachment 478217 [details]
Dmesg output grabbed after failed resume with pm_trace=1

(In reply to comment #6)
This is the requested output after failed resume with pm_trace=1

Made on kernel:
"Linux nocouz 2.6.35.11-84.resume.fc14.x86_64 #1 SMP Thu Feb 10 13:45:56 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux"

Briefly by comparing previous dmesg and this one I did not find any trace messages.

Comment 10 Michal Schmidt 2011-02-11 15:03:36 UTC
> Briefly by comparing previous dmesg and this one I did not find any trace
> messages.

There were not supposed to be any additional trace messages. The point of the TRACE_RESUME() calls is to store the stamps in the RTC. So I expected there'd be something interesting stored in the Magic value. You have:

  [    0.761469]   Magic number: 15:749:931
  [    0.761669] rtc_cmos 00:02: setting system clock to 2011-02-11 10:55:14 UTC (1297421714)

Since the RTC setting obviously survived undamaged (the date is correct), not even the additional TRACE_RESUME() points had been reached during the resume attempt. That's sad, because we're getting very close to the point where the kernel is supposed to regain control after resume.
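
The check described above can be sketched as a small parser over the boot-time dmesg text. This is a sketch under assumptions: the "Magic number" and rtc_cmos line formats are taken from the excerpt in this comment, the "hash matches" line format is what pm_trace is expected to print when it recognizes a stored stamp, and the function name summarize_pm_trace is hypothetical.

```python
import re
from datetime import datetime, timezone

# Line formats assumed from the dmesg excerpt quoted in this comment.
MAGIC_RE = re.compile(r"Magic number:\s*(\d+):(\d+):(\d+)")
HASH_RE = re.compile(r"hash matches\s+(.+)")
# The epoch value may be wrapped onto the next line, so allow newlines
# between the message text and the parenthesized number.
RTC_RE = re.compile(r"setting system clock to [^()]*\((\d+)\)")


def summarize_pm_trace(dmesg_text):
    """Extract the pm_trace magic number, hash matches and RTC time."""
    magic = MAGIC_RE.search(dmesg_text)
    rtc = RTC_RE.search(dmesg_text)
    return {
        "magic": tuple(int(g) for g in magic.groups()) if magic else None,
        "hash_matches": [m.strip() for m in HASH_RE.findall(dmesg_text)],
        "rtc_time": (
            datetime.fromtimestamp(int(rtc.group(1)), tz=timezone.utc)
            if rtc
            else None
        ),
    }


SAMPLE = """\
[    0.761469]   Magic number: 15:749:931
[    0.761669] rtc_cmos 00:02: setting system clock to 2011-02-11 10:55:14 UTC
  (1297421714)
"""

if __name__ == "__main__":
    print(summarize_pm_trace(SAMPLE))
```

For the excerpt in this comment the sketch finds the magic number 15:749:931, no "hash matches" lines, and an RTC time equal to the date printed on the rtc_cmos line, which is consistent with no trace point having been reached.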

Perhaps I can add one more trace point to the real-mode ACPI resume code. That will completely rule out the possibility that the kernel is doing something bad on resume and I guess it will mean we're confusing the BIOS already during suspending.
But I cannot simply add TRACE_RESUME() there...

Comment 11 Venca 2011-02-11 16:59:35 UTC
Ok, I think I get the point. So would it be possible to do "some" enhanced introspection/checking during the suspend phase?

This is most likely not connected, but I have another observation that I mentioned in the Czech discussion and did not mention here. It may not be related, but if it points you in the right direction it is worth noting. Since F14 I also have a problem with system restart. The symptoms are as follows: the system goes down without problems, as I would expect, but at the moment when it should do the real restart and run the BIOS POST, it hangs. The laptop display goes black and the IDE/SATA devices do something that looks like hardware initialization (the HD LED indicator flashes several times and the CDROM drive flashes once, so I guess the IDE/SATA subsystem does something and then freezes).
The behaviour described in the previous paragraph also happens when I try to resume the suspended system. Please note that I already tried the kernel reboot=b/w option to solve this problem and it did not help.

Comment 12 Fedora End Of Life 2012-08-16 18:43:12 UTC
This message is a notice that Fedora 14 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 14. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained.  At this time, all open bugs with a Fedora 'version'
of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this 
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen 
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we were unable to fix it before Fedora 14 reached end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" (top right of this page) and open it against that 
version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping