Bug 155980

Summary: Laptop suspend no longer works (Toshiba Satellite)
Product: [Fedora] Fedora
Reporter: Eli <elicarter>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 3
CC: intel-linux-acpi, pfrields
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-08-09 20:36:34 UTC

Description Eli 2005-04-26 13:03:58 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050416 Fedora/1.0.3-1.3.1 Firefox/1.0.3

Description of problem:
kernel-2.6.10-1.770_FC3 worked great; the notebook would reliably suspend, and even resume without problems.  Life was good. :)
kernel-2.6.11-1.14_FC3 won't suspend.


Version-Release number of selected component (if applicable):
kernel-2.6.11-1.14_FC3

How reproducible:
Always

Steps to Reproduce:
1. Close lid
2. Wait for suspend light indication

Actual Results:  Notebook did not suspend.

Expected Results:  Notebook should have gone to sleep.

Additional info:

I'll have to reboot into the new kernel and get the /var/log/acpid messages.  Anything in particular I should look for?

Comment 1 Eli 2005-04-27 01:48:53 UTC
I did some more testing of the system's behavior tonight.  The first time I
closed the lid, the system suspended, and when I opened the lid, it resumed.
However, the second time I closed the lid, it did not suspend.

Comment 2 Eli 2005-04-27 01:53:03 UTC
/var/log/acpid is interesting...
# grep -v BAT1 /var/log/acpid
[Tue Apr 26 04:02:09 2005] starting up
[Tue Apr 26 04:02:09 2005] 0 rules loaded
[Tue Apr 26 20:32:47 2005] exiting
[Tue Apr 26 20:34:51 2005] starting up
[Tue Apr 26 20:34:51 2005] 0 rules loaded
[Tue Apr 26 20:46:05 2005] received event "button/lid LID 00000080 00000001"
[Tue Apr 26 20:46:05 2005] completed event "button/lid LID 00000080 00000001"
[Tue Apr 26 20:46:22 2005] received event "button/power PWRF 00000080 00000001"
[Tue Apr 26 20:46:22 2005] completed event "button/power PWRF 00000080 00000001"
[Tue Apr 26 20:47:02 2005] received event "button/lid LID 00000080 00000002"
[Tue Apr 26 20:47:02 2005] completed event "button/lid LID 00000080 00000002"
[Tue Apr 26 20:47:25 2005] received event "button/lid LID 00000080 00000003"
[Tue Apr 26 20:47:25 2005] completed event "button/lid LID 00000080 00000003"

I booted the notebook and logged in.  I closed the lid and it suspended; I
opened the lid and it resumed.  I closed the lid again and nothing happened,
then I opened the lid.
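
A quick way to compare a log like the one above between kernels is to tally the
received events by type. This is only a sketch: count_acpi_events is a
hypothetical helper name, and it assumes the acpid log line format shown in
this comment.

```shell
# Tally acpid "received event" lines by event type (e.g. button/lid).
# Reads an acpid log on stdin; prints "<event type> <count>" per line.
# Hypothetical helper; assumes the log format quoted in this report.
count_acpi_events() {
    sed -n 's/.*received event "\([^ ]*\) .*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Possible usage: grep -v BAT1 /var/log/acpid | count_acpi_events. For the
2.6.11 log above this would report three button/lid events but only one
button/power event.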


Comment 3 shaohua li 2005-04-27 01:57:31 UTC
Is any dmesg output printed after you close the lid again?

Comment 4 Eli 2005-04-27 02:19:29 UTC
Rebooted to 2.6.10-1.770_FC3.
# tail -f /var/log/acpid | grep -v BAT1 
(closing lid)
[Tue Apr 26 21:16:11 2005] received event "button/lid LID 00000080 00000001"
[Tue Apr 26 21:16:11 2005] completed event "button/lid LID 00000080 00000001"
[Tue Apr 26 21:16:34 2005] received event "button/power PWRF 00000080 00000001"
[Tue Apr 26 21:16:34 2005] completed event "button/power PWRF 00000080 00000001"
(resumed)
(closing lid)
[Tue Apr 26 21:17:17 2005] received event "button/lid LID 00000080 00000002"
[Tue Apr 26 21:17:17 2005] completed event "button/lid LID 00000080 00000002"
[Tue Apr 26 21:17:38 2005] received event "button/power PWRF 00000080 00000002"
[Tue Apr 26 21:17:38 2005] completed event "button/power PWRF 00000080 00000002"
(resumed)

It suspended after I closed the lid both times.

BTW, I'm willing to test kernel patches...

Comment 5 Eli 2005-04-27 02:37:01 UTC
(In reply to comment #3)
> Is any dmesg output printed after you close the lid again?

I rebooted to 2.6.11-1.14_FC3.  This time, when I closed the lid the first time,
it did not suspend.  Here is before and after dmesg:

# diff -u dmesg.2.6.11-1.14_FC3.logged-in-before-suspend dmesg.2.6.11-1.14_FC3.after-suspend-try-1-failed
--- dmesg.2.6.11-1.14_FC3.logged-in-before-suspend      2005-04-26 21:32:37.000000000 -0500
+++ dmesg.2.6.11-1.14_FC3.after-suspend-try-1-failed    2005-04-26 21:33:56.000000000 -0500
@@ -328,3 +328,7 @@
 [drm] Initialized i915 1.1.0 20040405 on minor 0: 
 [drm] Initialized i915 1.1.0 20040405 on minor 1: 
 mtrr: base(0xd8020000) is not aligned on a size(0x300000) boundary
+Stopping tasks: ====================================================================
+ stopping tasks failed (1 tasks remaining)
+Restarting tasks...<6> Strange, mDNSResponder not stopped
+ done


Comment 6 shaohua li 2005-04-27 02:43:43 UTC
+ stopping tasks failed (1 tasks remaining)
+Restarting tasks...<6> Strange, mDNSResponder not stopped
+ done
This looks like a known bug: some processes cannot be put into the
refrigerator. Killing the process works around it. The reason is that some
processes depend on other processes, and the Linux refrigerator cannot handle
that relationship. There is no solution yet.
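
To see which process to stop before suspending, the task names can be pulled
out of the kernel log. A minimal sketch, assuming the "Strange, <name> not
stopped" message format quoted above; unfrozen_tasks is a hypothetical helper
name, not an existing tool.

```shell
# Print the names of tasks the freezer reported as "not stopped".
# Reads kernel log text (for example the output of dmesg) on stdin.
# Hypothetical helper; assumes the message format quoted in this report
# and one such message per log line.
unfrozen_tasks() {
    sed -n 's/.*Strange, \(.*\) not stopped.*/\1/p'
}
```

Possible usage: dmesg | unfrozen_tasks. For the failed suspend above it would
print mDNSResponder.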

Comment 7 Eli 2005-04-27 02:48:03 UTC
(In reply to comment #6)
> + stopping tasks failed (1 tasks remaining)
> +Restarting tasks...<6> Strange, mDNSResponder not stopped
> + done
> This looks like a known bug: some processes cannot be put into the
> refrigerator. Killing the process works around it. The reason is that some
> processes depend on other processes, and the Linux refrigerator cannot handle
> that relationship. There is no solution yet.

Um... 2.6.10-1.770_FC3 suspends just fine, every time.  Why is 2.6.11-1.14_FC3
different?  This is the same machine, just booting different kernels.


Comment 8 shaohua li 2005-04-27 02:54:01 UTC
Sometimes you are lucky and sometimes not.
A similar report is here: http://bugme.osdl.org/show_bug.cgi?id=3964

Comment 9 Eli 2005-04-27 03:00:20 UTC
dmesg output for the 2.6.10-1.770_FC3 successful suspend:
# diff -u dmesg.2.6.10-1.770_FC3.logged-in-before-suspend dmesg.2.6.10-1.770_FC3.after-suspend-1-worked
--- dmesg.2.6.10-1.770_FC3.logged-in-before-suspend	2005-04-26 21:53:43.609805142 -0500
+++ dmesg.2.6.10-1.770_FC3.after-suspend-1-worked	2005-04-26 21:54:53.116983362 -0500
@@ -326,3 +326,82 @@
 [drm] Initialized i915 1.1.0 20040405 on minor 0: 
 [drm] Initialized i915 1.1.0 20040405 on minor 1: 
 mtrr: base(0xd8020000) is not aligned on a size(0x300000) boundary
+Stopping tasks: ========================================================================|
+Back to C!
+Debug: sleeping function called from invalid context at mm/slab.c:2061
+in_atomic():0, irqs_disabled():1
+ [<c01188af>] __might_sleep+0x7b/0x85
+ [<c0147f09>] __kmalloc+0x40/0x76
+ [<c01efcb5>] acpi_os_allocate+0xa/0xb
+ [<c020367f>] acpi_ut_callocate+0x30/0x79
+ [<c02035be>] acpi_ut_initialize_buffer+0x4a/0x89
+ [<c0200434>] acpi_rs_create_byte_stream+0x23/0x3b
+ [<c02018e6>] acpi_rs_set_srs_method_data+0x1b/0x9d
+ [<c0116e1d>] recalc_task_prio+0x128/0x133
+ [<c0116e1d>] recalc_task_prio+0x128/0x133
+ [<c0208d84>] acpi_pci_link_set+0xfe/0x176
+ [<c020910a>] irqrouter_resume+0x1c/0x24
+ [<c023ede7>] sysdev_resume+0x3e/0xa5
+ [<c0241ec2>] device_power_up+0x5/0xa
+ [<c013928a>] suspend_enter+0x25/0x2d
+ [<c01392f0>] enter_state+0x37/0x53
+ [<c0206473>] acpi_suspend+0x28/0x35
+ [<c0206542>] acpi_system_write_sleep+0x5a/0x6b
+ [<c0160e06>] vfs_write+0xb6/0xe2
+ [<c0160ed0>] sys_write+0x3c/0x62
+ [<c0103443>] syscall_call+0x7/0xb
+ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
+PCI: Setting latency timer of device 0000:00:1d.0 to 64
+PCI: Setting latency timer of device 0000:00:1d.1 to 64
+PCI: cache line size of 128 is not supported by device 0000:00:1d.7
+ehci_hcd 0000:00:1d.7: USB 2.0 restarted, EHCI 1.00, driver 26 Oct 2004
+ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 11 (level, low) -> IRQ 11
+ACPI: PCI interrupt 0000:00:1f.5[B] -> GSI 11 (level, low) -> IRQ 11
+PCI: Setting latency timer of device 0000:00:1f.5 to 64
+ACPI: PCI interrupt 0000:00:1f.6[B] -> GSI 11 (level, low) -> IRQ 11
+PCI: Setting latency timer of device 0000:00:1f.6 to 64
+e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
+Restarting tasks... done
+drivers/usb/input/hid-core.c: input irq status -84 received
+drivers/usb/input/hid-core.c: input irq status -84 received
+drivers/usb/input/hid-core.c: input irq status -84 received
+sda : READ CAPACITY failed.
+sda : status=0, message=00, host=7, driver=00 
+sda : sense not available. 
+sda: Write Protect is off
+sda: Mode Sense: 00 00 00 00
+sda: assuming drive cache: write through
+drivers/usb/input/hid-core.c: input irq status -84 received
+drivers/usb/input/hid-core.c: input irq status -84 received
+sda : READ CAPACITY failed.
+sda : status=0, message=00, host=7, driver=00 
+sda : sense not available. 
+drivers/usb/input/hid-core.c: input irq status -84 received
+sda: Write Protect is off
+sda: Mode Sense: 00 00 00 00
+sda: assuming drive cache: write through
+ sda:<3>Buffer I/O error on device sda, logical block 0
+Buffer I/O error on device sda, logical block 0
+Buffer I/O error on device sda, logical block 0
+ unable to read partition table
+usb 2-2: USB disconnect, address 2
+usb 2-2.3: USB disconnect, address 3
+drivers/usb/input/hid-core.c: input irq status -84 received
+drivers/usb/input/hid-core.c: can't resubmit intr, 0000:00:1d.0-2.4/input0, status -19
+usb 2-2.4: USB disconnect, address 4
+usb 2-2: new full speed USB device using uhci_hcd and address 5
+hub 2-2:1.0: USB hub found
+hub 2-2:1.0: 4 ports detected
+usb 2-2.3: new full speed USB device using uhci_hcd and address 6
+scsi1 : SCSI emulation for USB Mass Storage devices
+usb-storage: device found at 6
+usb-storage: waiting for device to settle before scanning
+usb 2-2.4: new low speed USB device using uhci_hcd and address 7
+input: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-0000:00:1d.0-2.4
+  Vendor: SanDisk   Model: ImageMate CF-SD1  Rev: 0100
+  Type:   Direct-Access                      ANSI SCSI revision: 00
+Attached scsi removable disk sda at scsi1, channel 0, id 0, lun 0
+  Vendor: SanDisk   Model: ImageMate CF-SD3  Rev: 0100
+  Type:   Direct-Access                      ANSI SCSI revision: 00
+Attached scsi removable disk sdb at scsi1, channel 0, id 0, lun 1
+usb-storage: device scan complete

Looks like there is an error in this as well, even though it works reliably.


Comment 10 Eli 2005-04-27 03:06:43 UTC
(In reply to comment #8)
> Sometimes you are lucky and sometimes not.
> A similar report is here: http://bugme.osdl.org/show_bug.cgi?id=3964

If that's the case, I can't upgrade my kernel past 2.6.10-1.770_FC3...
suspending a laptop is pretty crucial functionality.

Comment 11 shaohua li 2005-04-27 03:14:04 UTC
I have a rough idea of how to solve the issue and will let you know when I
have a debug patch.

Comment 12 Eli 2005-04-27 03:21:16 UTC
(In reply to comment #11)
> I have a rough idea of how to solve the issue and will let you know when I
> have a debug patch.

I'll be looking forward to it.  Thanks.

Comment 13 shaohua li 2005-04-27 06:31:19 UTC
What's the status of the mDNSResponder process after a failed suspend? ps x
will tell you. Is it in the uninterruptible sleep state?

Comment 14 Eli 2005-04-30 13:40:24 UTC
(In reply to comment #13)
> What's the status of the mDNSResponder process after a failed suspend? ps x
> will tell you. Is it in the uninterruptible sleep state?

Doesn't look like it to me...

[Sat Apr 30 08:37:03 2005] received event "button/lid LID 00000080 00000003"
[Sat Apr 30 08:37:03 2005] completed event "button/lid LID 00000080 00000003"
[Sat Apr 30 08:37:15 2005] received event "battery BAT1 00000080 00000001"
[Sat Apr 30 08:37:15 2005] completed event "battery BAT1 00000080 00000001"
[Sat Apr 30 08:37:16 2005] received event "button/lid LID 00000080 00000004"
[Sat Apr 30 08:37:16 2005] completed event "button/lid LID 00000080 00000004"

[root@pegasus ~]# ps auxww | grep DNS
nobody    3888  4.6  0.1 13376 1028 ?        Ssl  08:34   0:09 mDNSResponder
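
For reference, the STAT column above reads Ssl: S is interruptible sleep, s
marks a session leader, and l marks a multi-threaded process. Uninterruptible
sleep would show as D. The check can be sketched as below; is_uninterruptible
is a hypothetical helper name that only inspects a STAT string passed to it.

```shell
# Succeed (exit status 0) if a ps STAT value denotes uninterruptible
# sleep, i.e. its first character is "D". Hypothetical helper.
is_uninterruptible() {
    case "$1" in
        D*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

Possible usage: is_uninterruptible "Ssl" || echo "not in uninterruptible sleep"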


Comment 15 petrosyan 2005-05-04 02:44:03 UTC
This is a duplicate of bug #142301.

Comment 16 Eli 2005-05-27 02:54:54 UTC
From /var/log/messages:
May 26 21:46:29 pegasus kernel: Stopping tasks: ============================================================================
May 26 21:46:29 pegasus kernel:  stopping tasks failed (1 tasks remaining)
May 26 21:46:29 pegasus kernel: Restarting tasks...<6> Strange, mDNSResponder not stopped
May 26 21:46:29 pegasus kernel:  done

# ps auxww | grep [m]DNS
nobody    3908  0.8  0.1 13376 1028 ?        Ssl  21:44   0:04 mDNSResponder
# uname -r
2.6.11-1.27_FC3

And the notebook does not suspend.


Comment 17 shaohua li 2005-05-27 03:02:51 UTC
>Restarting tasks...<6> Strange, mDNSResponder
Nigel has a patch in his suspend2 that should fix this issue. I think Pavel
will merge it soon.

Comment 18 petrosyan 2005-06-16 17:33:00 UTC
This bug has been fixed in Fedora Core 4.

Comment 19 Dave Jones 2005-07-15 19:08:13 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 20 Eli 2005-08-06 16:43:15 UTC
This specific suspend bug is fixed with kernel-2.6.12-1.1372_FC3.
Now, however, IRQ10 is disabled on resume and the USB mouse stops working.
I'll file a separate bug report for that.