Bug 164708

Summary: "sleeping function called from invalid context" when resuming from S3 sleep
Product: [Fedora] Fedora
Reporter: Thomas M Steenholdt <tmus>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED DUPLICATE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 4
CC: billcrawford1970, error27, pfrields, stefan.zechmeister, sundaram, wtogami
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-10-06 21:50:12 UTC
Bug Blocks: 165150    

Description Thomas M Steenholdt 2005-07-30 18:52:24 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc3 Firefox/1.0.6

Description of problem:
Resuming from S3 sleep, my IBM ThinkPad T30 spits the following message...

---
Back to C!
Debug: sleeping function called from invalid context at mm/slab.c:2126
in_atomic():0, irqs_disabled():1
 [<c011d3a4>] __might_sleep+0x9c/0xaa
 [<c0159762>] kmem_cache_alloc+0x3c/0x4c
 [<c023dc0c>] acpi_pci_link_set+0x4a/0x1a2
 [<c023e085>] irqrouter_resume+0x1c/0x24
 [<c027cb02>] sysdev_resume+0x5c/0xac
 [<c0280adb>] device_power_up+0x5/0xa
 [<c014878b>] suspend_enter+0x2d/0x46
 [<c0148727>] suspend_prepare+0x55/0x8c
 [<c0148814>] enter_state+0x39/0x55
 [<c0148931>] state_store+0x92/0xa5
 [<c014889f>] state_store+0x0/0xa5
 [<c01c655b>] subsys_attr_store+0x1f/0x25
 [<c01c6754>] flush_write_buffer+0x25/0x2c
 [<c01c679b>] sysfs_write_file+0x40/0x60
 [<c0177075>] vfs_write+0xaf/0x10a
 [<c017717b>] sys_write+0x41/0x6a
 [<c010392d>] syscall_call+0x7/0xb
---

I'm using a pretty basic script to make the system suspend: it unloads the USB drivers and some NIC drivers, stops a few problematic daemons, does a time sync, and then runs echo -n "mem" >/sys/power/state
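
For reference, the final step of such a script (writing "mem" to /sys/power/state) is what enters the kernel through sys_write and sysfs_write_file in the backtrace above. A minimal C sketch of that write follows. The function name write_power_state is hypothetical, and the path is a parameter so the sketch can be tried against an ordinary file without suspending the machine.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `state` to the existing file at `path`
 * with one write() call and no trailing newline, roughly what
 * `echo -n "mem" >/sys/power/state` does from a shell.
 * Returns 0 on success, -1 on failure.
 */
int write_power_state(const char *path, const char *state)
{
    assert(path != NULL && state != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    size_t len = strlen(state);
    ssize_t written = write(fd, state, len);
    int rc = (written == (ssize_t)len) ? 0 : -1;
    if (rc != 0)
        perror("write");

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Calling write_power_state("/sys/power/state", "mem") as root would request a suspend to RAM on a kernel that supports it. Pointing it at a scratch file only exercises the write path.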

I guess the backtrace should speak for itself, but in case you need anything, please let me know!



Version-Release number of selected component (if applicable):
kernel-2.6.12-1.1372_FC3

How reproducible:
Always

Steps to Reproduce:
1. Suspend to RAM (at least on an IBM ThinkPad T30; I guess others as well)
2. Resume
3. dmesg
  

Actual Results:  I see an unexpected stacktrace in the dmesg output

Expected Results:  no stacktraces should show up!

Additional info:

Comment 1 Dan Carpenter 2005-07-31 10:44:34 UTC
This is the same as bug 142598, bug 134905 and bug 140254.

The problem is that irqrouter_resume() is called with interrupts off by the power
management code.  See the comment next to sysdev_resume() in drivers/base/sys.c.

irqrouter_resume => acpi_pci_link_resume => acpi_pci_link_set => kmalloc => BOOM!

It could be that the coder just made a typo and used GFP_KERNEL instead of
GFP_ATOMIC, but another possibility is that there are other problems with the
code.  I don't understand the code well enough to fix the problem myself.
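
To illustrate the mechanism only, here is a small userspace model in C of why an allocation that may sleep triggers the warning when interrupts are off, while an atomic allocation does not. This is not kernel code and not the actual fix. Every name in it (model_ctx, model_might_sleep, model_alloc, model_resume, MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC) is a hypothetical stand-in, and the check is simplified compared to the kernel's real one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two allocation modes discussed above. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

/* Hypothetical model of the context the debug message reports. */
struct model_ctx {
    bool in_atomic;
    bool irqs_disabled;
    int warnings;               /* how many warnings were raised */
};

/* Simplified model of the check: sleeping is invalid if atomic or IRQs off. */
void model_might_sleep(struct model_ctx *ctx, const char *where)
{
    assert(ctx != NULL);
    if (ctx->in_atomic || ctx->irqs_disabled) {
        ctx->warnings++;
        fprintf(stderr,
                "Debug: sleeping function called from invalid context at %s\n",
                where);
        fprintf(stderr, "in_atomic():%d, irqs_disabled():%d\n",
                ctx->in_atomic, ctx->irqs_disabled);
    }
}

/* Only the "may sleep" mode runs the check; the atomic mode never sleeps. */
void *model_alloc(struct model_ctx *ctx, size_t size, enum model_gfp flags)
{
    if (flags == MODEL_GFP_KERNEL)
        model_might_sleep(ctx, "model_alloc");
    return malloc(size);
}

/* Model of a resume handler that is called with interrupts disabled. */
int model_resume(struct model_ctx *ctx, enum model_gfp flags)
{
    bool saved = ctx->irqs_disabled;
    ctx->irqs_disabled = true;          /* caller runs us with IRQs off */
    void *buf = model_alloc(ctx, 64, flags);
    ctx->irqs_disabled = saved;

    if (buf == NULL)
        return -1;
    free(buf);
    return 0;
}
```

In this model, model_resume(&ctx, MODEL_GFP_KERNEL) raises one warning and model_resume(&ctx, MODEL_GFP_ATOMIC) raises none. Whether switching flags is the right change in the real driver is a separate question, as noted above.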





Comment 2 Bill Crawford 2005-08-13 22:09:30 UTC
Same problem here with desktop machine.
I noticed it first just after installing a new NIC, and I've had other problems
related to that (card apparently "hanging" after resume) that seem to have been
fixed by kernel updates.
I can add another Debug: ... trace from /var/log/messages and lspci output.


Comment 3 Dave Jones 2005-08-26 06:20:14 UTC
This should be fixed in the errata kernel currently in updates-testing


Comment 4 Thomas M Steenholdt 2005-10-06 05:12:16 UTC
The FC4 kernel is doing the exact same thing!

I suspect the fix would be easy to copy to the FC4 kernel?

This is from kernel-2.6.13-1.1526_FC4:

-----
Back to C!
Debug: sleeping function called from invalid context at mm/slab.c:2129
in_atomic():0, irqs_disabled():1
 [<c0176913>] kmem_cache_alloc+0x3c/0x4e
 [<c029927a>] acpi_pci_link_set+0x3f/0x17f
 [<c02997e2>] irqrouter_resume+0x1e/0x3c
 [<c02e792d>] sysdev_resume+0x3d/0xb5
 [<c02ec261>] device_power_up+0x5/0xa
 [<c015eaf8>] suspend_enter+0x44/0x46
 [<c015ea5a>] suspend_prepare+0x63/0xbd
 [<c015eb6e>] enter_state+0x49/0x54
 [<c015ec69>] state_store+0x81/0x8f
 [<c015ebe8>] state_store+0x0/0x8f
 [<c020f24a>] subsys_attr_store+0x1e/0x22
 [<c020f454>] flush_write_buffer+0x22/0x28
 [<c020f4a8>] sysfs_write_file+0x4e/0x73
 [<c020f45a>] sysfs_write_file+0x0/0x73
 [<c01a1017>] vfs_write+0xa2/0x15a
 [<c01a117a>] sys_write+0x41/0x6a
 [<c0104465>] syscall_call+0x7/0xb
-----


Comment 5 Dave Jones 2005-10-06 21:50:12 UTC

*** This bug has been marked as a duplicate of 154046 ***