154046 – sleeping function from invalid context during suspend. (acpi_pci_link_set calls kmem_cache_alloc)

Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Version:	5
Hardware:	i686
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Dave Jones
QA Contact:	Brian Brock
Duplicates (5):	134905 136993 154610 164708 167394
Depends On:
Blocks:	FCMETA_ACPI

Reported:	2005-04-06 20:22 UTC by Thomas J. Baker
Modified:	2015-01-04 22:18 UTC
CC List:	12 users
Fixed In Version:	2.6.12-1398
Clone Of:
Environment:
Last Closed:	2006-11-12 07:20:01 UTC
Type:	---
Embargoed:
Dependent Products:

Description Thomas J. Baker 2005-04-06 20:22:44 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.6) Gecko/20050323 Firefox/1.0.2 Fedora/1.0.2-1.3.1

Description of problem:
On an S3 suspend/resume cycle, I get the following kernel warning:

codec_semaphore: semaphore is not ready [0x1][0x700300]
codec_write 1: semaphore is not ready for register 0x26
Back to C!
Debug: sleeping function called from invalid context at mm/slab.c:2090
in_atomic():0, irqs_disabled():1
 [<c015beb4>] kmem_cache_alloc+0x63/0x78
 [<c02480b2>] acpi_pci_link_set+0x3f/0x17f
 [<c02484e9>] irqrouter_resume+0x14/0x28
 [<c029155e>] sysdev_resume+0x3d/0xb5
 [<c02957dd>] device_power_up+0x5/0xa
 [<c014a104>] suspend_enter+0x2a/0x32
 [<c014a17e>] enter_state+0x49/0x54
 [<c0245596>] acpi_system_write_sleep+0x5a/0x6c
 [<c024553c>] acpi_system_write_sleep+0x0/0x6c
 [<c017b878>] vfs_write+0x9e/0x110
 [<c017b995>] sys_write+0x41/0x6a
 [<c0103a1d>] syscall_call+0x7/0xb
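
For readers following the trace: the warning appears because kmem_cache_alloc was reached on the resume path while interrupts were disabled (irqs_disabled():1), presumably with allocation flags that allow sleeping. The block below is only a userspace C sketch of that kind of check, not kernel code. The names cpu_context, alloc_mode and sleep_check_fires are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the execution context reported in the warning. */
struct cpu_context {
    bool in_atomic;      /* corresponds to in_atomic() in the log line */
    bool irqs_disabled;  /* corresponds to irqs_disabled() in the log line */
};

/* Hypothetical allocation modes: one may block, one never does. */
enum alloc_mode {
    ALLOC_MAY_SLEEP,
    ALLOC_NO_SLEEP
};

/*
 * Returns true when a "sleeping function called from invalid context"
 * style warning would be expected: the call may sleep, but the context
 * is atomic or has interrupts disabled.
 */
bool sleep_check_fires(const struct cpu_context *ctx, enum alloc_mode mode)
{
    if (mode == ALLOC_NO_SLEEP)
        return false;
    return ctx->in_atomic || ctx->irqs_disabled;
}
```

With the values from the log (in_atomic false, irqs_disabled true) and a may-sleep allocation, sleep_check_fires returns true. In ordinary process context with interrupts enabled it returns false.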

Suspending worked perfectly under FC3, so something is new in FC4. Bug #136993 seems to be similar, but it was filed last year against FC3T3, so I don't know whether this is a duplicate.

Version-Release number of selected component (if applicable):
kernel-2.6.11-1.1226_FC4

How reproducible:
Always

Steps to Reproduce:
1. Have an up-to-date FC4+devel laptop (mine is a Dell Inspiron 600m).
2. Suspend it.
3. Resume it.

Actual Results:  Kernel error in the logs. Everything seems to be working most of the time, although I do get resumes that just hang.

Additional info:

Comment 1 Dave Jones 2005-08-04 19:29:31 UTC

*** Bug 134905 has been marked as a duplicate of this bug. ***

Comment 2 Thomas J. Baker 2005-08-04 20:18:53 UTC

With the 2.6.12-1398 kernel, I no longer have this problem. I'd forgotten about
this bug, and I'm not sure how many releases back it started working without the
error, but 1398 definitely works.

Comment 3 Dan Carpenter 2005-08-13 22:25:41 UTC

This bug is still active.  A couple of other people have reported it with the
2.6.12-1398_FC4 kernel, and a look through the source shows that it hasn't been
fixed.

Comment 4 Dave Jones 2005-10-02 21:42:08 UTC

*** Bug 136993 has been marked as a duplicate of this bug. ***

Comment 5 Dave Jones 2005-10-06 20:38:15 UTC

*** Bug 167394 has been marked as a duplicate of this bug. ***

Comment 6 Dave Jones 2005-10-06 21:50:36 UTC

*** Bug 164708 has been marked as a duplicate of this bug. ***

Comment 7 Michal Jaegermann 2005-10-24 21:23:35 UTC

Well, here is one from 2.6.13-1.1532_FC4 on i686, with a somewhat
different backtrace:

Debug: sleeping function called from invalid context at mm/slab.c:2129
in_atomic():0, irqs_disabled():1
 [<c0176913>] kmem_cache_alloc+0x3c/0x4e
 [<c029951e>] acpi_pci_link_set+0x3f/0x17f
 [<c0299a86>] irqrouter_resume+0x1e/0x3c
 [<c02e7bcd>] sysdev_resume+0x3d/0xb5
 [<c02ec501>] device_power_up+0x5/0xa
 [<c015eaf8>] suspend_enter+0x44/0x46
 [<c015ea5a>] suspend_prepare+0x63/0xbd
 [<c015eb6e>] enter_state+0x49/0x54
 [<c015ec69>] state_store+0x81/0x8f
 [<c015ebe8>] state_store+0x0/0x8f
 [<c020f24a>] subsys_attr_store+0x1e/0x22
 [<c020f454>] flush_write_buffer+0x22/0x28
 [<c020f4a8>] sysfs_write_file+0x4e/0x73
 [<c020f45a>] sysfs_write_file+0x0/0x73
 [<c01a1017>] vfs_write+0xa2/0x15a
 [<c01a117a>] sys_write+0x41/0x6a
 [<c0104465>] syscall_call+0x7/0xb

Comment 8 Pekka Savola 2005-11-04 16:46:16 UTC

The same backtrace as mentioned by Michal in #7 happens on my Thinkpad X30
laptop as well -- I have "acpi_sleep=s3_bios" in the kernel options, and when I
resume from suspend I get that backtrace.  Everything seems to continue to work,
though.

Comment 9 Thomas M Steenholdt 2005-11-07 21:28:26 UTC

Debug: sleeping function called from invalid context at mm/slab.c:2486
in_atomic():0, irqs_disabled():1
 [<c0143cdc>] kmem_cache_alloc+0x40/0x56
 [<c02067ae>] acpi_pci_link_set+0x3f/0x17f
 [<c0206d16>] irqrouter_resume+0x1e/0x3c
 [<c02398fa>] __sysdev_resume+0x11/0x6b
 [<c0239bb8>] sysdev_resume+0x34/0x52
 [<c023db51>] device_power_up+0x5/0xa
 [<c0136458>] suspend_enter+0x44/0x46
 [<c01363ba>] suspend_prepare+0x63/0xbd
 [<c01364ce>] enter_state+0x49/0x54
 [<c01365c9>] state_store+0x81/0x8f
 [<c0136548>] state_store+0x0/0x8f
 [<c0193c6a>] subsys_attr_store+0x1e/0x22
 [<c0193e77>] flush_write_buffer+0x22/0x28
 [<c0193ece>] sysfs_write_file+0x51/0x76
 [<c0193e7d>] sysfs_write_file+0x0/0x76
 [<c0158c76>] vfs_write+0xa2/0x15a
 [<c0158dd9>] sys_write+0x41/0x6a
 [<c0102edd>] syscall_call+0x7/0xb

Just an update showing that this problem is still unresolved in FC4. This is
2.6.14-1636 from davej's test kernel repo.

Comment 10 Thomas M Steenholdt 2005-11-10 19:46:54 UTC

Debug: sleeping function called from invalid context at mm/slab.c:2486
in_atomic():0, irqs_disabled():1
 [<c0143e9c>] kmem_cache_alloc+0x40/0x56
 [<c020696e>] acpi_pci_link_set+0x3f/0x17f
 [<c0206ed6>] irqrouter_resume+0x1e/0x3c
 [<c0239aba>] __sysdev_resume+0x11/0x6b
 [<c0239d78>] sysdev_resume+0x34/0x52
 [<c023dd11>] device_power_up+0x5/0xa
 [<c0136618>] suspend_enter+0x44/0x46
 [<c013657a>] suspend_prepare+0x63/0xbd
 [<c013668e>] enter_state+0x49/0x54
 [<c0136789>] state_store+0x81/0x8f
 [<c0136708>] state_store+0x0/0x8f
 [<c0193e2a>] subsys_attr_store+0x1e/0x22
 [<c0194037>] flush_write_buffer+0x22/0x28
 [<c019408e>] sysfs_write_file+0x51/0x76
 [<c019403d>] sysfs_write_file+0x0/0x76
 [<c0158e36>] vfs_write+0xa2/0x15a
 [<c0158f99>] sys_write+0x41/0x6a
 [<c0102edd>] syscall_call+0x7/0xb


Addresses changed a little, but otherwise everything is the same.

This is 2.6.14-1637_FC4

Comment 11 Dave Jones 2005-11-10 20:30:01 UTC

2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.

Comment 12 Thomas M Steenholdt 2005-11-10 21:00:56 UTC

Please see comment #10

Comment 13 Dave Jones 2005-12-01 06:56:31 UTC

Marking as 'devel', as the same bug still appears there, and will get fixed
there first, and propagate back to FC4 eventually.

Comment 14 Dave Jones 2005-12-01 06:57:22 UTC

*** Bug 154610 has been marked as a duplicate of this bug. ***

Comment 15 Pekka Savola 2006-01-09 12:11:27 UTC

FWIW, still happens on kernel-2.6.15-1.1823_FC4 as well.

Comment 16 Wade Mealing 2006-01-17 03:00:59 UTC

Still happening on: 2.6.15-1.1854_FC5

Comment 17 Michal Jaegermann 2006-01-20 01:42:05 UTC

The same "sleeping function called from invalid context" happens with
2.6.14-1.1653_FC4 and 2.6.14-1.1656_FC4, but there is another nasty regression
here.

With 2.6.14-1.1653_FC4 I have to do 'modprobe -r button' on a laptop suspend,
or the box goes into an infinite loop of suspend/wake-up cycles.  If I do
'modprobe -r button', then I have to do 'modprobe button' on wake-up, or
I will no longer be able to suspend with the power button or by closing the lid.
This can be automated in the scripts used by "action" in the
/etc/acpi/events/*.conf files, so this is bearable if hacky.

2.6.14-1.1656_FC4 does not require the above, but something worse happens.  Every
time I tried, by the second wake-up at the latest, and sometimes on the first one,
my laptop simply rebooted instead of returning from suspend.  Not very
useful.  It is better just to shut down cleanly.  A wake-up after the first
suspend usually works, but that is about it.

I do not need to mention that I do not have any useful log traces from such
reboots.  It appears that the laptop starts waking, then its screen blinks
and you see a boot-from-scratch sequence.

Comment 18 Michal Jaegermann 2006-02-06 05:30:28 UTC

The regression described in comment #17 seems to be gone with 2.6.15-1.1830_FC4. Yay!
But here is another sample of "sleeping function called from invalid context",
this time from 2.6.15-1.1830_FC4 (i686).  It is more detailed in the ACPI part.

Debug: sleeping function called from invalid context at mm/slab.c:2499
in_atomic():0, irqs_disabled():1
 [<c01462f3>] kmem_cache_alloc+0x40/0x4f    
 [<c0202c85>] acpi_os_acquire_object+0xb/0x3c
 [<c02171b1>] acpi_ut_allocate_object_desc_dbg+0x13/0x49    
 [<c021704b>] acpi_ut_create_internal_object_dbg+0xf/0x5e
 [<c02136d4>] acpi_rs_set_srs_method_data+0x3d/0xb9    
 [<c021aa3d>] acpi_pci_link_set+0x102/0x17b
 [<c021aecb>] irqrouter_resume+0x1e/0x3c    
 [<c024d921>] __sysdev_resume+0x11/0x6b
 [<c024dbde>] sysdev_resume+0x34/0x52    
 [<c0251cb7>] device_power_up+0x5/0xa
 [<c0138787>] suspend_enter+0x44/0x46    
 [<c01386e5>] suspend_prepare+0x63/0xc1
 [<c0138813>] enter_state+0x5e/0x7c    
 [<c013894c>] state_store+0x81/0x8f
 [<c01388cb>] state_store+0x0/0x8f    
 [<c0196a0a>] subsys_attr_store+0x1e/0x22
 [<c0196c12>] flush_write_buffer+0x22/0x28    
 [<c0196c64>] sysfs_write_file+0x4c/0x71
 [<c0196c18>] sysfs_write_file+0x0/0x71    
 [<c015b2c9>] vfs_write+0xa2/0x15a
 [<c015b42c>] sys_write+0x41/0x6a    
 [<c0102e75>] syscall_call+0x7/0xb
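
Compared with the earlier traces, the allocation here is reached through acpi_rs_set_srs_method_data and acpi_os_acquire_object rather than directly from acpi_pci_link_set. A pattern often used for code that can run both with and without interrupts enabled is to pick non-sleeping allocation flags when interrupts are off. The sketch below models that choice in plain userspace C. The names model_gfp, pick_alloc_flags and flags_may_sleep are hypothetical, and nothing in this report confirms that this is how the bug was eventually fixed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's GFP_KERNEL and GFP_ATOMIC. */
enum model_gfp {
    MODEL_GFP_KERNEL,  /* allocation may sleep */
    MODEL_GFP_ATOMIC   /* allocation must not sleep */
};

/* Choose flags from the caller's context instead of hard-coding them. */
enum model_gfp pick_alloc_flags(bool irqs_disabled)
{
    return irqs_disabled ? MODEL_GFP_ATOMIC : MODEL_GFP_KERNEL;
}

/* True if an allocation with these flags is allowed to block. */
bool flags_may_sleep(enum model_gfp flags)
{
    return flags == MODEL_GFP_KERNEL;
}
```

In this model, the resume path (interrupts disabled) gets the non-sleeping flags, so a check like the one that printed the warning would have nothing to complain about. The trade-off is that a non-sleeping allocation is more likely to fail under memory pressure, so the caller has to handle a failed allocation.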

Comment 19 Quinn Eliot Minor 2006-02-11 06:11:15 UTC

I have a Dell Latitude D610, and I notice the same thing: when suspending on the
2.6.14-1.1656_FC4 kernel, it says "sleeping function called from invalid
context" and then hangs after suspend.

However, there is an even stranger twist. When I patched this kernel with
Software Suspend 2 (swsusp2) (note: this is for suspend-to-disk), and THEN tried
suspend-to-RAM, it worked just fine (although the same debug message occurred).
This seems incredibly strange to me, and I don't see how this patch could affect
ACPI functionality at all.

(Note: for anyone who wishes to try this, swsusp2-patched Fedora kernel RPMs
are available at http://mhensler.de/swsusp/ ; the 2.6.14-1.1656 patched
kernel worked for me.)

Even stranger: when I got the swsusp2-patched 2.6.15-1.1825 kernel,
suspend-to-RAM no longer works. The debug message is gone, but the system hangs
after suspend. Very, very strange.

(In reply to comment #17)

Comment 20 Pekka Savola 2006-03-24 14:00:24 UTC

This seems to have gotten worse in kernel-2.6.16-1.2066_FC4: when resuming from
suspend, the kernel gets a NULL pointer deref and doesn't recover:

<4<1> Unable to handle NULL pointer dereference at virtual address 00000038
printing eip:

(but eip printing is empty, at least on console.)

So, I'm raising severity to 'high'...

Comment 21 Dave Jones 2006-10-17 00:37:19 UTC

A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks' time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 22 Paul Ionescu 2006-10-18 19:17:23 UTC

Tested with kernel 2.6.18-1.2200.fc5
and here is the trace message:

Stopping tasks:
==============================================================================================================================|
pnp: Device 00:09 disabled.
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Back to C!
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
 [<c0403f10>] dump_trace+0x69/0x1af
 [<c040406e>] show_trace_log_lvl+0x18/0x2c
 [<c04045e9>] show_trace+0xf/0x11
 [<c0404673>] dump_stack+0x15/0x17
 [<c042ef78>] down_read+0x12/0x1f
 [<c0427923>] blocking_notifier_call_chain+0xe/0x29
 [<c05970df>] cpufreq_resume+0x118/0x13f
 [<c053f3d4>] __sysdev_resume+0x20/0x53
 [<c053f515>] sysdev_resume+0x16/0x47
 [<c05435a1>] device_power_up+0x5/0xa
 [<c0436861>] suspend_enter+0x3b/0x44
 [<c0436997>] enter_state+0x12d/0x14f
 [<c0436a3e>] state_store+0x85/0x99
 [<c04969ea>] subsys_attr_store+0x1e/0x22
 [<c0496adc>] sysfs_write_file+0xa6/0xcc
 [<c0461292>] vfs_write+0xa8/0x159
 [<c04617d8>] sys_write+0x41/0x67
 [<c0402d9b>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
Leftover inexact backtrace:

Comment 23 Dave Jones 2006-11-12 07:20:01 UTC

That's a different bug to the one filed here. (That one is tracked in bug 211590)

The original bug tracked here should be fixed now.
