Red Hat Bugzilla – Bug 190729
sleeping function called from invalid context at kernel/workqueue.c
Last modified: 2007-11-30 17:07:24 EST
Description of problem:
While under heavy load (significant numbers of asynchronous direct I/Os to
multiple storage devices), I've seen this oops a handful of times (5?) over the
past year (on various versions of RHEL4).
Version-Release number of selected component (if applicable): 2.6.9-34.EL
How reproducible: Not very - as noted above, only seen it a handful of times.
Steps to Reproduce:
Additional info: Oops information:
Debug: sleeping function called from invalid context at kernel/workqueue.c:264
in_atomic():1[expected: 0], irqs_disabled():0
[<a000000200078f10>] scsi_end_request+0x50/0x2e0 [scsi_mod]
[<a000000200079610>] scsi_io_completion+0x2b0/0xa00 [scsi_mod]
[<a000000200026710>] sd_rw_intr+0x110/0x700 [sd_mod]
[<a00000020006c230>] scsi_finish_command+0x2d0/0x300 [scsi_mod]
[<a00000020006c520>] scsi_softirq+0x2c0/0x300 [scsi_mod]
Do you actually get a system crash, or anything going wrong? or just the messages?
Nope - messages just logged, and the system continues onwards. I _believe_
might_sleep is just a warning mechanism: meaning that one should _not_ be
sleeping in this context, and the fact that we _might_ sleep means things aren't
quite right. [Meaning: somebody above me in the call stack is doing something
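To illustrate the warning-only behaviour described above, here is a minimal userspace sketch in C. It is not kernel code: model_in_atomic(), model_might_sleep(), model_flush_work() and model_softirq_handler() are hypothetical stand-ins, and the counter atomic_depth only loosely models the state behind in_atomic(). The point it tries to show is that the check logs a message and returns, so the caller keeps running.

```c
#include <stdio.h>

/* Hypothetical userspace stand-ins for kernel state; not kernel code. */
static int atomic_depth;     /* loosely models the state behind in_atomic() */
static int warnings_logged;  /* how many times the check has fired */

static int model_in_atomic(void)
{
    return atomic_depth != 0;
}

/* Models a might_sleep()-style check: log a warning and return, never halt. */
static void model_might_sleep(const char *file, int line)
{
    if (model_in_atomic()) {
        warnings_logged++;
        fprintf(stderr,
                "Debug: sleeping function called from invalid context at %s:%d\n"
                "in_atomic():%d[expected: 0]\n",
                file, line, model_in_atomic());
    }
}

/* A function that may block, so it announces that up front. */
static void model_flush_work(void)
{
    model_might_sleep(__FILE__, __LINE__);
    /* ... a real implementation could block here ... */
}

/* Models a softirq handler that ends up calling a sleeping function. */
static void model_softirq_handler(void)
{
    atomic_depth++;      /* entering softirq: sleeping is not allowed */
    model_flush_work();  /* the check fires, execution continues */
    atomic_depth--;
}
```

Calling model_flush_work() directly logs nothing, while calling it through model_softirq_handler() logs one warning and then returns normally, which matches "messages just logged, and the system continues onwards".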
This is indeed a corner case. The last user of the ioctx is the I/O path
(meaning that the calling process either closed the context or went away before
the I/O completed).
I'll give this some thought.
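The corner case can be modelled in userspace as a reference-counted object whose teardown may sleep. This is only a sketch under assumptions: struct model_ioctx, its fields, and the model_* functions are hypothetical names, not the kernel's aio structures. It shows that the warning depends purely on who drops the last reference: the owner in process context, or the I/O completion path in softirq context.

```c
#include <stdio.h>

/* Hypothetical model of an AIO context; field names are illustrative. */
struct model_ioctx {
    int users;            /* reference count */
    int freed;            /* set once teardown has run */
    int freed_in_atomic;  /* set if teardown ran where sleeping is forbidden */
};

static int model_atomic;  /* nonzero while "in softirq" */

/* Teardown that may sleep (for example, waiting on a workqueue). */
static void model_free_ioctx(struct model_ioctx *ctx)
{
    if (model_atomic) {
        ctx->freed_in_atomic = 1;
        fprintf(stderr, "warning: sleeping teardown in atomic context\n");
    }
    ctx->freed = 1;
}

/* Drop one reference; whoever drops the last one runs the teardown. */
static void model_put_ioctx(struct model_ioctx *ctx)
{
    if (--ctx->users == 0)
        model_free_ioctx(ctx);
}

/* I/O completion, as if called from the SCSI softirq path. */
static void model_io_complete(struct model_ioctx *ctx)
{
    model_atomic = 1;
    model_put_ioctx(ctx);  /* drops the reference the request held */
    model_atomic = 0;
}
```

If the owner calls model_put_ioctx() after model_io_complete(), teardown runs in process context and nothing is logged. If the owner drops its reference first, model_io_complete() becomes the last user and the teardown runs in the atomic section.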
It looks like someone ran into this on a 126.96.36.199 kernel. See the thread at:
Created attachment 144812 [details]
aio_complete should not drop the last reference to an ioctx
This is the fix Kenneth Chen posted for this problem. Please try it out if you
get the chance.
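For readers who want a feel for the rule in the attachment title, here is one generic way to keep a completion path from ever running the teardown. It is an assumption-laden userspace sketch, not a reconstruction of attachment 144812: struct fx_ioctx and the fx_* functions are hypothetical. The completion path only does accounting, and the owner waits for outstanding requests to drain before freeing.

```c
/* Hypothetical model; names and fields are illustrative only. */
struct fx_ioctx {
    int in_flight;        /* requests submitted but not yet completed */
    int freed;            /* set once teardown has run */
    int freed_in_atomic;  /* set if teardown ran where sleeping is forbidden */
};

static int fx_atomic;     /* nonzero while "in softirq" */

/* Teardown that may sleep; records whether it ran in atomic context. */
static void fx_free(struct fx_ioctx *ctx)
{
    if (fx_atomic)
        ctx->freed_in_atomic = 1;
    ctx->freed = 1;
}

/* Completion path: accounting only, never teardown. */
static void fx_io_complete(struct fx_ioctx *ctx)
{
    fx_atomic = 1;
    ctx->in_flight--;
    fx_atomic = 0;
}

/*
 * Owner teardown, process context. A kernel would sleep on a wait queue
 * until in_flight reaches zero; this single-threaded model delivers the
 * pending completions inline instead.
 */
static void fx_destroy(struct fx_ioctx *ctx)
{
    while (ctx->in_flight > 0)
        fx_io_complete(ctx);
    fx_free(ctx);  /* always runs outside the atomic section */
}
```

With this shape the free can only happen from fx_destroy(), so the sleeping teardown is never reached from the softirq path, at the cost of making the owner wait for its outstanding I/O.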
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla
committed in stream U6 build 55.10. A test kernel with this patch is available
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.