Bug 157642

Summary: scsi: sleeping function called from invalid context at mm/slab.c:2110
Product: [Fedora] Fedora
Reporter: John Ellson <john.ellson>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 4
CC: error27, goeran, jorton, pfrields, wtogami
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-11-23 15:43:11 UTC
Attachments:
Description Flags
dmesg output none

Description John Ellson 2005-05-13 12:48:49 UTC
Created attachment 114331
dmesg output

Comment 1 John Ellson 2005-05-13 12:48:49 UTC
Description of problem:
dmesg shows some debug output that looks bad, but system runs OK.
Seems to be related to SCSI drivers.  Messages begin with:
  Debug: sleeping function called from invalid context at mm/slab.c:2110
  in_atomic():0, irqs_disabled():1
Full dmesg output attached.

Version-Release number of selected component (if applicable):
kernel-2.6.11-1.1303_FC4

How reproducible:
100%

Steps to Reproduce:
1. reboot
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 2 Gordon Ewasiuk 2005-07-17 08:19:27 UTC
Observed same error message when a network card in an FC4 server went offline:

Jul 17 04:03:22 amanda2 kernel: NETDEV WATCHDOG: eth2: transmit timed out
Jul 17 04:03:22 amanda2 kernel: Debug: sleeping function called from invalid context at mm/slab.c:2126
Jul 17 04:03:22 amanda2 kernel: in_atomic():1, irqs_disabled():0
Jul 17 04:03:22 amanda2 kernel:  [<c014cc1c>] __kmalloc+0x89/0x8b
Jul 17 04:03:22 amanda2 kernel:  [<c019445d>] proc_create+0x84/0xe6
Jul 17 04:03:22 amanda2 kernel:  [<c019456f>] proc_mkdir_mode+0x22/0x57
Jul 17 04:03:22 amanda2 kernel:  [<c014456e>] register_handler_proc+0x7b/0x89
Jul 17 04:03:22 amanda2 kernel:  [<c0143dbf>] setup_irq+0x8e/0xe2
Jul 17 04:03:22 amanda2 kernel:  [<f89944cf>] e100_intr+0x0/0x14b [e100]
Jul 17 04:03:22 amanda2 kernel:  [<c0143ffa>] request_irq+0x85/0x9b
Jul 17 04:03:22 amanda2 kernel:  [<f89951ac>] e100_up+0xe7/0x1bd [e100]
Jul 17 04:03:22 amanda2 kernel:  [<c02b8bc2>] dev_watchdog+0x0/0xa9
Jul 17 04:03:22 amanda2 kernel:  [<c02b8c63>] dev_watchdog+0xa1/0xa9
Jul 17 04:03:22 amanda2 kernel:  [<c012a28d>] run_timer_softirq+0xe5/0x1b9
Jul 17 04:03:22 amanda2 kernel:  [<c0126602>] __do_softirq+0x72/0xdc
Jul 17 04:03:22 amanda2 kernel:  [<c0106753>] do_softirq+0x4b/0x4f
Jul 17 04:03:23 amanda2 kernel:  =======================
Jul 17 04:03:23 amanda2 kernel:  [<c0104ae8>] apic_timer_interrupt+0x1c/0x24
Jul 17 04:03:23 amanda2 kernel:  [<c010201a>] default_idle+0x0/0x29
Jul 17 04:03:23 amanda2 kernel:  [<c0102040>] default_idle+0x26/0x29
Jul 17 04:03:23 amanda2 kernel:  [<c01020c4>] cpu_idle+0x4e/0x63
Jul 17 04:03:23 amanda2 kernel: e100: eth2: e100_watchdog: link up, 100Mbps, full-duplex

Server was up and running and the other NIC (another e100) was online.  Didn't
try an ifdown/ifup.  Didn't think about it before I rebooted the box.  Sorry.  :(

Comment 3 Dan Carpenter 2005-07-17 08:26:39 UTC
  John:  Can you upgrade to a newer kernel?
Gordon:  What kernel are you running?



Comment 4 Dan Carpenter 2005-07-18 06:09:12 UTC
I found the bug.  I don't know what the fix is.

In drivers/scsi/scsi_transport_spi.c, in the function spi_dv_device(),
spi_dv_device_internal(sdev, buffer) is called with the BKL held.

The problematic call tree is:
spi_dv_device_internal()
spi_dv_device_compare_inquiry()
spi_execute()
scsi_execute()
blk_rq_map_kern(sdev->request_queue, req, buffer, bufflen, __GFP_WAIT)

The __GFP_WAIT is passed to a kmalloc later.
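As a rough illustration of why an allocation flagged as allowed to wait produces this message, here is a small userspace sketch of the check. It is a toy model, not kernel code: `toy_context`, `TOY_GFP_WAIT`, `toy_alloc_would_warn()` and `toy_check_and_report()` are hypothetical names invented for this example, and the real check in the kernel has additional conditions (such as rate limiting) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the "caller allows the allocator to sleep" flag.
 * Hypothetical name; not the kernel's definition of __GFP_WAIT. */
#define TOY_GFP_WAIT 0x1u

/* Toy stand-in for the two pieces of state printed in the report. */
struct toy_context {
    bool in_atomic;      /* e.g. inside a softirq or holding a spinlock */
    bool irqs_disabled;  /* local interrupts are switched off */
};

/* Returns true when an allocation with these flags would be flagged:
 * the caller permits sleeping, but the context cannot sleep. */
static bool toy_alloc_would_warn(const struct toy_context *ctx,
                                 unsigned int flags)
{
    assert(ctx != NULL);
    bool may_sleep = (flags & TOY_GFP_WAIT) != 0;
    bool cannot_sleep = ctx->in_atomic || ctx->irqs_disabled;
    return may_sleep && cannot_sleep;
}

/* Prints a message shaped like the one in the report and returns
 * whether it was printed. */
static bool toy_check_and_report(const struct toy_context *ctx,
                                 unsigned int flags,
                                 const char *file, int line)
{
    if (!toy_alloc_would_warn(ctx, flags))
        return false;
    printf("Debug: sleeping function called from invalid context at %s:%d\n",
           file, line);
    printf("in_atomic():%d, irqs_disabled():%d\n",
           ctx->in_atomic, ctx->irqs_disabled);
    return true;
}
```

Under this model, a context with in_atomic false and irqs_disabled true (the values in John's dmesg) is flagged for a TOY_GFP_WAIT allocation, while the same context with no waiting flag is not.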


Comment 5 Dan Carpenter 2005-07-18 08:47:09 UTC
Crap.  I'm an idiot.  I thought down() was the BKL but really it's just a semaphore...

Gordon, your bug is probably something different from John's.  Can you open a
new bug for the issue you see?

Bug 160673 is probably a duplicate of this (John's) bug.



Comment 6 Joe Orton 2005-07-18 09:23:11 UTC
*** Bug 160673 has been marked as a duplicate of this bug. ***

Comment 7 Joe Orton 2005-07-18 09:24:09 UTC
Still seen in 2.6.12-1.1398_FC4smp, i686

Comment 8 Dan Carpenter 2005-07-30 09:42:18 UTC
There is a fix upstream.
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=53222b906903fd861dc24ebccfa07ee125941313


The allocation in sym_alloc_lcb_tags() becomes GFP_ATOMIC.
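For readers unfamiliar with the trade-off behind that change, the sketch below models it in userspace. It is not the upstream patch and not kernel code: `toy_pool`, `toy_gfp` and `toy_kmalloc()` are hypothetical names, and the "reclaim" step is only a counter. The point it illustrates is that an atomic allocation never waits, so it can fail where a waiting allocation would have succeeded, and the caller has to handle a NULL return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocation modes for this toy model only. */
enum toy_gfp {
    TOY_GFP_ATOMIC,  /* must not sleep; fail if memory is not ready */
    TOY_GFP_KERNEL   /* may sleep while memory is reclaimed */
};

/* Toy memory pool: some bytes are free now, some need a wait to free. */
struct toy_pool {
    size_t free_now;     /* bytes available without waiting */
    size_t reclaimable;  /* bytes obtainable only by sleeping */
    unsigned int sleeps; /* how often the allocator had to wait */
};

/* Allocates from the pool. An atomic request fails immediately when
 * free_now is too small; a waiting request "sleeps" to reclaim first. */
static void *toy_kmalloc(struct toy_pool *pool, size_t size,
                         enum toy_gfp mode)
{
    assert(pool != NULL);
    if (pool->free_now < size) {
        if (mode == TOY_GFP_ATOMIC)
            return NULL;             /* cannot wait, so give up */
        if (pool->free_now + pool->reclaimable < size)
            return NULL;             /* not enough even after waiting */
        size_t need = size - pool->free_now;
        pool->reclaimable -= need;   /* pretend to reclaim memory */
        pool->free_now += need;
        pool->sleeps++;              /* record that we had to wait */
    }
    pool->free_now -= size;
    return malloc(size);
}
```

In this model, a 128-byte atomic request against a pool with only 32 bytes immediately free returns NULL without waiting, while the same request in the waiting mode succeeds after one simulated sleep.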



Comment 9 Dave Jones 2005-09-30 06:14:35 UTC
Mass update to all FC4 bugs:

An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream
kernel (2.6.13.2). As there were ~3500 changes upstream between this and the
previous kernel, it's possible your bug has been fixed already.

Please retest with this update, and update this bug if necessary.

Thanks.


Comment 10 Dave Jones 2005-11-10 19:12:01 UTC
2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.


Comment 11 Joe Orton 2005-11-23 15:43:11 UTC
I've not seen this recur after the last few updates, marking as fixed.