Bug 157642
Summary: | scsi: sleeping function called from invalid context at mm/slab.c:2110
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 4
Hardware: | i386
OS: | Linux
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
Reporter: | John Ellson <john.ellson>
Assignee: | Dave Jones <davej>
QA Contact: | Brian Brock <bbrock>
CC: | error27, goeran, jorton, pfrields, wtogami
Doc Type: | Bug Fix
Last Closed: | 2005-11-23 15:43:11 UTC

Description of problem:
dmesg shows some debug output that looks bad, but the system runs OK. It seems to be related to the SCSI drivers. The messages begin with:

Debug: sleeping function called from invalid context at mm/slab.c:2110
in_atomic():0, irqs_disabled():1

Full dmesg output attached.

Version-Release number of selected component (if applicable):
kernel-2.6.11-1.1303_FC4

How reproducible:
100%

Steps to Reproduce:
1. reboot

Observed the same error message when a network card in an FC4 server went offline:

Jul 17 04:03:22 amanda2 kernel: NETDEV WATCHDOG: eth2: transmit timed out
Jul 17 04:03:22 amanda2 kernel: Debug: sleeping function called from invalid context at mm/slab.c:2126
Jul 17 04:03:22 amanda2 kernel: in_atomic():1, irqs_disabled():0
Jul 17 04:03:22 amanda2 kernel: [<c014cc1c>] __kmalloc+0x89/0x8b
Jul 17 04:03:22 amanda2 kernel: [<c019445d>] proc_create+0x84/0xe6
Jul 17 04:03:22 amanda2 kernel: [<c019456f>] proc_mkdir_mode+0x22/0x57
Jul 17 04:03:22 amanda2 kernel: [<c014456e>] register_handler_proc+0x7b/0x89
Jul 17 04:03:22 amanda2 kernel: [<c0143dbf>] setup_irq+0x8e/0xe2
Jul 17 04:03:22 amanda2 kernel: [<f89944cf>] e100_intr+0x0/0x14b [e100]
Jul 17 04:03:22 amanda2 kernel: [<c0143ffa>] request_irq+0x85/0x9b
Jul 17 04:03:22 amanda2 kernel: [<f89951ac>] e100_up+0xe7/0x1bd [e100]
Jul 17 04:03:22 amanda2 kernel: [<c02b8bc2>] dev_watchdog+0x0/0xa9
Jul 17 04:03:22 amanda2 kernel: [<c02b8c63>] dev_watchdog+0xa1/0xa9
Jul 17 04:03:22 amanda2 kernel: [<c012a28d>] run_timer_softirq+0xe5/0x1b9
Jul 17 04:03:22 amanda2 kernel: [<c0126602>] __do_softirq+0x72/0xdc
Jul 17 04:03:22 amanda2 kernel: [<c0106753>] do_softirq+0x4b/0x4f
Jul 17 04:03:23 amanda2 kernel: =======================
Jul 17 04:03:23 amanda2 kernel: [<c0104ae8>] apic_timer_interrupt+0x1c/0x24
Jul 17 04:03:23 amanda2 kernel: [<c010201a>] default_idle+0x0/0x29
Jul 17 04:03:23 amanda2 kernel: [<c0102040>] default_idle+0x26/0x29
Jul 17 04:03:23 amanda2 kernel: [<c01020c4>] cpu_idle+0x4e/0x63
Jul 17 04:03:23 amanda2 kernel: e100: eth2: e100_watchdog: link up, 100Mbps, full-duplex

The server was up and running, and the other NIC (another e100) was online. I didn't try an ifdown/ifup; I didn't think of it before I rebooted the box. Sorry. :(

John: Can you upgrade to a newer kernel?
Gordon: What kernel are you running?

I found the bug. I don't know what the fix is. In drivers/scsi/scsi_transport_spi.c, in the function spi_dv_device(), spi_dv_device_internal(sdev, buffer) is called with the BKL held. The problematic call tree is:

spi_dv_device_internal()
  spi_dv_device_compare_inquiry()
    spi_execute()
      scsi_execute()
        blk_rq_map_kern(sdev->request_queue, req, buffer, bufflen, __GFP_WAIT)

The __GFP_WAIT is passed to a kmalloc later.

Crap. I'm an idiot. I thought down() was the BKL, but really it's just a semaphore...

Gordon, your bug is probably something different from John's. Can you open a new bug for the issue you see?

Bug 160673 is probably a duplicate of this (John's) bug.

*** Bug 160673 has been marked as a duplicate of this bug. ***

Still seen in 2.6.12-1.1398_FC4smp, i686.

There is a fix upstream.
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=53222b906903fd861dc24ebccfa07ee125941313
The allocation in sym_alloc_lcb_tags() becomes GFP_ATOMIC.

Mass update to all FC4 bugs: An update has been released (2.6.13-1.1526_FC4) which rebases to a new upstream kernel (2.6.13.2). As there were ~3500 changes upstream between this and the previous kernel, it's possible your bug has been fixed already. Please retest with this update, and update this bug if necessary. Thanks.

2.6.14-1.1637_FC4 has been released as an update for FC4. Please retest with this update, as a large amount of code has been changed in this release, which may have fixed your problem. Thank you.

I've not seen this recur after the last few updates, marking as fixed.
Created attachment 114331: dmesg output
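
For readers unfamiliar with the warning discussed in this bug, here is a minimal userspace sketch of the kind of check that produces it: an allocation that is allowed to sleep (as with __GFP_WAIT) is requested while the caller is in atomic context or has interrupts disabled. This is a model only, not kernel code. The names `exec_context`, `model_gfp`, `MODEL_GFP_WAIT`, `MODEL_GFP_ATOMIC` and `model_alloc_allowed` are hypothetical and do not exist in the kernel, and the real check involves more state than the two flags shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the caller's execution state.
 * The two fields mirror the values printed in the warning. */
struct exec_context {
    bool in_atomic;      /* models in_atomic(): e.g. softirq or spinlock held */
    bool irqs_disabled;  /* models irqs_disabled() */
};

/* Simplified allocation flags, loosely modelled on the kernel's GFP flags. */
enum model_gfp {
    MODEL_GFP_ATOMIC = 0,      /* allocator must not sleep */
    MODEL_GFP_WAIT   = 1 << 0  /* allocator may sleep while reclaiming memory */
};

/* Returns true if an allocation with these flags is safe in this context.
 * Otherwise prints a message shaped like the one in this bug and returns false. */
bool model_alloc_allowed(const struct exec_context *ctx, unsigned int flags,
                         const char *where)
{
    assert(ctx != NULL);

    bool may_sleep = (flags & MODEL_GFP_WAIT) != 0;
    bool cannot_sleep = ctx->in_atomic || ctx->irqs_disabled;

    if (may_sleep && cannot_sleep) {
        printf("Debug: sleeping function called from invalid context at %s\n",
               where);
        printf("in_atomic():%d, irqs_disabled():%d\n",
               ctx->in_atomic, ctx->irqs_disabled);
        return false;
    }
    return true;
}
```

In this model, the SCSI report corresponds to a context with `irqs_disabled` set and the e100 report to one with `in_atomic` set. Both are rejected with `MODEL_GFP_WAIT` and accepted with `MODEL_GFP_ATOMIC`, which is the direction the upstream fix takes for the allocation in sym_alloc_lcb_tags().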