Description of problem:
When booting kernel-rt-debug-3.0.25-rt44.57.el6rt on some boxes with the qla2xxx adapter, the following splat can be observed:

[ 10.349737] BUG: sleeping function called from invalid context at kernel/rtmutex.c:646
[ 10.349744] in_atomic(): 0, irqs_disabled(): 1, pid: 2845, name: work_for_cpu
[ 10.349753] Pid: 2845, comm: work_for_cpu Not tainted 3.0.25-rt44.57.el6rt.x86_64.debug #1
[ 10.349759] Call Trace:
[ 10.349786] [<ffffffff8103dbde>] __might_sleep+0xce/0xf0
[ 10.349801] [<ffffffff814d0734>] rt_spin_lock+0x24/0x50
[ 10.349856] [<ffffffffa02163fb>] qla24xx_intr_handler+0x5b/0x370 [qla2xxx]
[ 10.349871] [<ffffffff814ce644>] ? wait_for_common+0x144/0x1a0
[ 10.349884] [<ffffffff814d3efd>] ? sub_preempt_count+0x9d/0xd0
[ 10.349920] [<ffffffffa0208da4>] qla2x00_poll+0x44/0x50 [qla2xxx]
[ 10.349952] [<ffffffffa0209173>] qla2x00_mailbox_command+0x3c3/0x8c0 [qla2xxx]
[ 10.349986] [<ffffffffa020be55>] qla2x00_mbx_reg_test+0x65/0xf0 [qla2xxx]
[ 10.350000] [<ffffffff81307c8a>] ? __dev_printk+0x3a/0x90
[ 10.350006] [<ffffffff81307fc5>] ? dev_printk+0x45/0x50
[ 10.350006] [<ffffffffa0201494>] qla24xx_chip_diag+0x64/0xc0 [qla2xxx]
[ 10.350006] [<ffffffffa020660d>] qla2x00_initialize_adapter+0x2fd/0x3a0 [qla2xxx]
[ 10.350006] [<ffffffffa01f9e34>] ? kzalloc+0x14/0x20 [qla2xxx]
[ 10.350006] [<ffffffffa0234043>] qla2x00_probe_one+0xd5e/0x1d1b [qla2xxx]
[ 10.350006] [<ffffffff814d0ad3>] ? _raw_spin_lock+0x23/0x30
[ 10.350006] [<ffffffff8126251f>] local_pci_probe+0x5f/0xd0
[ 10.350006] [<ffffffff8106d4b0>] ? cpumask_weight+0x20/0x20
[ 10.350006] [<ffffffff8106d4c8>] do_work_for_cpu+0x18/0x30
[ 10.350006] [<ffffffff81075ee6>] kthread+0xa6/0xb0
[ 10.350006] [<ffffffff810419fc>] ? finish_task_switch+0x6c/0xf0
[ 10.350006] [<ffffffff814d8f34>] kernel_thread_helper+0x4/0x10
[ 10.350006] [<ffffffff81075e40>] ? kthreadd+0x180/0x180
[ 10.350006] [<ffffffff814d8f30>] ? gs_change+0xb/0xb

Version-Release number of selected component (if applicable):
3.0.25-rt44.57.el6rt

How reproducible:
Always on affected hardware.

Steps to Reproduce:
1. Install kernel-rt-debug-3.0.25-rt44.57.el6rt
2. Boot kernel without 'quiet' and 'rhgb' in the kernel command line
3. Observe console output, or dmesg after logging in.

Additional info:
This core issue is also present on all other kernel variants, but kernel-rt-debug is the one that complains about a sleeping function being called from the wrong context.
Verified by booting kernel-rt-debug kernels on a box with a qla2xxx adapter. Confirmed that 3.0.25-rt44.57.el6rt.x86_64.debug still produces the backtrace. After upgrading to 3.0.30-rt50.62.el6rt.x86_64.debug, the issue no longer occurs. -> VERIFIED
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: Function qla2x00_poll does local_irq_save() before calling qla24xx_intr_hand which has a spinlock. Since spinlocks are sleepable on rt, it is not allowed to call them with interrupts disabled.

Consequence: BUG: sleeping function called from invalid context at kernel/rtmutex.c:646 reported multiple times.

Fix: Use local_irq_save_nort(flags) to save flags without disabling interrupts.

Result: Potential deadlock is avoided, and the error message goes away.
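
The cause and fix described in this technical note can be illustrated with a small user-space model. This is only a sketch, not the qla2xxx driver code or the actual patch: every sim_* name is hypothetical, the interrupt state is a plain boolean, and the use of a matching "restore" counterpart to local_irq_save_nort() is an assumption, since the note only mentions the save side. The model assumes that on a real-time kernel the _nort variant records the flags without disabling interrupts, as the note states, and that on a non-real-time kernel it behaves like the plain call.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only: every sim_* name is hypothetical, not kernel API. */

static bool sim_rt = true;        /* true models a real-time (RT) kernel */
static bool sim_irqs_off = false; /* models irqs_disabled() */
static int sim_splats = 0;        /* counts "sleeping function called from
                                     invalid context" reports */

/* Models local_irq_save(): remember the old state, then disable interrupts. */
static void sim_local_irq_save(unsigned long *flags)
{
    *flags = sim_irqs_off;
    sim_irqs_off = true;
}

/* Models local_irq_restore(): put back the remembered state. */
static void sim_local_irq_restore(unsigned long flags)
{
    sim_irqs_off = (flags != 0);
}

/* Models local_irq_save_nort(): on RT only record the state,
 * otherwise behave like the plain save. */
static void sim_local_irq_save_nort(unsigned long *flags)
{
    if (sim_rt)
        *flags = sim_irqs_off;
    else
        sim_local_irq_save(flags);
}

/* Assumed counterpart: a no-op on RT, a plain restore otherwise. */
static void sim_local_irq_restore_nort(unsigned long flags)
{
    if (!sim_rt)
        sim_local_irq_restore(flags);
}

/* Models taking a spinlock. On RT the lock may sleep, so taking it
 * with interrupts disabled is counted as a splat. */
static void sim_spin_lock(void)
{
    if (sim_rt && sim_irqs_off)
        sim_splats++;
}

/* Models an interrupt handler that takes a spinlock. */
static void sim_intr_handler(void)
{
    sim_spin_lock();
}

/* Poll path before the fix; returns the number of splats it caused. */
int sim_poll_with_irq_save(void)
{
    unsigned long flags;
    int before = sim_splats;

    sim_local_irq_save(&flags);
    sim_intr_handler();
    sim_local_irq_restore(flags);
    /* The model starts with interrupts enabled, so they are enabled again. */
    assert(!sim_irqs_off);
    return sim_splats - before;
}

/* Poll path after the fix; returns the number of splats it caused. */
int sim_poll_with_irq_save_nort(void)
{
    unsigned long flags;
    int before = sim_splats;

    sim_local_irq_save_nort(&flags);
    sim_intr_handler();
    sim_local_irq_restore_nort(flags);
    assert(!sim_irqs_off);
    return sim_splats - before;
}
```

In this model, with sim_rt set to true, sim_poll_with_irq_save() reports one splat per call, while sim_poll_with_irq_save_nort() reports none, because the spinlock is taken with interrupts still enabled. With sim_rt set to false, neither path reports anything, because the modeled spinlock does not sleep there.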
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,4 +1,4 @@
-Cause: Function qla2x00_poll does local_irq_save() before calling qla24xx_intr_hand which has a spinlock. Since spinlocks are sleepable on rt, it is not allowed to call them with interrupts disabled.
+Cause: Function qla2x00_poll does local_irq_save() before calling qla24xx_intr_handler which has a spinlock. Since spinlocks are sleepable on rt, it is not allowed to call them with interrupts disabled.

 Consequence: BUG: sleeping function called from invalid context at kernel/rtmutex.c:646 reported multiple times.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-0670.html
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,7 +1 @@
-Cause: Function qla2x00_poll does local_irq_save() before calling qla24xx_intr_handler which has a spinlock. Since spinlocks are sleepable on rt, it is not allowed to call them with interrupts disabled.
+Previously, the qla2x00_poll() function did the local_irq_save() call before calling qla24xx_intr_handler(), which had a spinlock. Since spinlocks are sleepable in the real-time kernel, it is not allowed to call them with interrupts disabled. This scenario produced error messages and could cause a system deadlock. With this update, the local_irq_save_nort(flags) function is used to save flags without disabling interrupts, which prevents potential deadlocks and removes the error messages.
-
-Consequence: BUG: sleeping function called from invalid context at kernel/rtmutex.c:646 reported multiple times.
-
-Fix: Use local_irq_save_nort(flags) to save flags without disabling interrupts.
-
-Result: Potential deadlock is avoided, and the error message goes away.