Bug 245535

Summary: wait_for_completion scheduling with irqs disabled
Product: Red Hat Enterprise MRG
Reporter: IBM Bug Proxy <bugproxy>
Component: realtime-kernel
Assignee: Steven Rostedt <srostedt>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low    
Version: 1.0   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: 2.6.21-35.el5rt
Doc Type: Bug Fix
Last Closed: 2007-07-26 14:41:17 UTC
Attachments:
Description | Flags
second patch for fixing migrating softirqs | none
Patch to not run a softirq from a hardirq directly | none

Description IBM Bug Proxy 2007-06-25 08:30:46 UTC
LTC Owner is: jstultz.com
LTC Originator is: jstultz.com


Problem description:

Running overnight tests with 2.6.21-31.el5rt on rt-apple, I came across a number
of the following messages:

BUG: scheduling with irqs disabled: IRQ-28/0x00000000/2290
caller is wait_for_completion+0x82/0xc1

Call Trace:
 [<ffffffff8026d4e9>] dump_trace+0xaa/0x32a
 [<ffffffff8026d7aa>] show_trace+0x41/0x5c
 [<ffffffff8026d7da>] dump_stack+0x15/0x17
 [<ffffffff802646bc>] schedule+0x82/0x102
 [<ffffffff802647be>] wait_for_completion+0x82/0xc1
 [<ffffffff8028eaf1>] set_cpus_allowed+0xa1/0xc8
 [<ffffffff802958ed>] do_softirq_from_hardirq+0x105/0x12d
 [<ffffffff802c3965>] do_irqd+0x2a4/0x32b
 [<ffffffff80233d76>] kthread+0xf5/0x128
 [<ffffffff8025ff68>] child_rip+0xa/0x12

BUG: scheduling with irqs disabled: IRQ-36/0x00000000/434
caller is wait_for_completion+0x82/0xc1

Call Trace:
 [<ffffffff8026d4e9>] dump_trace+0xaa/0x32a
 [<ffffffff8026d7aa>] show_trace+0x41/0x5c
 [<ffffffff8026d7da>] dump_stack+0x15/0x17
 [<ffffffff802646bc>] schedule+0x82/0x102
 [<ffffffff802647be>] wait_for_completion+0x82/0xc1
 [<ffffffff8028eaf1>] set_cpus_allowed+0xa1/0xc8
 [<ffffffff802958ed>] do_softirq_from_hardirq+0x105/0x12d
 [<ffffffff802c3965>] do_irqd+0x2a4/0x32b
 [<ffffffff80233d76>] kthread+0xf5/0x128
 [<ffffffff8025ff68>] child_rip+0xa/0x12


There were 11 messages in total overnight, most on IRQ-36 and a few on IRQ-28.
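For anyone wanting to repeat the per-IRQ tally on their own logs, here is a minimal sketch that counts these BUG lines by task name. It assumes the three slash-separated fields after the message are the task name, the preempt count in hex, and the PID; the function name `count_bug_messages` and the sample text are only illustrative.

```python
import re
from collections import Counter

# Assumed field layout: task name / preempt count (hex) / PID.
BUG_RE = re.compile(
    r"BUG: scheduling with irqs disabled: "
    r"(?P<task>[^/\s]+)/(?P<preempt>0x[0-9a-fA-F]+)/(?P<pid>\d+)"
)


def count_bug_messages(log_text):
    """Return a Counter mapping task name (e.g. 'IRQ-36') to message count."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BUG_RE.search(line)
        if match:
            counts[match.group("task")] += 1
    return counts


# Two header lines copied from the traces above, used as sample input.
sample = """\
BUG: scheduling with irqs disabled: IRQ-28/0x00000000/2290
caller is wait_for_completion+0x82/0xc1
BUG: scheduling with irqs disabled: IRQ-36/0x00000000/434
caller is wait_for_completion+0x82/0xc1
"""

for task, n in count_bug_messages(sample).most_common():
    print(task, n)
```

Lines where the header is cut off before the PID field are skipped rather than guessed at.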

The box was still up and seemingly running fine in the morning; however, the
network connection had dropped (not sure if that's internal firewall junk).

Not sure if this was the issue Steven brought up w/ his softirq fix.

While testing for ltc bug 35584 / RH bug 244819, I saw this BUG message in
2.6.21.5-rt15, but it was fixed in 2.6.21.5-rt17.

I saw this bug message when I was doing testing on an LS21 (elm3b198 to be
specific).

rostedt claims this issue is already fixed. Will probably show up in the next RH
kernel (but need to confirm). Red Hat, can you please confirm?

Comment 1 IBM Bug Proxy 2007-06-28 15:10:15 UTC
----- Additional Comments From sripathi.com (prefers email at sripathik.com)  2007-06-28 11:05 EDT -------
Seen on 2.6.21-31.el5rt too. 

Comment 2 Guy Streeter 2007-06-28 18:20:37 UTC
I see this on -31 also:

 BUG: scheduling with irqs disabled: IRQ-24/0x00
 caller is wait_for_completion+0x82/0xc1

 Call Trace:
  [<ffffffff8106d4e9>] dump_trace+0xaa/0x32a
  [<ffffffff8106d7aa>] show_trace+0x41/0x5c
  [<ffffffff8106d7da>] dump_stack+0x15/0x17
  [<ffffffff810646bc>] schedule+0x82/0x102
  [<ffffffff810647be>] wait_for_completion+0x82/
  [<ffffffff8108e9ba>] set_cpus_allowed+0xa1/0xc
  [<ffffffff810957b6>] do_softirq_from_hardirq+0
  [<ffffffff810c3837>] do_irqd+0x2a4/0x32b
  [<ffffffff81033d76>] kthread+0xf5/0x128
  [<ffffffff8105ff68>] child_rip+0xa/0x12
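
The trace above lists the same symbols in the same order as the ones in the description; only the addresses differ, and some offsets are cut off. A minimal sketch for comparing traces by symbol name alone follows. The helper name `trace_symbols` is hypothetical, and the sample frames are abbreviated copies of the lines in this report.

```python
import re

# Matches "[<address>] symbol+" and tolerates a truncated offset after the "+".
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+([A-Za-z_]\w*)\+")


def trace_symbols(trace_text):
    """Return the function names from a call trace, outermost frame last."""
    return [m.group(2) for m in FRAME_RE.finditer(trace_text)]


description_trace = """\
 [<ffffffff802646bc>] schedule+0x82/0x102
 [<ffffffff802647be>] wait_for_completion+0x82/0xc1
 [<ffffffff8028eaf1>] set_cpus_allowed+0xa1/0xc8
 [<ffffffff802958ed>] do_softirq_from_hardirq+0x105/0x12d
 [<ffffffff802c3965>] do_irqd+0x2a4/0x32b
"""

comment_trace = """\
  [<ffffffff810646bc>] schedule+0x82/0x102
  [<ffffffff810647be>] wait_for_completion+0x82/
  [<ffffffff8108e9ba>] set_cpus_allowed+0xa1/0xc
  [<ffffffff810957b6>] do_softirq_from_hardirq+0
  [<ffffffff810c3837>] do_irqd+0x2a4/0x32b
"""

same_path = trace_symbols(description_trace) == trace_symbols(comment_trace)
print("same call path:", same_path)
```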

This is the IRQ:
 24:      14058         69   IO-APIC-fasteoi   eth0
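
To map an IRQ thread name from a BUG line back to its device, a line like the one above can be split into its fields. This is only a sketch: it assumes the layout is the IRQ number, one interrupt count per CPU, the interrupt controller type, and then the device name(s), and that the IRQ thread is named "IRQ-" followed by the number, as the messages in this report suggest. The function name `parse_interrupts_line` is hypothetical.

```python
def parse_interrupts_line(line):
    """Split one /proc/interrupts-style line into its assumed fields."""
    head, _, rest = line.partition(":")
    fields = rest.split()
    counts = []
    # Leading numeric fields are taken to be the per-CPU interrupt counts.
    while fields and fields[0].isdigit():
        counts.append(int(fields.pop(0)))
    return {
        "irq": head.strip(),
        "counts": counts,
        "chip": fields[0] if fields else "",
        "devices": " ".join(fields[1:]),
    }


entry = parse_interrupts_line(" 24:      14058         69   IO-APIC-fasteoi   eth0")
thread_name = "IRQ-" + entry["irq"]
print(thread_name, "->", entry["devices"])
```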


Comment 3 Clark Williams 2007-07-26 14:41:17 UTC
Patch from srostedt applied; fixed

Comment 4 IBM Bug Proxy 2007-09-26 23:15:25 UTC
------- Comment From dvhltc.com 2007-09-26 19:10 EDT-------
We managed to get a patch for the other issue (from peter z), let's push redhat
for this patch as well.  I'd like to close this by next week, marking required
date accordingly.

Comment 5 Clark Williams 2007-09-27 14:15:59 UTC
Created attachment 208481
second patch for fixing migrating softirqs

This is the first of two patches that address this bug

Comment 6 Clark Williams 2007-09-27 14:17:02 UTC
Created attachment 208491
Patch to not run a softirq from a hardirq directly

Second patch

Comment 7 IBM Bug Proxy 2008-01-31 10:24:40 UTC
------- Comment From sripathi.com 2008-01-31 05:23 EDT-------
Closing.