237463 – BUG: scheduling with irqs disabled: strace/0x00000000/2011

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	Red Hat Enterprise MRG
Classification:	Red Hat
Component:	realtime-kernel
Sub Component:
Version:	1.0
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	urgent
Target Milestone:	---
Target Release:	---
Assignee:	Arnaldo Carvalho de Melo
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-04-23 12:14 UTC by IBM Bug Proxy
Modified:	2008-02-27 19:58 UTC
CC List:	1 user
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2007-05-22 01:38:15 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
IBM Linux Technology Center	33926	0	None	None	None	Never

Description IBM Bug Proxy 2007-04-23 12:14:46 UTC

LTC Owner is: sripathi.com
LTC Originator is: sripathi.com


I was running strace on the pthread_cond_many testcase when I saw a number of
the following BUG messages in dmesg. I have not yet tested this with the default
(non-RT) RHEL5 kernel.

BUG: scheduling with irqs disabled: strace/0x00000000/2011
caller is rt_spin_lock_slowlock+0x102/0x1af

Call Trace:
 [<ffffffff8026d828>] dump_trace+0xbd/0x3d8
 [<ffffffff8026db87>] show_trace+0x44/0x6d
 [<ffffffff8026ddc8>] dump_stack+0x13/0x15
 [<ffffffff80264dc6>] schedule+0x87/0x10b
 [<ffffffff80265b06>] rt_spin_lock_slowlock+0x102/0x1af
 [<ffffffff802661af>] rt_spin_lock+0x1f/0x21
 [<ffffffff8029af0c>] force_sig_info+0x26/0xb5
 [<ffffffff8029b018>] force_sig_specific+0x11/0x13
 [<ffffffff80298659>] ptrace_attach+0xdf/0x10b
 [<ffffffff802986d7>] sys_ptrace+0x52/0xb8
 [<ffffffff8025f42c>] tracesys+0x151/0x1be
 [<00000034ecec71c9>]

---------------------------
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
----------------------------------------
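When collecting many of these messages from dmesg, it can help to split the header line into its fields. The sketch below is a userspace helper, not kernel code. It assumes the three slash-separated fields are the command name, the preempt count in hex, and the PID; that reading is inferred from the "preempt count: 00000000" line and the strace process in this report. The names sched_bug and parse_sched_bug are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of a "scheduling with irqs disabled" header line (assumed layout). */
struct sched_bug {
    char comm[16];          /* command name; kernel task names fit in 16 bytes */
    unsigned int preempt;   /* middle field, read as hex */
    int pid;                /* last field, read as decimal */
};

/* Returns 1 if the line matches the assumed message format, 0 otherwise. */
int parse_sched_bug(const char *line, struct sched_bug *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line,
                  "BUG: scheduling with irqs disabled: %15[^/]/0x%x/%d",
                  out->comm, &out->preempt, &out->pid) == 3;
}
```

Passing the header line from this report to parse_sched_bug should fill in the command "strace", a preempt count of 0 and PID 2011.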

Kernel: 2.6.20-0119.rt8
glibc: glibc-2.5-12
Hardware: LS21
Kernel cmdline: ro root=LABEL=/1 rhgb quiet acpi=noirq
Recreation steps:
Start pthread_cond_many testcase. From another terminal, attach strace to
pthread_cond_many process. Very soon, BUGs appear in dmesg.

I think I know the root cause of this problem. I'll post details soon.

In ptrace_attach, this is what happens:

task_lock
local_irq_disable
write_lock(tasklist_lock)   (taken via a trylock retry loop)
	Some work 
	__ptrace_link
	Send SIGSTOP to target thread
write_unlock_irq(tasklist_lock)
task_unlock

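On mainline, local_irq_disable followed by write_lock behaves like
write_lock_irq, and write_unlock_irq re-enables interrupts. On -rt, however,
write_unlock_irq does not call local_irq_enable, so the interrupts we explicitly
disabled with local_irq_disable stay disabled! In addition, SIGSTOP is sent
while interrupts are still disabled, and force_sig_info takes a lock that can
sleep on -rt (rt_spin_lock_slowlock in the trace above), which is where the BUG
is reported.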
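The difference in interrupt state can be illustrated with a toy userspace model. This is only a sketch: a single boolean stands in for the CPU's interrupt-enable flag, and every model_* function is a hypothetical stand-in, not the kernel implementation. The -rt behaviour is modelled purely from the description in this report.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag stands in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

static void model_local_irq_disable(void)
{
    irqs_enabled = false;
}

/* Mainline-style unlock: releasing the lock also re-enables interrupts. */
static void model_write_unlock_irq_mainline(void)
{
    irqs_enabled = true;
}

/* -rt-style unlock as described in this report: irq state is not touched. */
static void model_write_unlock_irq_rt(void)
{
    /* no local_irq_enable() here */
}

/* Replays the ptrace_attach sequence and returns the final irq state. */
bool irqs_enabled_after_attach(bool rt_kernel)
{
    irqs_enabled = true;
    model_local_irq_disable();      /* explicit disable before the trylock */
    assert(!irqs_enabled);
    /* ... __ptrace_link and SIGSTOP would happen here ... */
    if (rt_kernel)
        model_write_unlock_irq_rt();
    else
        model_write_unlock_irq_mainline();
    return irqs_enabled;
}
```

In this model, irqs_enabled_after_attach(false) ends with interrupts enabled, while irqs_enabled_after_attach(true) leaves them disabled, which is the leftover state described above.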

To fix the problem, I think we should call write_unlock(tasklist_lock) and
local_irq_enable() instead of write_unlock_irq. Also, we should call them BEFORE
sending SIGSTOP to the target thread. I think there is no need to hold the
tasklist lock while sending SIGSTOP.

For the vanilla kernel too, I think we should do write_unlock_irq(tasklist_lock)
before sending SIGSTOP.

The following patch solves the problem on 2.6.20-rt8. I want to send this to
LKML/Ingo soon. Does anyone have comments?

--- linux-2.6.20.x86_64_org/kernel/ptrace.c     2007-04-19 18:19:37.000000000 +0530
+++ linux-2.6.20.x86_64/kernel/ptrace.c 2007-04-19 16:43:32.000000000 +0530
@@ -205,10 +205,16 @@ repeat:

        __ptrace_link(task, current);

+       write_unlock(&tasklist_lock);
+       local_irq_enable();
+
        force_sig_specific(SIGSTOP, task);
+       goto out2;

 bad:
-       write_unlock_irq(&tasklist_lock);
+       write_unlock(&tasklist_lock);
+       local_irq_enable();
+out2:
        task_unlock(task);
 out:
        return retval;
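The effect of the reordering in the patch above can be sketched with a small userspace model. It is an illustration under stated assumptions, not kernel code: cpu_model, model_sleeping_lock and the two attach_* functions are hypothetical names, and "sending SIGSTOP" is reduced to one lock acquisition that may sleep, as the call trace suggests for -rt.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_model {
    bool irqs_enabled;      /* stand-in for the interrupt-enable flag */
    bool tasklist_locked;   /* stand-in for holding tasklist_lock */
    int  bug_reports;       /* times a possible sleep happened with irqs off */
};

/* Stand-in for a lock acquisition that may sleep on -rt. */
static void model_sleeping_lock(struct cpu_model *c)
{
    if (!c->irqs_enabled)
        c->bug_reports++;   /* "scheduling with irqs disabled" */
}

/* force_sig_info takes such a lock in the trace from this report. */
static void model_send_sigstop(struct cpu_model *c)
{
    model_sleeping_lock(c);
}

/* Original order: SIGSTOP is sent before the unlock, with irqs off. */
int attach_original_order(void)
{
    struct cpu_model c = { true, false, 0 };

    c.irqs_enabled = false;      /* local_irq_disable() */
    c.tasklist_locked = true;    /* write_trylock() succeeded */
    model_send_sigstop(&c);
    c.tasklist_locked = false;   /* write_unlock_irq(); no irq enable on -rt */
    assert(!c.tasklist_locked);
    return c.bug_reports;
}

/* Patched order: unlock and re-enable irqs first, then send SIGSTOP. */
int attach_patched_order(void)
{
    struct cpu_model c = { true, false, 0 };

    c.irqs_enabled = false;      /* local_irq_disable() */
    c.tasklist_locked = true;    /* write_trylock() succeeded */
    c.tasklist_locked = false;   /* write_unlock() */
    c.irqs_enabled = true;       /* local_irq_enable() */
    model_send_sigstop(&c);
    assert(!c.tasklist_locked);
    return c.bug_reports;
}
```

In this model the original order records one report and the patched order records none. The model says nothing about whether dropping tasklist_lock before the signal is safe; that is the question raised in the next comment.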



What if some other process is reading the tasklist at the time you are sending
it the stop signal? Will the code in force_sig_specific take care of that with
its own locking?

Sripathi, thanks for clarifying it offline!

I have posted this to LKML/Ingo: http://lkml.org/lkml/2007/04/20/41

Comment 1 IBM Bug Proxy 2007-05-11 14:55:35 UTC

----- Additional Comments From sripathi.com (prefers email at sripathik.com)  2007-05-11 10:49 EDT -------
I got no reply from Ingo or anyone else to my earlier mail (Apr 20). Hence I
tried to fix it another way, by introducing a write_trylock_irqsave API in
mainline and -rt. Mainline patches are at http://lkml.org/lkml/2007/05/09/76 and
http://lkml.org/lkml/2007/05/09/79 . -rt patches are at
http://lkml.org/lkml/2007/05/10/47 and http://lkml.org/lkml/2007/05/10/48 .

The mainline patches have been accepted into -mm. I am awaiting a response on
the -rt patches.
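For readers unfamiliar with the irqsave pattern, here is a toy userspace sketch of what a trylock with irqsave semantics is expected to do: save the interrupt state, disable interrupts, try the lock, and restore the saved state if the lock is busy. It is not copied from the patches linked above. All model_* names are hypothetical, and on -rt the real variants may leave the hardware interrupt state alone.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;   /* toy interrupt-enable flag */

struct model_rwlock {
    bool write_held;
};

/* Non-blocking attempt to take the lock for writing. */
static bool model_write_trylock(struct model_rwlock *l)
{
    if (l->write_held)
        return false;
    l->write_held = true;
    return true;
}

/* Save irq state, disable irqs, try the lock; on failure restore the state. */
bool model_write_trylock_irqsave(struct model_rwlock *l, bool *flags)
{
    *flags = irqs_enabled;         /* like local_irq_save() */
    irqs_enabled = false;
    if (model_write_trylock(l))
        return true;
    irqs_enabled = *flags;         /* like local_irq_restore() */
    return false;
}

/* Release the lock and put the irq state back exactly as it was saved. */
void model_write_unlock_irqrestore(struct model_rwlock *l, bool flags)
{
    assert(l->write_held);
    l->write_held = false;
    irqs_enabled = flags;
}

bool model_irqs_enabled(void)
{
    return irqs_enabled;
}
```

The point of the pattern is that the caller no longer pairs an explicit local_irq_disable with an unlock that may or may not re-enable interrupts. The state saved at lock time is what gets restored at unlock time.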

Comment 2 Arnaldo Carvalho de Melo 2007-05-15 15:11:55 UTC

Unable to reproduce this with 2.6.21-4.el5rtdebug and 2.6.21-3.el5rt. Checked
kernel/ptrace.c; it doesn't have your patches. I'm using
http://www.kernel.org/pub/linux/kernel/people/dvhart/realtime/tests/tests.tar.bz2
with './run.sh all'. I waited for the "./pthread_cond_many --broadcast 400 5000"
processes to start and ran strace on them: no BUG messages. The machine is a
Dell PowerEdge 1950 with two dual-core Xeon processors. Will now try with the
same kernel as you used (2.6.20-0119.rt8).

Comment 3 Arnaldo Carvalho de Melo 2007-05-15 15:49:11 UTC

Tried with
http://people.redhat.com/mingo/realtime-preempt/yum/x86_64/kernel-rt-2.6.20-0119.rt8.x86_64.rpm:

[root@mica ~]# uname -r
2.6.20-0119.rt8

And couldn't reproduce with it either.

I'm running it now with this patch:

[root@mica latency]# diff -u pthread_cond_many.sh.orig pthread_cond_many.sh
--- pthread_cond_many.sh.orig   2007-05-15 12:47:11.000000000 -0300
+++ pthread_cond_many.sh        2007-05-15 12:48:27.000000000 -0300
@@ -9,11 +9,11 @@
 nproc=5

 i=0
-./pthread_cond_many $1 --broadcast $iter $nthread > 2100.$i.out &
+strace -f ./pthread_cond_many $1 --broadcast $iter $nthread > 2100.$i.out 2> /dev/null &
 i=1
 while test $i -lt $nproc
 do
-       ./pthread_cond_many --broadcast $iter $nthread > 2100.$i.out &
+       strace -f ./pthread_cond_many --broadcast $iter $nthread > 2100.$i.out 2> /dev/null &
        i=`expr $i + 1`
 done
 wait
[root@mica latency]# pwd
/home/acme/rt/IBM/rtlinux-tests/perf/latency
[root@mica latency]#

and running it like this:

[root@mica latency]# pwd
/home/acme/rt/IBM/rtlinux-tests/perf/latency
[root@mica latency]# ./pthread_cond_many.sh --realtime

Comment 4 IBM Bug Proxy 2007-05-16 05:50:33 UTC

------- Additional Comments From sripathi.com (prefers email at sripathik.com)  2007-05-16 01:44 EDT -------
(In reply to comment #13)
> ----- Additional Comments From acme  2007-05-15 11:11 EST -------
> Unable to reproduce this with 2.6.21-4.el5rtdebug and 2.6.21-3.el5rt. Checked
> kernel/ptrace.c, it doesn't have your patches. I'm using
> http://www.kernel.org/pub/linux/kernel/people/dvhart/realtime/tests/tests.tar.bz2
> with './run.sh all', wait for the "./pthread_cond_many --broadcast 400 5000"
> processes to start, ran strace on them, no BUG messages. Machine is a Dell
> PowerEdge 1950 with two dual core Xeon processors. Will try now with the same
> kernel as you used (2.6.20-0119.rt8).

I tried exactly the same just now and reproduced the problem on 2.6.21-2.el5rt.
I pulled down the tests from kernel.org, started the tests by hand using 
"./pthread_cond_many --broadcast 400 5000" 
and ran 
"strace -f -v -o strace.out <pid of first pthread_cond_many process>"

and immediately I see a bunch of BUGs in dmesg.

My hardware is LS20 blade, but I don't think the problem is hardware dependent.

Comment 5 Arnaldo Carvalho de Melo 2007-05-16 12:11:48 UTC

Tried now with 2.6.21-4.el5rt using exactly the sequence described in your
latest entry in this ticket: got the BUGs. Strange, the only difference from my
earlier test is running ./pthread_cond_many directly instead of through the
shell script. Anyway, it is reproduced. I will now apply your patches to the rt
kernel rpm, rebuild and retest. Thanks.

Comment 6 Arnaldo Carvalho de Melo 2007-05-16 19:40:24 UTC

Did it: the BUGs are gone, and from my perspective the patches are OK. I will
talk with Steven Rostedt for a second opinion, then ask Clark to put those
patches in our 2.6.21-rt kernel rpms and ask Ingo to consider them for upstream
rt-preempt acceptance. Thanks!

Comment 7 Arnaldo Carvalho de Melo 2007-05-22 01:38:15 UTC

Patches were merged, at least the 2.6.21-rt6 patch has it, thanks a lot for
submitting them! It is already merged in the internal repo for kernel-rt and
should be included in the 2.6.21-11.el5rt kernel-rt rpm release.

Comment 8 IBM Bug Proxy 2007-05-24 16:40:23 UTC

----- Additional Comments From sripathi.com (prefers email at sripathik.com)  2007-05-24 12:37 EDT -------
I have tested this with the 2.6.21-14.el5rt kernel (which I believe contains
Ingo's patch-2.6.21-rt7) and the problem is no longer seen. strace does not
produce any BUGs now. Thanks!
-Sripathi.
