LTC Owner is: email@example.com
LTC Originator is: firstname.lastname@example.org
I was running strace on the pthread_cond_many testcase when I saw a number of the
following BUG messages in dmesg. I have not yet tested this with default
BUG: scheduling with irqs disabled: strace/0x00000000/2011
caller is rt_spin_lock_slowlock+0x102/0x1af
| preempt count: 00000000 ]
| 0-level deep critical section nesting:
Kernel cmdline: ro root=LABEL=/1 rhgb quiet acpi=noirq
Start pthread_cond_many testcase. From another terminal, attach strace to
pthread_cond_many process. Very soon, BUGs appear in dmesg.
I think I know the root cause of this problem. I'll post details soon.
In ptrace_attach, this is what happens:
local_irq_disable
write_lock(tasklist_lock), taken using trylocks
Send SIGSTOP to target thread
write_unlock_irq(tasklist_lock)
On mainline, local_irq_disable followed by write_lock behaves like
write_lock_irq, so the final write_unlock_irq re-enables interrupts. On -rt,
however, write_unlock_irq doesn't do local_irq_enable. Since we have explicitly
called local_irq_disable, interrupts remain disabled!
To fix the problem, I think we should call write_unlock(tasklist_lock) and
local_irq_enable() instead of write_unlock_irq. Also, we should call them BEFORE
sending SIGSTOP to the target thread. I think there is no need to hold the
tasklist lock during sending of SIGSTOP.
For the vanilla kernel too, I think we should do write_unlock_irq(tasklist_lock)
before sending SIGSTOP.
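A minimal user-space sketch of the ordering problem described above. It is not kernel code: the toy_ functions, the model_rt switch, and the two attach_ helpers are hypothetical stand-ins that only track one interrupt flag and one lock bit, assuming that write_unlock_irq re-enables interrupts on mainline and leaves them alone on -rt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state standing in for one CPU's interrupt flag and tasklist_lock. */
static bool irqs_enabled = true;
static bool tasklist_write_locked = false;

/* Hypothetical switch: true models -rt, where write_unlock_irq is assumed
 * to leave the interrupt flag untouched. */
static bool model_rt = false;

static void toy_local_irq_disable(void) { irqs_enabled = false; }
static void toy_local_irq_enable(void)  { irqs_enabled = true; }

static bool toy_write_trylock(void)
{
    if (tasklist_write_locked)
        return false;
    tasklist_write_locked = true;
    return true;
}

static void toy_write_unlock(void)
{
    assert(tasklist_write_locked);
    tasklist_write_locked = false;
}

static void toy_write_unlock_irq(void)
{
    toy_write_unlock();
    if (!model_rt)
        toy_local_irq_enable();   /* mainline behaviour only */
}

/* Current ordering: returns whether interrupts are enabled afterwards. */
bool attach_current(bool rt)
{
    model_rt = rt;
    irqs_enabled = true;
    toy_local_irq_disable();
    if (!toy_write_trylock()) {
        toy_local_irq_enable();
        return irqs_enabled;
    }
    /* SIGSTOP would be sent to the target thread here. */
    toy_write_unlock_irq();
    return irqs_enabled;
}

/* Proposed ordering: plain unlock plus explicit enable, SIGSTOP afterwards. */
bool attach_proposed(bool rt)
{
    model_rt = rt;
    irqs_enabled = true;
    toy_local_irq_disable();
    if (!toy_write_trylock()) {
        toy_local_irq_enable();
        return irqs_enabled;
    }
    toy_write_unlock();
    toy_local_irq_enable();
    /* SIGSTOP would be sent here, with the lock already dropped. */
    return irqs_enabled;
}
```

In this model attach_current(true) leaves the flag disabled, which is the state in which a later sleeping lock would trigger the "scheduling with irqs disabled" BUG, while attach_proposed leaves it enabled in both modes.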
The following patch solves the problem on 2.6.20-rt8. I want to send this to
LKML/Ingo soon. Does anyone have comments?
--- linux-2.6.20.x86_64_org/kernel/ptrace.c 2007-04-19 18:19:37.000000000 +0530
+++ linux-2.6.20.x86_64/kernel/ptrace.c 2007-04-19 16:43:32.000000000 +0530
@@ -205,10 +205,16 @@ repeat:
+ goto out2;
What if some other process is reading the task_list at the time you are sending
it the stop signal? Will the code in force_sig_specific take care of that by its
Sripathi, thanks for clarifying it offline!
I have posted this to LKML/Ingo: http://lkml.org/lkml/2007/04/20/41
----- Additional Comments From email@example.com (prefers email at firstname.lastname@example.org) 2007-05-11 10:49 EDT -------
I got no reply from Ingo or anyone else to my earlier mail (Apr 20). Hence I
tried to fix it another way, by introducing a write_trylock_irqsave API in
mainline and -rt. Mainline patches are at http://lkml.org/lkml/2007/05/09/76 and
http://lkml.org/lkml/2007/05/09/79. -rt patches are at
http://lkml.org/lkml/2007/05/10/47 and http://lkml.org/lkml/2007/05/10/48.
The mainline patches have been accepted into -mm. I am awaiting response for -rt
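For readers unfamiliar with the idea, here is a hedged user-space sketch of what a trylock-with-irqsave operation is generally expected to do: save the interrupt state, disable interrupts, try the lock, and restore the saved state if the lock is contended. The toy_ names and the irq_flags_t type are hypothetical and do not reproduce the posted patches; how -rt maps the operation is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state: one interrupt flag and one writer lock bit. */
static bool irqs_enabled = true;
static bool write_locked = false;

typedef bool irq_flags_t;   /* hypothetical: the saved interrupt state */

bool toy_irqs_enabled(void) { return irqs_enabled; }

static irq_flags_t toy_local_irq_save(void)
{
    irq_flags_t saved = irqs_enabled;
    irqs_enabled = false;
    return saved;
}

static void toy_local_irq_restore(irq_flags_t saved)
{
    irqs_enabled = saved;
}

/* Returns true with interrupts disabled if the lock was taken; on failure
 * the caller's previous interrupt state is put back before returning. */
bool toy_write_trylock_irqsave(irq_flags_t *flags)
{
    *flags = toy_local_irq_save();
    if (write_locked) {
        toy_local_irq_restore(*flags);
        return false;
    }
    write_locked = true;
    return true;
}

/* Drops the lock and restores whatever state the matching trylock saved. */
void toy_write_unlock_irqrestore(irq_flags_t flags)
{
    assert(write_locked);
    write_locked = false;
    toy_local_irq_restore(flags);
}
```

The point of pairing save with restore, rather than disable with enable, is that the unlock path never has to guess what the interrupt state was before the lock was taken.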
Unable to reproduce this with 2.6.21-4.el5rtdebug and 2.6.21-3.el5rt. Checked
kernel/ptrace.c, it doesn't have your patches. I'm using
with './run.sh all', wait for the "./pthread_cond_many --broadcast 400 5000"
processes to start, ran strace on them, no BUG messages. Machine is a Dell
PowerEdge 1950 with two dual-core Xeon processors. Will try now with the same
kernel as you used (2.6.20-0119.rt8).
[root@mica ~]# uname -r
And couldn't reproduce with it either.
I'm running it now with this patch:
[root@mica latency]# diff -u pthread_cond_many.sh.orig pthread_cond_many.sh
--- pthread_cond_many.sh.orig 2007-05-15 12:47:11.000000000 -0300
+++ pthread_cond_many.sh 2007-05-15 12:48:27.000000000 -0300
@@ -9,11 +9,11 @@
-./pthread_cond_many $1 --broadcast $iter $nthread > 2100.$i.out &
+strace -f ./pthread_cond_many $1 --broadcast $iter $nthread > 2100.$i.out 2>
while test $i -lt $nproc
- ./pthread_cond_many --broadcast $iter $nthread > 2100.$i.out &
+ strace -f ./pthread_cond_many --broadcast $iter $nthread > 2100.$i.out 2> /dev/null &
i=`expr $i + 1`
[root@mica latency]# pwd
and running it like this:
[root@mica latency]# pwd
[root@mica latency]# ./pthread_cond_many.sh --realtime
------- Additional Comments From email@example.com (prefers email at firstname.lastname@example.org) 2007-05-16 01:44 EDT -------
(In reply to comment #13)
> ----- Additional Comments From email@example.com 2007-05-15 11:11 EST -------
> Unable to reproduce this with 2.6.21-4.el5rtdebug and 2.6.21-3.el5rt. Checked
> kernel/ptrace.c, it doesn't have your patches. I'm using
> with './run.sh all', wait for the "./pthread_cond_many --broadcast 400 5000"
> processes to start, ran strace on them, no BUG messages. Machine is a Dell
> PowerEdge 1950 with two dual-core Xeon processors. Will try now with the same
> kernel as you used (2.6.20-0119.rt8).
I tried exactly the same just now and reproduced the problem on 2.6.21-2.el5rt.
I pulled down the tests from kernel.org, started the tests by hand using
"./pthread_cond_many --broadcast 400 5000"
"strace -f -v -o strace.out <pid of first pthread_cond_many process>"
and immediately I see a bunch of BUGs in dmesg.
My hardware is an LS20 blade, but I don't think the problem is hardware dependent.
Tried now with 2.6.21-4.el5rt using exactly the same sequence described in your
latest entry in this ticket: got the BUGs. Will now apply your patches to the rt
kernel rpm and retest. Strangely, the only difference from my earlier test is
that ./pthread_cond_many is run directly instead of through the shell script.
Anyway, it is reproduced and I am rebuilding the rpm with your patches, thanks.
Did it. The BUGs are gone and from my perspective the patches are OK. I will
talk with Steven Rostedt for a second opinion, then ask Clark to put those
patches in our 2.6.21-rt kernel rpms and ask Ingo to consider them for upstream
rt-preempt acceptance, thanks!
Patches were merged, at least the 2.6.21-rt6 patch has it, thanks a lot for
submitting them! It is already merged in the internal repo for kernel-rt and
should be included in the 2.6.21-11.el5rt kernel-rt rpm release.
----- Additional Comments From firstname.lastname@example.org (prefers email at email@example.com) 2007-05-24 12:37 EDT -------
I have tested this with the 2.6.21-14.el5rt kernel (which I believe contains
Ingo's patch-2.6.21-rt7) and the problem is no longer seen. strace does not
produce any BUGs now. Thanks!