Bug 242422
| Summary: | BUG: at kernel/softirq.c:639 __tasklet_action() | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | IBM Bug Proxy <bugproxy> | ||||||
| Component: | realtime-kernel | Assignee: | Steven Rostedt <srostedt> | ||||||
| Status: | CLOSED DUPLICATE | QA Contact: | |||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | low | ||||||||
| Version: | 1.0 | CC: | sshtylyov | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2007-06-07 00:48:50 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
|
Description
IBM Bug Proxy
2007-06-04 09:35:44 UTC
----- Additional Comments From ankigarg.com (prefers email at ankita.com) 2007-06-04 09:40 EDT ------- Seems to be some locking issue..digging further. ----- Additional Comments From ankigarg.com (prefers email at ankita.com) 2007-06-04 11:09 EDT ------- Ok, I think I know the reason of the warning messages. Will generate a patch and post for review. Created attachment 156165 [details]
rostedt.patch
----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-04 21:12 EDT ------- test patch from Steven Rostedt Here's the test patch from Steven Rostedt that I tested against RH's 2.6.21-23 kernel. Runninng kernbench easily triggered the issue on a 4 way LS21, so it didn't work, but I wanted to attach it so folks know what has been tried. ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-04 21:37 EDT ------- Other interesting tidbit, I had a hard time reproducing this until I set HZ=1000 in my .config. ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-04 23:20 EDT -------
So I added some debug code to print out the function name of the affected
tasklet and I got:
tasklet_bug: t=0xffffffff81ffd4d0 t->func=0xffffffff810d70b0
BUG: at kernel/softirq.c:641 __tasklet_action()
Call Trace:
[<ffffffff810a45d0>] __tasklet_action+0xb0/0x150
[<ffffffff810a4fdd>] ksoftirqd+0x17d/0x2a0
[<ffffffff810a4e60>] ksoftirqd+0x0/0x2a0
[<ffffffff810364ba>] kthread+0x12a/0x170
[<ffffffff8102a992>] schedule_tail+0x82/0x130
[<ffffffff81065f68>] child_rip+0xa/0x12
[<ffffffff81036390>] kthread+0x0/0x170
[<ffffffff81065f5e>] child_rip+0x0/0x12
Looking up t->func=0xffffffff810d70b0 in System.map (and w/ gdb vmlinux list)
give me: rcu_process_callbacks
Paul, does that help narrow it down at all?
----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 00:12 EDT ------- Since nothing calls tasklet_kill on the rcu_tasklet, I'm not sure if tasklet_kill's manipulation of t->state is the issue, as some have theorized. Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the problem. It seems to be the only spot where we don't take care to use a test_and_set/clear type accessor and overwrite other bits. It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is called from the sched_tick_timer is likely suspect. ------- Additional Comments From ankigarg.com (prefers email at ankita.com) 2007-06-05 01:05 EDT ------- (In reply to comment #13) > Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the > problem. It seems to be the only spot where we don't take care to use a > test_and_set/clear type accessor and overwrite other bits. The intention is to overwrite all the bits and have only the bit indicating that the tasklet is pending. Right after setting that bit, the code continues with the next tasklet on the list. On a subsequent tasklet_enable(), the TASKLET_STATE_SCHED bit is set again. So, this might not be the issue. ----- Additional Comments From ankigarg.com (prefers email at ankita.com) 2007-06-05 09:41 EDT -------
Ok, some thoughts.
The only two places where the TASKLET_STATE_SCHED bit is being cleared are in
tasklet_kill and __tasklet_action itself. The clearing of the
TASKLET_STATE_SCHED in __tasklet_action should be done while the
TASKLET_STATE_RUN is set so that the t->state is not messed up.
while (!tasklet_tryunlock(t)) {
/*
* If it got disabled meanwhile, bail out:
*/
if (atomic_read(&t->count))
goto out_disabled;
/*
* If it got scheduled meanwhile, re-execute
* the tasklet function:
*/
if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
goto again;
if (!--loops) {
printk("hm, tasklet state: %08lx
", t->state);
WARN_ON(1);
tasklet_unlock(t);
break;
}
}
I think the problem lies in the above loop. I plan to try basically the
following by suitably modifying the locking:
if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
goto again;
while (!tasklet_tryunlock(t)) {
/*
* If it got disabled meanwhile, bail out:
*/
if (atomic_read(&t->count))
goto out_disabled;
/*
* If it got scheduled meanwhile, re-execute
* the tasklet function:
*/
if (!--loops) {
printk("hm, tasklet state: %08lx
", t->state);
WARN_ON(1);
tasklet_unlock(t);
break;
}
}
I am not sure if that will work.
----- Additional Comments From ankigarg.com (prefers email at ankita.com) 2007-06-05 09:43 EDT -------
> It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is
> called from the sched_tick_timer is likely suspect.
>
Then this would imply that the locking while setting/resetting the state is not
proper...
----- Additional Comments From dvhltc.com 2007-06-05 10:48 EDT ------- rostedt asked for the loaded module list on irc today: This is from an LS20 running 2.6.21-23.el5rt that readily repeats this trace. dvhart@ltcrt3:~/linux-rt-tests/internal/stress/kernbench/kernbench-0.30$ /sbin/lsmod Module Size Used by autofs4 59784 2 hidp 88832 2 rfcomm 110888 0 l2cap 92672 10 hidp,rfcomm bluetooth 129028 5 hidp,rfcomm,l2cap sunrpc 234472 1 nf_conntrack_netbios_ns 36608 0 ipt_REJECT 37888 1 nf_conntrack_ipv4 55056 3 xt_state 35968 3 iptable_filter 36480 1 ip_tables 56672 1 iptable_filter ip6t_REJECT 38656 1 xt_tcpudp 36864 12 ip6table_filter 36224 1 ip6_tables 50496 1 ip6table_filter x_tables 56328 6 ipt_REJECT,xt_state,ip_tables,ip6t_REJECT,xt_tcpudp,ip6_tables ipv6 493664 23 ip6t_REJECT dm_mirror 57408 0 dm_mod 104208 1 dm_mirror video 54288 0 sbs 51544 0 i2c_ec 39552 1 sbs dock 45640 0 button 42912 0 battery 45320 0 asus_acpi 53548 0 ac 39688 0 parport_pc 65064 0 lp 49040 0 parport 77708 2 parport_pc,lp sg 73256 0 amd_rng 36360 0 pcspkr 36992 0 i2c_amd756 41604 0 i2c_core 60160 2 i2c_ec,i2c_amd756 shpchp 70556 0 k8temp 39808 0 hwmon 37128 1 k8temp tg3 151044 0 rtc_cmos 41896 0 rtc_core 45832 1 rtc_cmos rtc_lib 37248 1 rtc_core serio_raw 41348 0 mptspi 54416 2 mptscsih 57728 1 mptspi mptbase 92896 2 mptspi,mptscsih scsi_transport_spi 63360 1 mptspi sd_mod 56704 3 scsi_mod 203576 5 sg,mptspi,mptscsih,scsi_transport_spi,sd_mod ext3 177808 1 jbd 101616 1 ext3 ehci_hcd 70412 0 ohci_hcd 56708 0 uhci_hcd 60960 0 ------- Additional Comments From Paul.McKenney.com (prefers email at paulmck.com) 2007-06-05 10:53 EDT ------- (In reply to comment #13) > Since nothing calls tasklet_kill on the rcu_tasklet, I'm not sure if > tasklet_kill's manipulation of t->state is the issue, as some have theorized. > > Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the > problem. It seems to be the only spot where we don't take care to use a > test_and_set/clear type accessor and overwrite other bits. > > It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is > called from the sched_tick_timer is likely suspect. Calling tasklet_schedule() from rcu_check_callbacks() might be an issue -- but tasklet_disable() is not called from RCU, so the TASKLET_STATEF_PENDING should not ever happen for RCU. Will look some more... ------- Additional Comments From Paul.McKenney.com (prefers email at paulmck.com) 2007-06-05 12:20 EDT ------- (In reply to comment #15) > Ok, some thoughts. > > The only two places where the TASKLET_STATE_SCHED bit is being cleared are in > tasklet_kill and __tasklet_action itself. The clearing of the > TASKLET_STATE_SCHED in __tasklet_action should be done while the > TASKLET_STATE_RUN is set so that the t->state is not messed up. > > > while (!tasklet_tryunlock(t)) { > /* > * If it got disabled meanwhile, bail out: > */ > if (atomic_read(&t->count)) > goto out_disabled; > /* > * If it got scheduled meanwhile, re-execute > * the tasklet function: > */ > if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)) > goto again; > if (!--loops) { > printk("hm, tasklet state: %08lx ", t->state); > WARN_ON(1); > tasklet_unlock(t); > break; > } > } > > I think the problem lies in the above loop. I plan to try basically the > following by suitably modifying the locking: > > if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)) > goto again; > > while (!tasklet_tryunlock(t)) { > /* > * If it got disabled meanwhile, bail out: > */ > if (atomic_read(&t->count)) > goto out_disabled; > /* > * If it got scheduled meanwhile, re-execute > * the tasklet function: > */ > > if (!--loops) { > printk("hm, tasklet state: %08lx ", t->state); > WARN_ON(1); > tasklet_unlock(t); > break; > } > } > > I am not sure if that will work. This approach has a race -- suppose that the tasklet is rescheduled just after the test_and_clear_bit(). Because rescheduling the tasklet sets the TASKLET_STATE_SCHED bit, the tasklet_tryunlock() can never succeed. One approach would be to replace the straight stores with cmpxchg, allowing better error checking. Another approach would be to have tasklet_kill() set a bit in the ->state word, allowing the other operations to fail when performed on a zombie tasklet. ------- Additional Comments From Paul.McKenney.com (prefers email at paulmck.com) 2007-06-05 12:24 EDT ------- (In reply to comment #18) > (In reply to comment #13) > > Since nothing calls tasklet_kill on the rcu_tasklet, I'm not sure if > > tasklet_kill's manipulation of t->state is the issue, as some have theorized. > > > > Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the > > problem. It seems to be the only spot where we don't take care to use a > > test_and_set/clear type accessor and overwrite other bits. > > > > It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is > > called from the sched_tick_timer is likely suspect. > > Calling tasklet_schedule() from rcu_check_callbacks() might be an issue -- but > tasklet_disable() is not called from RCU, so the TASKLET_STATEF_PENDING should > not ever happen for RCU. Will look some more... And, since RCU's tasklets are statically allocated, this failure cannot possibly be due to premature recycling of data structures involving tasklets. But the only thing that is done to the RCU tasklets is tasklet_schedule() and tasklet_init()... So, this was using CONFIG_PREEMPT_RCU, right? ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 12:52 EDT ------- Yes. CONFIG_PREEMPT_RCU is being used (required w/ PREEMPT_RT) ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 12:53 EDT ------- Also I've reproduced the issue a few times and every time it is the rcu tasklet. ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 20:27 EDT -------
So I think I've found the issue:
So we have two main call paths:
tasklet_schedule()
if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
tasklet_trylock()
[add tasklet to per cpu list]
}
then later..
__tasklet_action()
tasklet_trylock()
test_and_clear_bit(TASKLET_STATE_SCHED...
again:
t->func()
while (!tasklet_tryunlock(t)) {
if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
goto again;
}
So imagine if you will, a call to tasklet_schedule runs as normal, we add the
tasklet to the list. Then __tasklet_action runs, clears the sched bit and runs
func().
At that moment, on another cpu, we call tasklet_schedule on the same tasklet. We
set the bit and, before we call tasklet_trylock, are then preempted.
Back in __tasklet_action, we try to unlock, realize the sched bit is set, clear
it and jump back to again, where we re-run the function, tryunlock succeeds and
we return.
Then the call to tasklet_schedule is un-preempted, and we add the tasklet (now
however the sched bit is clear) to the list.
Then the next time __tasklet_action runs, the tasklet is on the list, but does
not have the sched bit set and flags a warning.
----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 20:29 EDT ------- I'm now testing a patch that tries to verify my theory. Created attachment 156301 [details]
tasklet-fix.patch
----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 21:47 EDT ------- potential fix for tasklet race I haven't yet been able to verify that this is the real issue, but I wanted to push this patch out to the bug so others could review (its getting close to quitting time). I'll continue testing and will provide additional feedback when I'm more confident this will resolve the issue. ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 21:58 EDT ------- Ok. I've verified that the issue is where I think it is. Now I just have to verify the fix resolves it. changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |FIXEDAWAITINGTEST
Resolution| |FIX_BY_IBM
------- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-05 22:19 EDT -------
I'm pretty sure this is the fix, so I'm marking it fixedawatingtest and will
follow up with overnight test results tomorrow.
------- Additional Comments From Paul.McKenney.com (prefers email at paulmck.com) 2007-06-05 23:39 EDT ------- (In reply to comment #23) > So I think I've found the issue: > > So we have two main call paths: > > tasklet_schedule() > if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) { > tasklet_trylock() > [add tasklet to per cpu list] > } > > then later.. > __tasklet_action() > tasklet_trylock() > test_and_clear_bit(TASKLET_STATE_SCHED... > again: > t->func() > while (!tasklet_tryunlock(t)) { > if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)) > goto again; > } > > > > > So imagine if you will, a call to tasklet_schedule runs as normal, we add the > tasklet to the list. Then __tasklet_action runs, clears the sched bit and runs > func(). > > At that moment, on another cpu, we call tasklet_schedule on the same tasklet. We > set the bit and, before we call tasklet_trylock, are then preempted. > > Back in __tasklet_action, we try to unlock, realize the sched bit is set, clear > it and jump back to again, where we re-run the function, tryunlock succeeds and > we return. > > Then the call to tasklet_schedule is un-preempted, and we add the tasklet (now > however the sched bit is clear) to the list. > > Then the next time __tasklet_action runs, the tasklet is on the list, but does > not have the sched bit set and flags a warning. Good catch!!! And good show finding it!!! There is at least one suspicious case of doing tasklet_kill() while a timer that does a tasklet_schedule() is still active in drivers/isdn/gigaset/common.c line 424 of kernel.org 2.6.21-rt8, BTW. ----- Additional Comments From ankigarg.com (prefers email at ankita.com) 2007-06-06 01:51 EDT ------- John, I ran kernbench on a 8G machine with the patch applied. There was no occurence of this BUG message (without the patch, there are a number of such messages seen). *** This bug has been marked as a duplicate of 239819 *** (In reply to comment #19) > Created an attachment (id=156301) [edit] > tasklet-fix.patch Couldn't we just write: if (tasklet_trylock(t) && test_bit(TASKLET_STATE_SCHED, &t->state)) { /* whatever was there*/ } which will boil down to one-liner? ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-07 14:10 EDT ------- No, that one-liner will not work, because we still have to release the lock should we acquire it. (In reply to comment #27) > ----- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-07 14:10 EDT ------- > No, that one-liner will not work, because we still have to release the lock > should we acquire it. Ah, sorry -- overlooked that. Have you submitted it upstream yet? :-) Well, we could do:
if (tasklet_trylock(t) && (test_bit(TASKLET_STATE_SCHED, &t->state) ||
tasklet_unlock(t))) {
/* yada yada yada */
tasklet_unlock(t);
}
And we still make it into a one-liner.
;-)
(In reply to comment #29) > Well, we could do: > > if (tasklet_trylock(t) && (test_bit(TASKLET_STATE_SCHED, &t->state) || > tasklet_unlock(t))) { > /* yada yada yada */ > > tasklet_unlock(t); > } > And we still make it into a one-liner. That'd be an overkill. :-) changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|TESTED |SUBMITTED
------- Additional Comments From jstultz.com (prefers email at johnstul.com) 2007-06-20 15:11 EDT -------
Verified fixed in 2.6.21-31.el5rt
|