Bug 242422 - BUG: at kernel/softirq.c:639 __tasklet_action()
BUG: at kernel/softirq.c:639 __tasklet_action()
Status: CLOSED DUPLICATE of bug 239819
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: realtime-kernel (Show other bugs)
1.0
x86_64 Linux
low Severity medium
: ---
: ---
Assigned To: Steven Rostedt
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2007-06-04 05:35 EDT by IBM Bug Proxy
Modified: 2008-02-27 14:58 EST (History)
1 user (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-06-06 20:48:50 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
rostedt.patch (8.04 KB, text/plain)
2007-06-04 21:15 EDT, IBM Bug Proxy
no flags Details
tasklet-fix.patch (906 bytes, text/plain)
2007-06-05 21:50 EDT, IBM Bug Proxy
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
IBM Linux Technology Center 34969 None None None Never

  None (edit)
Description IBM Bug Proxy 2007-06-04 05:35:44 EDT
LTC Owner is: jstultz@us.ibm.com
LTC Originator is: jstultz@us.ibm.com


Running kernbench on an LS20 w/ the 2.6.21-14.el5rt (x86_64) kernel, I got the
following bug trace:

BUG: at kernel/softirq.c:639 __tasklet_action()

Call Trace:
 [<ffffffff8106d5da>] dump_trace+0xaa/0x32a
 [<ffffffff8106d89b>] show_trace+0x41/0x5c
 [<ffffffff8106d8cb>] dump_stack+0x15/0x17
 [<ffffffff81094a97>] __tasklet_action+0xdf/0x12e
 [<ffffffff81094f76>] tasklet_action+0x27/0x29
 [<ffffffff8109530a>] ksoftirqd+0x16c/0x271
 [<ffffffff81033d4d>] kthread+0xf5/0x128
 [<ffffffff8105ff68>] child_rip+0xa/0x12

Looks like PaulM already saw this: http://lkml.org/lkml/2007/2/25/1

This seems mostly harmless, but I tripped it again w/ -14ibm running kernbench.

I'm getting a good number of these w/ kernbench. Again, they don't lock the box,
and seem harmless, but should get some more eyes on it.

- John

-------------------------------------

I have seen a number of these when doing some testing. Will take a look at it. 

- Ankita
Comment 1 IBM Bug Proxy 2007-06-04 09:45:29 EDT
----- Additional Comments From ankigarg@in.ibm.com (prefers email at ankita@in.ibm.com)  2007-06-04 09:40 EDT -------
Seems to be some locking issue..digging further. 
Comment 2 IBM Bug Proxy 2007-06-04 11:16:30 EDT
----- Additional Comments From ankigarg@in.ibm.com (prefers email at ankita@in.ibm.com)  2007-06-04 11:09 EDT -------
Ok, I think I know the reason of the warning messages. Will generate a patch and
post for review. 
Comment 3 IBM Bug Proxy 2007-06-04 21:15:45 EDT
Created attachment 156165 [details]
rostedt.patch
Comment 4 IBM Bug Proxy 2007-06-04 21:16:01 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-04 21:12 EDT -------
 
test patch from Steven Rostedt 

Here's the test patch from Steven Rostedt that I tested against RH's 2.6.21-23
kernel. Runninng kernbench easily triggered the issue on a 4 way LS21, so it
didn't work, but I wanted to attach it so folks know what has been tried. 
Comment 5 IBM Bug Proxy 2007-06-04 21:40:21 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-04 21:37 EDT -------
Other interesting tidbit, I had a hard time reproducing this until I set HZ=1000
in my .config. 
Comment 6 IBM Bug Proxy 2007-06-04 23:26:20 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-04 23:20 EDT -------
So I added some debug code to print out the function name of the affected
tasklet and I got:

tasklet_bug: t=0xffffffff81ffd4d0  t->func=0xffffffff810d70b0                  
                                                  

BUG: at kernel/softirq.c:641 __tasklet_action()
Call Trace:
 [<ffffffff810a45d0>] __tasklet_action+0xb0/0x150
 [<ffffffff810a4fdd>] ksoftirqd+0x17d/0x2a0
 [<ffffffff810a4e60>] ksoftirqd+0x0/0x2a0
 [<ffffffff810364ba>] kthread+0x12a/0x170
 [<ffffffff8102a992>] schedule_tail+0x82/0x130
 [<ffffffff81065f68>] child_rip+0xa/0x12
 [<ffffffff81036390>] kthread+0x0/0x170
 [<ffffffff81065f5e>] child_rip+0x0/0x12


Looking up t->func=0xffffffff810d70b0 in System.map (and w/ gdb vmlinux list)
give me: rcu_process_callbacks

Paul, does that help narrow it down at all? 
Comment 7 IBM Bug Proxy 2007-06-05 00:15:46 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 00:12 EDT -------
Since nothing calls tasklet_kill on the rcu_tasklet, I'm not sure if
tasklet_kill's manipulation of t->state is the issue, as some have theorized.

Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the
problem. It seems to be the only spot where we don't take care to use a
test_and_set/clear type accessor and overwrite other bits.

It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is
called from the sched_tick_timer is likely suspect. 
Comment 8 IBM Bug Proxy 2007-06-05 01:10:23 EDT
------- Additional Comments From ankigarg@in.ibm.com (prefers email at ankita@in.ibm.com)  2007-06-05 01:05 EDT -------
(In reply to comment #13)

> Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the
> problem. It seems to be the only spot where we don't take care to use a
> test_and_set/clear type accessor and overwrite other bits.

The intention is to overwrite all the bits and have only the bit indicating that
the tasklet is pending. Right after setting that bit, the code continues with
the next tasklet on the list. On a subsequent tasklet_enable(), the
TASKLET_STATE_SCHED bit is set again. So, this might not be the issue. 
Comment 9 IBM Bug Proxy 2007-06-05 09:45:25 EDT
----- Additional Comments From ankigarg@in.ibm.com (prefers email at ankita@in.ibm.com)  2007-06-05 09:41 EDT -------
Ok, some thoughts.

The only two places where the TASKLET_STATE_SCHED bit is being cleared are in
tasklet_kill and __tasklet_action itself. The clearing of the
TASKLET_STATE_SCHED in __tasklet_action should be done while the
TASKLET_STATE_RUN is set so that the t->state is not messed up. 


while (!tasklet_tryunlock(t)) {
     /*
      * If it got disabled meanwhile, bail out:
      */
      if (atomic_read(&t->count))
         goto out_disabled;
      /*
       * If it got scheduled meanwhile, re-execute
       * the tasklet function:
       */
       if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
         goto again;
       if (!--loops) {
         printk("hm, tasklet state: %08lx
", t->state);
         WARN_ON(1);
         tasklet_unlock(t);
         break;
       }
}

I think the problem lies in the above loop. I plan to try basically the
following by suitably modifying the locking:

if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
    goto again;

while (!tasklet_tryunlock(t)) {
     /*
      * If it got disabled meanwhile, bail out:
      */
      if (atomic_read(&t->count))
         goto out_disabled;
      /*
       * If it got scheduled meanwhile, re-execute
       * the tasklet function:
       */

       if (!--loops) {
         printk("hm, tasklet state: %08lx
", t->state);
         WARN_ON(1);
         tasklet_unlock(t);
         break;
       }
}

I am not sure if that will work. 
Comment 10 IBM Bug Proxy 2007-06-05 09:45:53 EDT
----- Additional Comments From ankigarg@in.ibm.com (prefers email at ankita@in.ibm.com)  2007-06-05 09:43 EDT -------

> It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is
> called from the sched_tick_timer is likely suspect. 
> 
Then this would imply that the locking while setting/resetting the state is not
proper... 
Comment 11 IBM Bug Proxy 2007-06-05 10:50:37 EDT
----- Additional Comments From dvhltc@us.ibm.com  2007-06-05 10:48 EDT -------
rostedt asked for the loaded module list on irc today:

This is from an LS20 running 2.6.21-23.el5rt that readily repeats this trace.

dvhart@ltcrt3:~/linux-rt-tests/internal/stress/kernbench/kernbench-0.30$ /sbin/lsmod
Module                  Size  Used by
autofs4                59784  2 
hidp                   88832  2 
rfcomm                110888  0 
l2cap                  92672  10 hidp,rfcomm
bluetooth             129028  5 hidp,rfcomm,l2cap
sunrpc                234472  1 
nf_conntrack_netbios_ns    36608  0 
ipt_REJECT             37888  1 
nf_conntrack_ipv4      55056  3 
xt_state               35968  3 
iptable_filter         36480  1 
ip_tables              56672  1 iptable_filter
ip6t_REJECT            38656  1 
xt_tcpudp              36864  12 
ip6table_filter        36224  1 
ip6_tables             50496  1 ip6table_filter
x_tables               56328  6
ipt_REJECT,xt_state,ip_tables,ip6t_REJECT,xt_tcpudp,ip6_tables
ipv6                  493664  23 ip6t_REJECT
dm_mirror              57408  0 
dm_mod                104208  1 dm_mirror
video                  54288  0 
sbs                    51544  0 
i2c_ec                 39552  1 sbs
dock                   45640  0 
button                 42912  0 
battery                45320  0 
asus_acpi              53548  0 
ac                     39688  0 
parport_pc             65064  0 
lp                     49040  0 
parport                77708  2 parport_pc,lp
sg                     73256  0 
amd_rng                36360  0 
pcspkr                 36992  0 
i2c_amd756             41604  0 
i2c_core               60160  2 i2c_ec,i2c_amd756
shpchp                 70556  0 
k8temp                 39808  0 
hwmon                  37128  1 k8temp
tg3                   151044  0 
rtc_cmos               41896  0 
rtc_core               45832  1 rtc_cmos
rtc_lib                37248  1 rtc_core
serio_raw              41348  0 
mptspi                 54416  2 
mptscsih               57728  1 mptspi
mptbase                92896  2 mptspi,mptscsih
scsi_transport_spi     63360  1 mptspi
sd_mod                 56704  3 
scsi_mod              203576  5 sg,mptspi,mptscsih,scsi_transport_spi,sd_mod
ext3                  177808  1 
jbd                   101616  1 ext3
ehci_hcd               70412  0 
ohci_hcd               56708  0 
uhci_hcd               60960  0 
Comment 12 IBM Bug Proxy 2007-06-05 10:55:31 EDT
------- Additional Comments From Paul.McKenney@us.ibm.com (prefers email at paulmck@us.ibm.com)  2007-06-05 10:53 EDT -------
(In reply to comment #13)
> Since nothing calls tasklet_kill on the rcu_tasklet, I'm not sure if
> tasklet_kill's manipulation of t->state is the issue, as some have theorized.
> 
> Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the
> problem. It seems to be the only spot where we don't take care to use a
> test_and_set/clear type accessor and overwrite other bits.
> 
> It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is
> called from the sched_tick_timer is likely suspect. 

Calling tasklet_schedule() from rcu_check_callbacks() might be an issue -- but
tasklet_disable() is not called from RCU, so the TASKLET_STATEF_PENDING should
not ever happen for RCU.  Will look some more... 
Comment 13 IBM Bug Proxy 2007-06-05 12:25:18 EDT
------- Additional Comments From Paul.McKenney@us.ibm.com (prefers email at paulmck@us.ibm.com)  2007-06-05 12:20 EDT -------
(In reply to comment #15)
> Ok, some thoughts.
> 
> The only two places where the TASKLET_STATE_SCHED bit is being cleared are in
> tasklet_kill and __tasklet_action itself. The clearing of the
> TASKLET_STATE_SCHED in __tasklet_action should be done while the
> TASKLET_STATE_RUN is set so that the t->state is not messed up. 
> 
> 
> while (!tasklet_tryunlock(t)) {
>      /*
>       * If it got disabled meanwhile, bail out:
>       */
>       if (atomic_read(&t->count))
>          goto out_disabled;
>       /*
>        * If it got scheduled meanwhile, re-execute
>        * the tasklet function:
>        */
>        if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
>          goto again;
>        if (!--loops) {
>          printk("hm, tasklet state: %08lx
", t->state);
>          WARN_ON(1);
>          tasklet_unlock(t);
>          break;
>        }
> }
> 
> I think the problem lies in the above loop. I plan to try basically the
> following by suitably modifying the locking:
> 
> if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
>     goto again;
> 
> while (!tasklet_tryunlock(t)) {
>      /*
>       * If it got disabled meanwhile, bail out:
>       */
>       if (atomic_read(&t->count))
>          goto out_disabled;
>       /*
>        * If it got scheduled meanwhile, re-execute
>        * the tasklet function:
>        */
> 
>        if (!--loops) {
>          printk("hm, tasklet state: %08lx
", t->state);
>          WARN_ON(1);
>          tasklet_unlock(t);
>          break;
>        }
> }
> 
> I am not sure if that will work.

This approach has a race -- suppose that the tasklet is rescheduled just after
the test_and_clear_bit().  Because rescheduling the tasklet sets the
TASKLET_STATE_SCHED bit, the tasklet_tryunlock() can never succeed.

One approach would be to replace the straight stores with cmpxchg, allowing
better error checking.  Another approach would be to have tasklet_kill() set a
bit in the ->state word, allowing the other operations to fail when performed on
a zombie tasklet. 
Comment 14 IBM Bug Proxy 2007-06-05 12:30:45 EDT
------- Additional Comments From Paul.McKenney@us.ibm.com (prefers email at paulmck@us.ibm.com)  2007-06-05 12:24 EDT -------
(In reply to comment #18)
> (In reply to comment #13)
> > Since nothing calls tasklet_kill on the rcu_tasklet, I'm not sure if
> > tasklet_kill's manipulation of t->state is the issue, as some have theorized.
> > 
> > Instead I wonder if the "t->state = TASKLET_STATEF_PENDING;" line is the
> > problem. It seems to be the only spot where we don't take care to use a
> > test_and_set/clear type accessor and overwrite other bits.
> > 
> > It seems the interaction of tasklet_schedule() in rcu_check_callbacks() which is
> > called from the sched_tick_timer is likely suspect. 
> 
> Calling tasklet_schedule() from rcu_check_callbacks() might be an issue -- but
> tasklet_disable() is not called from RCU, so the TASKLET_STATEF_PENDING should
> not ever happen for RCU.  Will look some more...

And, since RCU's tasklets are statically allocated, this failure cannot possibly
be due to premature recycling of data structures involving tasklets.  But the
only thing that is done to the RCU tasklets is tasklet_schedule() and
tasklet_init()...

So, this was using CONFIG_PREEMPT_RCU, right? 
Comment 15 IBM Bug Proxy 2007-06-05 12:55:40 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 12:52 EDT -------
Yes. CONFIG_PREEMPT_RCU is being used (required w/ PREEMPT_RT) 
Comment 16 IBM Bug Proxy 2007-06-05 12:56:13 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 12:53 EDT -------
Also I've reproduced the issue a few times and every time it is the rcu tasklet. 
Comment 17 IBM Bug Proxy 2007-06-05 20:30:48 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 20:27 EDT -------
So I think I've found the issue:

So we have two main call paths:

tasklet_schedule()
   if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
         tasklet_trylock()
         [add tasklet to per cpu list]
   }

then later..
__tasklet_action()
   tasklet_trylock()
   test_and_clear_bit(TASKLET_STATE_SCHED...
again:
   t->func()
   while (!tasklet_tryunlock(t)) {
      if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
         goto again;
   }




So imagine if you will, a call to tasklet_schedule runs as normal, we add the
tasklet to the list. Then __tasklet_action runs, clears the sched bit and runs
func().

At that moment, on another cpu, we call tasklet_schedule on the same tasklet. We
set the bit and, before we call tasklet_trylock, are then preempted.

Back in __tasklet_action, we try to unlock, realize the sched bit is set, clear
it and jump back to again, where we re-run the function, tryunlock succeeds and
we return.

Then the call to tasklet_schedule is un-preempted, and we add the tasklet (now
however the sched bit is clear) to the list. 

Then the next time __tasklet_action runs, the tasklet is on the list, but does
not have the sched bit set and flags a warning. 
Comment 18 IBM Bug Proxy 2007-06-05 20:31:21 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 20:29 EDT -------
I'm now testing a patch that tries to verify my theory. 
Comment 19 IBM Bug Proxy 2007-06-05 21:50:31 EDT
Created attachment 156301 [details]
tasklet-fix.patch
Comment 20 IBM Bug Proxy 2007-06-05 21:50:37 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 21:47 EDT -------
 
potential fix for tasklet race

I haven't yet been able to verify that this is the real issue, but I wanted to
push this patch out to the bug so others could review (its getting close to
quitting time). I'll continue testing and will provide additional feedback when
I'm more confident this will resolve the issue. 
Comment 21 IBM Bug Proxy 2007-06-05 22:00:23 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 21:58 EDT -------
Ok. I've verified that the issue is where I think it is. Now I just have to
verify the fix resolves it. 
Comment 22 IBM Bug Proxy 2007-06-05 22:20:40 EDT
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |FIXEDAWAITINGTEST
         Resolution|                            |FIX_BY_IBM




------- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-05 22:19 EDT -------
I'm pretty sure this is the fix, so I'm marking it fixedawatingtest and will
follow up with overnight test results tomorrow. 
Comment 23 IBM Bug Proxy 2007-06-05 23:45:46 EDT
------- Additional Comments From Paul.McKenney@us.ibm.com (prefers email at paulmck@us.ibm.com)  2007-06-05 23:39 EDT -------
(In reply to comment #23)
> So I think I've found the issue:
> 
> So we have two main call paths:
> 
> tasklet_schedule()
>    if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) {
>          tasklet_trylock()
>          [add tasklet to per cpu list]
>    }
> 
> then later..
> __tasklet_action()
>    tasklet_trylock()
>    test_and_clear_bit(TASKLET_STATE_SCHED...
> again:
>    t->func()
>    while (!tasklet_tryunlock(t)) {
>       if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
>          goto again;
>    }
> 
> 
> 
> 
> So imagine if you will, a call to tasklet_schedule runs as normal, we add the
> tasklet to the list. Then __tasklet_action runs, clears the sched bit and runs
> func().
> 
> At that moment, on another cpu, we call tasklet_schedule on the same tasklet. We
> set the bit and, before we call tasklet_trylock, are then preempted.
> 
> Back in __tasklet_action, we try to unlock, realize the sched bit is set, clear
> it and jump back to again, where we re-run the function, tryunlock succeeds and
> we return.
> 
> Then the call to tasklet_schedule is un-preempted, and we add the tasklet (now
> however the sched bit is clear) to the list. 
> 
> Then the next time __tasklet_action runs, the tasklet is on the list, but does
> not have the sched bit set and flags a warning.

Good catch!!!  And good show finding it!!!

There is at least one suspicious case of doing tasklet_kill() while a timer that
does a tasklet_schedule() is still active in drivers/isdn/gigaset/common.c line
424 of kernel.org 2.6.21-rt8, BTW. 
Comment 24 IBM Bug Proxy 2007-06-06 01:56:10 EDT
----- Additional Comments From ankigarg@in.ibm.com (prefers email at ankita@in.ibm.com)  2007-06-06 01:51 EDT -------
John, I ran kernbench on a 8G machine with the patch applied. There was no
occurence of this BUG message (without the patch, there are a number of such
messages seen). 
Comment 25 Steven Rostedt 2007-06-06 20:48:50 EDT

*** This bug has been marked as a duplicate of 239819 ***
Comment 26 Sergei Shtylyov 2007-06-07 13:38:34 EDT
(In reply to comment #19)
> Created an attachment (id=156301) [edit]
> tasklet-fix.patch
 
Couldn't we just write:

if (tasklet_trylock(t) && test_bit(TASKLET_STATE_SCHED, &t->state)) {
   /* whatever was there*/
}

which will boil down to one-liner?
Comment 27 IBM Bug Proxy 2007-06-07 14:15:32 EDT
----- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-07 14:10 EDT -------
No, that one-liner will not work, because we still have to release the lock
should we acquire it. 
Comment 28 Sergei Shtylyov 2007-06-07 14:18:49 EDT
(In reply to comment #27)
> ----- Additional Comments From jstultz@us.ibm.com (prefers email at
johnstul@us.ibm.com)  2007-06-07 14:10 EDT -------
> No, that one-liner will not work, because we still have to release the lock
> should we acquire it. 

Ah, sorry -- overlooked that. Have you submitted it upstream yet? :-)
Comment 29 Steven Rostedt 2007-06-07 14:38:11 EDT
Well, we could do:

if (tasklet_trylock(t) && (test_bit(TASKLET_STATE_SCHED, &t->state) ||
tasklet_unlock(t))) {
  /* yada yada yada */

  tasklet_unlock(t);
}

And we still make it into a one-liner.

;-)
Comment 30 Sergei Shtylyov 2007-06-07 14:45:25 EDT
(In reply to comment #29)
> Well, we could do:
> 
> if (tasklet_trylock(t) && (test_bit(TASKLET_STATE_SCHED, &t->state) ||
> tasklet_unlock(t))) {
>   /* yada yada yada */
> 
>   tasklet_unlock(t);
> }
 
> And we still make it into a one-liner.

That'd be an overkill. :-)
Comment 31 IBM Bug Proxy 2007-06-20 15:18:44 EDT
changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|TESTED                      |SUBMITTED




------- Additional Comments From jstultz@us.ibm.com (prefers email at johnstul@us.ibm.com)  2007-06-20 15:11 EDT -------
Verified fixed in 2.6.21-31.el5rt 

Note You need to log in before you can comment on or make changes to this bug.