Bug 129756 - SLEEP_ON_BKLCHECK trips for the first ten sleep on timeout calls
Summary: SLEEP_ON_BKLCHECK trips for the first ten sleep on timeout calls
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 2
Hardware: i686
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2004-08-12 15:29 UTC by Mike Marciniszyn
Modified: 2015-01-04 22:08 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-01-17 08:32:29 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
patch for the problem (580 bytes, patch)
2004-08-12 15:37 UTC, Mike Marciniszyn
no flags Details | Diff

Description Mike Marciniszyn 2004-08-12 15:29:38 UTC
From Bugzilla Helper:User-Agent: Mozilla/5.0 (compatible; Konqueror/3; Linux)Description of problem:The following macro (kernel/sched.c):#define SLEEP_ON_BKLCHECK                               \        if (unlikely(!kernel_locked()) &&               \            sleep_on_bkl_warnings < 10) {               \                sleep_on_bkl_warnings++;                \                WARN_ON(1);                             \        }appears to have the kernel_locked() test backwards.  The first 10 calls to this macro execute the WARN_ON print and dump_stack.The macro should probably read:#define SLEEP_ON_BKLCHECK                               \        if (unlikely(kernel_locked()) &&               \            sleep_on_bkl_warnings < 10) {               \                sleep_on_bkl_warnings++;                \                WARN_ON(1);                             \        }(e.g without the not).Version-Release number of selected component (if applicable):linux-2.6.5-1.358How reproducible:AlwaysSteps to Reproduce:1. load a driver that started a thread2. Have the thread code daemon and loop doing sleep_on_timeout().3. Monitor the log    Actual Results:  The log entries below	Expected Results:  No log entriesAdditional info:Here is a print from a daemonized thread doing interruptible_sleep on_timeout.  Note the lock_depth of -1 (not locked).Aug 12 10:03:05 frodo kernel: lock_depth = -1 timeout = 5000Aug 12 10:03:05 frodo kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:1942Aug 12 10:03:05 frodo kernel: Call Trace:Aug 12 10:03:05 frodo kernel:  [<0227eec8>] interruptible_sleep_on_timeout+0x5d/0xb1Aug 12 10:03:05 frodo kernel:  [<02115e97>] default_wake_function+0x0/0xcAug 12 10:03:05 frodo kernel:  [<021189cd>] printk+0x106/0x113Aug 12 10:03:05 frodo kernel:  [<13cdd2dd>] SweeperThread+0x118/0x134 [ipoib]Aug 12 10:03:05 frodo kernel:  [<13cdd1c5>] SweeperThread+0x0/0x134 [ipoib]Aug 12 10:03:05 frodo kernel:  [<021041d9>] kernel_thread_helper+0x5/0xbAug 12 10:03:05 frodo kernel:

Comment 1 Arjan van de Ven 2004-08-12 15:34:16 UTC
hmm the formatting of your comment is all messed up to the point it's
hard to make sure what exactly you mean.

Comment 2 Mike Marciniszyn 2004-08-12 15:37:36 UTC
Created attachment 102651 [details]
patch for the problem

The patch removes the ! from the kernel_locked test.

Comment 3 Arjan van de Ven 2004-08-12 15:38:40 UTC
also from what I can make of it.. could you paste some of your code in
question?

The design goal of the code on our side is to verify the BKL is helt
since holding the BKL is mandatory over calls to *_sleep_on().

Comment 4 Arjan van de Ven 2004-08-12 15:39:33 UTC
your patch is incorrect; the BKL *must* be helt when calling this
function and that is what is being checked.

Also please please just fix your code to not use this deprecated and
misdesigned API.

Comment 5 Mike Marciniszyn 2004-08-12 15:48:33 UTC
So the code insists that a global spin lock is held prior to sleeping?

I can't believe that is correct...

I can re-work the driver, by I see no mention of the API being 
deprecated in 2.6.  Do you have any links to documentation?

Mike

Comment 6 Arjan van de Ven 2004-08-12 15:50:15 UTC
The BKL is not a (normal) spinlock.

*_sleep_on() were deprecated and wrong ever since 2.2 kernels.

from the header that defines them:
/*
 * These are the old interfaces to sleep waiting for an event.
 * They are racy.  DO NOT use them, use the wait_event* interfaces above.
 * We plan to remove these interfaces during 2.7.
 */


Comment 7 Mike Marciniszyn 2004-08-12 16:10:24 UTC
I will rework to use the wait_event* stuff as suggested in the header 
files.

Thanks.




Note You need to log in before you can comment on or make changes to this bug.