Bug 129756 - SLEEP_ON_BKLCHECK trips for the first ten sleep on timeout calls
SLEEP_ON_BKLCHECK trips for the first ten sleep on timeout calls
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
2
i686 Linux
medium Severity medium
: ---
: ---
Assigned To: Dave Jones
Brian Brock
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2004-08-12 11:29 EDT by Mike Marciniszyn
Modified: 2015-01-04 17:08 EST (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2005-01-17 03:32:29 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
patch for the problem (580 bytes, patch)
2004-08-12 11:37 EDT, Mike Marciniszyn
no flags Details | Diff

  None (edit)
Description Mike Marciniszyn 2004-08-12 11:29:38 EDT
From Bugzilla Helper:User-Agent: Mozilla/5.0 (compatible; Konqueror/3; Linux)Description of problem:The following macro (kernel/sched.c):#define SLEEP_ON_BKLCHECK                               \        if (unlikely(!kernel_locked()) &&               \            sleep_on_bkl_warnings < 10) {               \                sleep_on_bkl_warnings++;                \                WARN_ON(1);                             \        }appears to have the kernel_locked() test backwards.  The first 10 calls to this macro execute the WARN_ON print and dump_stack.The macro should probably read:#define SLEEP_ON_BKLCHECK                               \        if (unlikely(kernel_locked()) &&               \            sleep_on_bkl_warnings < 10) {               \                sleep_on_bkl_warnings++;                \                WARN_ON(1);                             \        }(e.g without the not).Version-Release number of selected component (if applicable):linux-2.6.5-1.358How reproducible:AlwaysSteps to Reproduce:1. load a driver that started a thread2. Have the thread code daemon and loop doing sleep_on_timeout().3. Monitor the log    Actual Results:  The log entries below	Expected Results:  No log entriesAdditional info:Here is a print from a daemonized thread doing interruptible_sleep on_timeout.  Note the lock_depth of -1 (not locked).Aug 12 10:03:05 frodo kernel: lock_depth = -1 timeout = 5000Aug 12 10:03:05 frodo kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:1942Aug 12 10:03:05 frodo kernel: Call Trace:Aug 12 10:03:05 frodo kernel:  [<0227eec8>] interruptible_sleep_on_timeout+0x5d/0xb1Aug 12 10:03:05 frodo kernel:  [<02115e97>] default_wake_function+0x0/0xcAug 12 10:03:05 frodo kernel:  [<021189cd>] printk+0x106/0x113Aug 12 10:03:05 frodo kernel:  [<13cdd2dd>] SweeperThread+0x118/0x134 [ipoib]Aug 12 10:03:05 frodo kernel:  [<13cdd1c5>] SweeperThread+0x0/0x134 [ipoib]Aug 12 10:03:05 frodo kernel:  [<021041d9>] kernel_thread_helper+0x5/0xbAug 12 10:03:05 frodo kernel:
Comment 1 Arjan van de Ven 2004-08-12 11:34:16 EDT
hmm the formatting of your comment is all messed up to the point it's
hard to make sure what exactly you mean.
Comment 2 Mike Marciniszyn 2004-08-12 11:37:36 EDT
Created attachment 102651 [details]
patch for the problem

The patch removes the ! from the kernel_locked test.
Comment 3 Arjan van de Ven 2004-08-12 11:38:40 EDT
also from what I can make of it.. could you paste some of your code in
question?

The design goal of the code on our side is to verify the BKL is helt
since holding the BKL is mandatory over calls to *_sleep_on().
Comment 4 Arjan van de Ven 2004-08-12 11:39:33 EDT
your patch is incorrect; the BKL *must* be helt when calling this
function and that is what is being checked.

Also please please just fix your code to not use this deprecated and
misdesigned API.
Comment 5 Mike Marciniszyn 2004-08-12 11:48:33 EDT
So the code insists that a global spin lock is held prior to sleeping?

I can't believe that is correct...

I can re-work the driver, by I see no mention of the API being 
deprecated in 2.6.  Do you have any links to documentation?

Mike
Comment 6 Arjan van de Ven 2004-08-12 11:50:15 EDT
The BKL is not a (normal) spinlock.

*_sleep_on() were deprecated and wrong ever since 2.2 kernels.

from the header that defines them:
/*
 * These are the old interfaces to sleep waiting for an event.
 * They are racy.  DO NOT use them, use the wait_event* interfaces above.
 * We plan to remove these interfaces during 2.7.
 */
Comment 7 Mike Marciniszyn 2004-08-12 12:10:24 EDT
I will rework to use the wait_event* stuff as suggested in the header 
files.

Thanks.


Note You need to log in before you can comment on or make changes to this bug.