From Bugzilla Helper:User-Agent: Mozilla/5.0 (compatible; Konqueror/3; Linux)Description of problem:The following macro (kernel/sched.c):#define SLEEP_ON_BKLCHECK \ if (unlikely(!kernel_locked()) && \ sleep_on_bkl_warnings < 10) { \ sleep_on_bkl_warnings++; \ WARN_ON(1); \ }appears to have the kernel_locked() test backwards. The first 10 calls to this macro execute the WARN_ON print and dump_stack.The macro should probably read:#define SLEEP_ON_BKLCHECK \ if (unlikely(kernel_locked()) && \ sleep_on_bkl_warnings < 10) { \ sleep_on_bkl_warnings++; \ WARN_ON(1); \ }(e.g without the not).Version-Release number of selected component (if applicable):linux-2.6.5-1.358How reproducible:AlwaysSteps to Reproduce:1. load a driver that started a thread2. Have the thread code daemon and loop doing sleep_on_timeout().3. Monitor the log Actual Results: The log entries below Expected Results: No log entriesAdditional info:Here is a print from a daemonized thread doing interruptible_sleep on_timeout. Note the lock_depth of -1 (not locked).Aug 12 10:03:05 frodo kernel: lock_depth = -1 timeout = 5000Aug 12 10:03:05 frodo kernel: Badness in interruptible_sleep_on_timeout at kernel/sched.c:1942Aug 12 10:03:05 frodo kernel: Call Trace:Aug 12 10:03:05 frodo kernel: [<0227eec8>] interruptible_sleep_on_timeout+0x5d/0xb1Aug 12 10:03:05 frodo kernel: [<02115e97>] default_wake_function+0x0/0xcAug 12 10:03:05 frodo kernel: [<021189cd>] printk+0x106/0x113Aug 12 10:03:05 frodo kernel: [<13cdd2dd>] SweeperThread+0x118/0x134 [ipoib]Aug 12 10:03:05 frodo kernel: [<13cdd1c5>] SweeperThread+0x0/0x134 [ipoib]Aug 12 10:03:05 frodo kernel: [<021041d9>] kernel_thread_helper+0x5/0xbAug 12 10:03:05 frodo kernel:
hmm the formatting of your comment is all messed up to the point it's hard to make sure what exactly you mean.
Created attachment 102651 [details] patch for the problem The patch removes the ! from the kernel_locked test.
also from what I can make of it.. could you paste some of your code in question? The design goal of the code on our side is to verify the BKL is helt since holding the BKL is mandatory over calls to *_sleep_on().
your patch is incorrect; the BKL *must* be helt when calling this function and that is what is being checked. Also please please just fix your code to not use this deprecated and misdesigned API.
So the code insists that a global spin lock is held prior to sleeping? I can't believe that is correct... I can re-work the driver, by I see no mention of the API being deprecated in 2.6. Do you have any links to documentation? Mike
The BKL is not a (normal) spinlock. *_sleep_on() were deprecated and wrong ever since 2.2 kernels. from the header that defines them: /* * These are the old interfaces to sleep waiting for an event. * They are racy. DO NOT use them, use the wait_event* interfaces above. * We plan to remove these interfaces during 2.7. */
I will rework to use the wait_event* stuff as suggested in the header files. Thanks.