Bug 662384
Summary: | Pid: 797, comm: ivtv0 Not tainted 2.6.37-0.rc5.git2.1.fc15.x86_64 #1 | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Nicolas Mailhot <nicolas.mailhot> | ||||||||||
Component: | kernel | Assignee: | Jarod Wilson <jarodwilson> | ||||||||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||
Severity: | medium | Docs Contact: | |||||||||||
Priority: | low | ||||||||||||
Version: | rawhide | CC: | awalls, dougsland, gansalmon, itamar, jarod, jonathan, kernel-maint, kmcmartin, madhu.chinakonda, mchehab | ||||||||||
Target Milestone: | --- | ||||||||||||
Target Release: | --- | ||||||||||||
Hardware: | Unspecified | ||||||||||||
OS: | Unspecified | ||||||||||||
Whiteboard: | |||||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | Environment: | ||||||||||||
Last Closed: | 2012-02-21 20:08:07 UTC | Type: | --- | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Attachments: |
|
Description
Nicolas Mailhot
2010-12-12 09:54:57 UTC
Created attachment 468200 [details]
dmesg
Created attachment 468201 [details]
lspci
Hm... This is probably some fallout from some of the locking that has been redone upstream as a result of the BKL getting killed... Mauro, does this look familiar to you, by chance? (And even better, do we have a patch for it upstream already?). (In reply to comment #3) > Hm... This is probably some fallout from some of the locking that has been > redone upstream as a result of the BKL getting killed... Mauro, does this look > familiar to you, by chance? (And even better, do we have a patch for it > upstream already?). This bug is new to me. It may be related to BKL removal stuff. Those patches are somewhat aggressive. Let's see if awalls have some ideas about it. (In reply to comment #4) > (In reply to comment #3) > > Hm... This is probably some fallout from some of the locking that has been > > redone upstream as a result of the BKL getting killed... Mauro, does this look > > familiar to you, by chance? (And even better, do we have a patch for it > > upstream already?). > > This bug is new to me. It may be related to BKL removal stuff. Those patches > are somewhat aggressive. Let's see if awalls have some ideas about it. Btw, as Linus merged the BKL fixes upstream today, the better is to rebase rawhide kernel with today's patches and see if this fixes the issue. [ 21.769255] ivtv: Start initialization, version 1.4.2 [ 21.769347] ivtv0: Initializing card 0 [ 21.769349] ivtv0: Autodetected Hauppauge card (cx23416 based) [ 21.771615] INFO: trying to register non-static key. [ 21.771640] the code is fine but needs lockdep annotation. [ 21.771661] turning off the locking correctness validator. [ 21.771684] Pid: 776, comm: ivtv0 Not tainted 2.6.37-0.rc6.git0.1.fc15.x86_64 #1 [ 21.771712] Call Trace: [ 21.771728] [<ffffffff8107fce7>] register_lock_class+0x102/0x2e2 [ 21.771750] [<ffffffff8108227c>] ? __lock_acquire+0x56a/0xd65 [ 21.771775] [<ffffffff81011269>] ? paravirt_read_tsc+0x9/0xd [ 21.771796] [<ffffffff81081dfa>] __lock_acquire+0xe8/0xd65 [ 21.771816] [<ffffffff8107fb8a>] ? lock_release_holdtime+0x54/0x5b [ 21.771840] [<ffffffff814b4d70>] ? _raw_spin_unlock_irq+0x30/0x3a [ 21.771864] [<ffffffff81083365>] ? trace_hardirqs_on_caller+0x10b/0x12f [ 21.771890] [<ffffffff8106ed1f>] ? kthread_worker_fn+0xa9/0x14c [ 21.771915] [<ffffffff81082f46>] lock_acquire+0xc8/0xf3 [ 21.771939] [<ffffffff8106ed1f>] ? kthread_worker_fn+0xa9/0x14c [ 21.771962] [<ffffffff814b4561>] _raw_spin_lock_irq+0x40/0x75 [ 21.771984] [<ffffffff8106ed1f>] ? kthread_worker_fn+0xa9/0x14c [ 21.772005] [<ffffffff8106ed1f>] kthread_worker_fn+0xa9/0x14c [ 21.772005] [<ffffffff8106ec76>] ? kthread_worker_fn+0x0/0x14c [ 21.772005] [<ffffffff8106ee62>] kthread+0xa0/0xa8 [ 21.772005] [<ffffffff81083365>] ? trace_hardirqs_on_caller+0x10b/0x12f [ 21.772005] [<ffffffff8100bb24>] kernel_thread_helper+0x4/0x10 [ 21.772005] [<ffffffff814b51d0>] ? restore_args+0x0/0x30 [ 21.772005] [<ffffffff8106edc2>] ? kthread+0x0/0xa8 [ 21.772005] [<ffffffff8100bb20>] ? kernel_thread_helper+0x0/0x10 [ 21.772307] ivtv 0000:06:01.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19 [ 21.772339] ivtv0: Unreasonably low latency timer, setting to 64 (was 32) [ 21.827155] tveeprom 6-0050: The eeprom says no radio is present, but the tuner type [ 21.827185] tveeprom 6-0050: indicates otherwise. I will assume that radio is present. [ 21.827212] tveeprom 6-0050: Hauppauge model 26039, rev C1A5, serial# 8203263 [ 21.827236] tveeprom 6-0050: tuner model is TCL MPE05-2 (idx 105, type 38) [ 21.827260] tveeprom 6-0050: TV standards PAL(B/G) PAL(I) SECAM(L/L') PAL(D/D1/K) (eeprom 0x74) [ 21.827288] tveeprom 6-0050: audio processor is CX25842 (idx 36) [ 21.827309] tveeprom 6-0050: decoder processor is CX25842 (idx 29) [ 21.827330] tveeprom 6-0050: has radio, has IR receiver, has IR transmitter [ 21.827353] ivtv0: Autodetected Hauppauge WinTV PVR-150 [ 21.827371] ivtv0: Reopen i2c bus for IR-blaster support [ 22.160241] cx25840 6-0044: cx25842-23 found @ 0x88 (ivtv i2c driver #0) (In reply to comment #4) > (In reply to comment #3) > > Hm... This is probably some fallout from some of the locking that has been > > redone upstream as a result of the BKL getting killed... Mauro, does this look > > familiar to you, by chance? (And even better, do we have a patch for it > > upstream already?). > > This bug is new to me. It may be related to BKL removal stuff. Those patches > are somewhat aggressive. Let's see if awalls have some ideas about it. This is not BKL related. This is not a critical ivtv bug, the driver code is correct. This is a lockdep gripe on how a kernel API improperly initializes a spinlock. This lockdep gripe is related to Tejun Heo's improved implementation of kenrel workqueues for handling deferred work. Specifically ivtv had a single threaded workqueue with real-time priority for timing critical contraints on VBI data over I2C. Tejun had to convert it to a kthread worker with realtime priority. The new workqueue implementation uses pools of workers, and real-time priority obviously can't be designated on an ivtv work object. Hence the dedicated worker thread for each ivtv card instance. The problem appears to be init_kthread_worker() and KTHREAD_WORKER_INIT() using initializers for static spin_lock objects (SPIN_LOCK_UNLOCKED) on the dynamically allocated, per ivtv card instance, itv->irq_worker object's internal spin lock. I think it should be using spin_lock_init() for those objects' internal locks. I could hack up a patch if needed. However Tejun or someone familiar with kthread worker internals and/or lockdep would turn around an optimal and correct solution in shorter time. BTW when did Fedora start shiping with all the lock validation enabled in the config? Regards, Andy (In reply to comment #7) > BTW when did Fedora start shiping with all the lock validation enabled in the > config? This is a fedora-devel kernel (aka rawhide). They always have all sorts of snail/debugging options activated Created attachment 469544 [details]
Patch to get rid of lockdep's initial INFO gripe
Created attachment 469545 [details]
dmesg for subsequent potential problem lockdep detects
(In reply to comment #7) > (In reply to comment #4) > > (In reply to comment #3) > > > Hm... This is probably some fallout from some of the locking that has been > > > redone upstream as a result of the BKL getting killed... Mauro, does this look > > > familiar to you, by chance? (And even better, do we have a patch for it > > > upstream already?). > > > > This bug is new to me. It may be related to BKL removal stuff. Those patches > > are somewhat aggressive. Let's see if awalls have some ideas about it. > > This is a lockdep gripe on how a kernel API improperly initializes a spinlock. > This lockdep gripe is related to Tejun Heo's improved implementation of kenrel > workqueues for handling deferred work. > The problem appears to be init_kthread_worker() and KTHREAD_WORKER_INIT() using > initializers for static spin_lock objects (SPIN_LOCK_UNLOCKED) on the > dynamically allocated, per ivtv card instance, itv->irq_worker object's > internal spin lock. I think it should be using spin_lock_init() for > those objects' internal locks. > > I could hack up a patch if needed. However Tejun or someone familiar with > kthread worker internals and/or lockdep would turn around an optimal and > correct solution in shorter time. The patch to get rid of the lockdep INFO gripe was easier than I thought, since ivtv is the only current user of the kernel API call in question. With the attached patch, lockdep now is happy about how the dynamically allocated kthread_worker's spinlock_t is initialized. :) However, lockdep is now able to gripe about a potential locking problem with the new V4L2 control framework and ivtv's kthread worker. :( Dmesg output is attached, since I won't be able to continue investigation for at least a day or two. Regards, Andy (In reply to comment #11) > The patch to get rid of the lockdep INFO gripe was easier than I thought, since > ivtv is the only current user of the kernel API call in question. > > With the attached patch, lockdep now is happy about how the dynamically > allocated kthread_worker's spinlock_t is initialized. :) I have sent the patch I attached to the LKML for review, hopefully by at least Tejun, and inclusion upstream. https://lkml.org/lkml/2010/12/19/101 > However, lockdep is now able to gripe about a potential locking problem with > the new V4L2 control framework and ivtv's kthread worker. :( > > Dmesg output is attached, since I won't be able to continue investigation for > at least a day or two. Hans Verkuil provided confirmation that the recursive lock is a problem in the new V4L2 control handling framework. I'm not sure if that problem is in the rawhide kernel in question, yet. I recommend declaring the attached patch as a fix for this bug report and writing a new bug report *if* the new V4L2 control framework (and recursive lock) is in the rawhide kernel in question. Regards, Andy (In reply to comment #12) > (In reply to comment #11) > > With the attached patch, lockdep now is happy about how the dynamically > > allocated kthread_worker's spinlock_t is initialized. :) > > I have sent the patch I attached to the LKML for review, hopefully > by at least Tejun, and inclusion upstream. > > https://lkml.org/lkml/2010/12/19/101 Here is the patch for this bug, that Tejun is pushing to Linus: http://git.kernel.org/?p=linux/kernel/git/tj/wq.git;a=commitdiff;h=4f32e9b1f812fd6c00cc85a127583fefbdedaedc > > However, lockdep is now able to gripe about a potential locking problem with > > the new V4L2 control framework > Hans Verkuil provided confirmation that the recursive lock is a problem in > the new V4L2 control handling framework. I'm not sure if that problem is > in the rawhide kernel in question, yet. The potential recursive mutex lock that lockdep finds in the V4L2 control framework is not a real issue, as no deadlocks can actually occur, if V4L2 drivers are written properly. However, lockdep_set_class_and_name() or mutex_lock_nested() calls need to be added in the V4L2 control framework to inform lockdep that, that particular recursive lock on the struct v4l2_ctrl_handler.lock (aka &hdl>lock) mutex lock class is OK. That is a separate bug for which no final fix is yet available. Regards, Andy The ivtv bug has been fixed since 2.6.37. Closing this out... |