Bug 736752 - INFO: suspicious rcu_dereference_check() usage in IPoIB code
Summary: INFO: suspicious rcu_dereference_check() usage in IPoIB code
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-09-08 15:46 UTC by Albert Strasheim
Modified: 2012-03-27 12:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-03-27 12:47:17 UTC


Description Albert Strasheim 2011-09-08 15:46:50 UTC
Description of problem:

[  794.990455] ===================================================
[  794.997970] [ INFO: suspicious rcu_dereference_check() usage. ]
[  795.003965] ---------------------------------------------------
[  795.009965] include/net/dst.h:91 invoked rcu_dereference_check() without protection!
[  795.017774]
[  795.017775] other info that might help us debug this:
[  795.017775]
[  795.025972]
[  795.025972] rcu_scheduler_active = 1, debug_locks = 0
[  795.032634] 4 locks held by kworker/u:0/5:
[  795.036794]  #0:  ((name)){.+.+.+}, at: [<ffffffff81075a61>] process_one_work+0x14d/0x3e7
[  795.045556]  #1:  ((&port_priv->work)){+.+.+.}, at: [<ffffffff81075a61>] process_one_work+0x14d/0x3e7
[  795.055559]  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff814185eb>] dev_queue_xmit+0x0/0x618
[  795.065110]  #3:  (_xmit_INFINIBAND){+.-...}, at: [<ffffffff814307b6>] sch_direct_xmit+0x4e/0x14e
[  795.074831]
[  795.074831] stack backtrace:
[  795.079736] Pid: 5, comm: kworker/u:0 Tainted: G        W   3.1.0-0.rc3.git0.0.fc16.x86_64 #1
[  795.088759] Call Trace:
[  795.091493]  [<ffffffff8108ca23>] lockdep_rcu_dereference+0xa7/0xaf
[  795.098019]  [<ffffffffa0083a0d>] dst_get_neighbour+0x52/0x5a [ib_ipoib]
[  795.105005]  [<ffffffffa0084568>] ipoib_start_xmit+0x3a/0x3b8 [ib_ipoib]
[  795.112000]  [<ffffffff814184b6>] dev_hard_start_xmit+0x44f/0x584
[  795.118355]  [<ffffffff814307da>] sch_direct_xmit+0x72/0x14e
[  795.124294]  [<ffffffff814189e0>] dev_queue_xmit+0x3f5/0x618
[  795.130233]  [<ffffffff814185eb>] ? dev_hard_start_xmit+0x584/0x584
[  795.136756]  [<ffffffff8108f439>] ? trace_hardirqs_on_caller+0x121/0x158
[  795.143743]  [<ffffffffa0084293>] path_rec_completion+0x30d/0x35e [ib_ipoib]
[  795.151062]  [<ffffffffa0069000>] ib_sa_path_rec_callback+0x51/0x75 [ib_sa]
[  795.158292]  [<ffffffffa006833b>] recv_handler+0x41/0x4d [ib_sa]
[  795.164585]  [<ffffffffa0051d7e>] ib_mad_completion_handler+0x44d/0x643 [ib_mad]
[  795.172468]  [<ffffffff8108b885>] ? trace_hardirqs_off+0xd/0xf
[  795.178598]  [<ffffffffa0051931>] ? ib_mad_send_done_handler+0x157/0x157 [ib_mad]
[  795.186548]  [<ffffffff81075b19>] process_one_work+0x205/0x3e7
[  795.192664]  [<ffffffff81075a61>] ? process_one_work+0x14d/0x3e7
[  795.198958]  [<ffffffff8108ceeb>] ? lock_acquired+0x210/0x243
[  795.204991]  [<ffffffff810767c7>] worker_thread+0xda/0x15d
[  795.210745]  [<ffffffff810766ed>] ? manage_workers+0x176/0x176
[  795.216865]  [<ffffffff8107a18d>] kthread+0xa8/0xb0
[  795.222014]  [<ffffffff8150d284>] kernel_thread_helper+0x4/0x10
[  795.228215]  [<ffffffff815046f4>] ? retint_restore_args+0x13/0x13
[  795.234578]  [<ffffffff8107a0e5>] ? __init_kthread_worker+0x5a/0x5a
[  795.241121]  [<ffffffff8150d280>] ? gs_change+0x13/0x13
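
For readers unfamiliar with this class of warning: the trace shows rcu_read_lock_bh held (lock #2) while the check at include/net/dst.h:91 reports no protection. One plausible reading, which this report does not confirm, is that the accessor only accepts the plain RCU read lock. The userspace sketch below is a toy model of that kind of check, not kernel code; every toy_* name and the counters are hypothetical stand-ins invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for read-side nesting counts (hypothetical, not kernel API). */
static int rcu_nesting;     /* models rcu_read_lock() / rcu_read_unlock() */
static int rcu_bh_nesting;  /* models rcu_read_lock_bh() / rcu_read_unlock_bh() */
static int splat_count;     /* number of "without protection" reports so far */

static void toy_rcu_read_lock(void)      { rcu_nesting++; }
static void toy_rcu_read_unlock(void)    { rcu_nesting--; }
static void toy_rcu_read_lock_bh(void)   { rcu_bh_nesting++; }
static void toy_rcu_read_unlock_bh(void) { rcu_bh_nesting--; }

/* Returns p unchanged, but reports when the stated condition is false. */
static void *toy_deref_check(void *p, bool protected_by, const char *where)
{
    if (!protected_by) {
        splat_count++;
        fprintf(stderr, "%s invoked toy_deref_check() without protection!\n",
                where);
    }
    return p;
}

struct toy_dst {
    void *neighbour;
};

/* Accessor that only accepts the plain RCU flavour. */
static void *toy_get_neighbour(struct toy_dst *dst)
{
    return toy_deref_check(dst->neighbour, rcu_nesting > 0,
                           "toy_get_neighbour");
}

/* Accessor that accepts either the plain or the _bh flavour. */
static void *toy_get_neighbour_any(struct toy_dst *dst)
{
    return toy_deref_check(dst->neighbour,
                           rcu_nesting > 0 || rcu_bh_nesting > 0,
                           "toy_get_neighbour_any");
}
```

In this model, calling toy_get_neighbour() between toy_rcu_read_lock_bh() and toy_rcu_read_unlock_bh() produces a report even though a read-side lock is held, while toy_get_neighbour_any() stays quiet. The pointer is returned either way, which mirrors the fact that the message in the log is an INFO-level diagnostic rather than a crash.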

Version-Release number of selected component (if applicable):

kernel-3.1.0-0.rc3.git0.0.fc16.x86_64

How reproducible:

Always

Steps to Reproduce:
1. Ping a host over IPoIB

Comment 1 Dave Jones 2012-02-06 17:37:40 UTC
Is this still happening with the latest update? (You'll likely need the kernel-debug variant installed to check.)

Comment 2 Albert Strasheim 2012-03-04 15:37:35 UTC
As far as I can tell, this is fixed on kernel-debug 3.2.7-1.

Comment 3 Josh Boyer 2012-03-05 15:38:52 UTC
(In reply to comment #2)
> As far as I can tell, this is fixed on kernel-debug 3.2.7-1.

"Fixed" is the wrong resolution, I think.  The IOMMU was disabled by default, which would hide the warning.  If you boot with iommu=on and a kernel-debug kernel, it will probably show back up.

Comment 4 Albert Strasheim 2012-03-05 15:55:19 UTC
Sorry, I wanted to mention that. What is the future of the IOMMU stuff? Why was it disabled in 3.1.6? There are other IB bugs that also trigger with the IOMMU enabled, but I don't know if we should be reporting them if the default is to disable it.

Comment 5 Dave Jones 2012-03-22 16:59:20 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 6 Dave Jones 2012-03-22 17:03:16 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 7 Dave Jones 2012-03-22 17:14:12 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 8 Albert Strasheim 2012-03-23 16:44:01 UTC
looks fixed

Comment 9 Josh Boyer 2012-03-26 21:06:43 UTC
(In reply to comment #8)
> looks fixed

Just to clarify, you tested kernel-debug-3.3.0-4.fc16 and specified iommu=on on the kernel command line?
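
When retesting, the iommu=on condition asked about above can be confirmed from the running system, since Linux exposes the boot command line in /proc/cmdline. The helper below is a minimal sketch, assuming the caller has already read that file into a string; cmdline_has_param is a hypothetical name, not an existing API.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true when `param` appears in `cmdline` as a whole
 * whitespace-delimited token, so "intel_iommu=on" does not match "iommu=on".
 */
static bool cmdline_has_param(const char *cmdline, const char *param)
{
    size_t len = strlen(param);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, param)) != NULL) {
        /* The match must begin at the start of the line or after whitespace. */
        bool starts_token = (p == cmdline) || p[-1] == ' ' || p[-1] == '\t';
        /* The match must be followed by whitespace or the end of the string. */
        char end = p[len];
        bool ends_token = end == '\0' || end == ' ' || end == '\t' || end == '\n';

        if (starts_token && ends_token)
            return true;
        p += 1; /* keep scanning past this partial match */
    }
    return false;
}
```

A caller could read one line from /proc/cmdline with fgets() and pass the buffer along with "iommu=on". Matching whole tokens is a deliberate choice here, because a plain substring search would also accept unrelated options that merely end in the same text.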

Comment 10 Albert Strasheim 2012-03-27 03:55:38 UTC
That is correct.

Comment 11 Josh Boyer 2012-03-27 12:47:17 UTC
Thanks Albert.

