Noticed this in my kernel logs: 100G device mounted with lock_nolock on a hand-compiled kernel running the latest nmw kernel bits as of today.

[ 87.799733] GFS2 installed
[ 87.812359] GFS2: fsid=dm-14: Trying to join cluster "lock_nolock", "dm-14"
[ 87.820199] GFS2: fsid=dm-14: Now mounting FS...
[ 87.825628]
[ 87.827321] ===============================
[ 87.832428] [ INFO: suspicious RCU usage. ]
[ 87.837132] 3.8.0-rc2 #9 Tainted: G W O
[ 87.842216] -------------------------------
[ 87.846913] include/linux/rculist_bl.h:23 suspicious rcu_dereference_check() usage!
[ 87.855482]
[ 87.855482] other info that might help us debug this:
[ 87.855482]
[ 87.864448]
[ 87.864448] rcu_scheduler_active = 1, debug_locks = 0
[ 87.871761] 1 lock held by mount/1197:
[ 87.875973] #0: (&type->s_umount_key#30/1){+.+.+.}, at: [<ffffffff811cfa4d>] sget+0x37d/0x640
[ 87.885893]
[ 87.885893] stack backtrace:
[ 87.890783] Pid: 1197, comm: mount Tainted: G W O 3.8.0-rc2 #9
[ 87.898099] Call Trace:
[ 87.900861] [<ffffffff810d9f8d>] lockdep_rcu_suspicious+0xfd/0x130
[ 87.907917] [<ffffffffa03582c8>] search_bucket+0x138/0x180 [gfs2]
[ 87.914854] [<ffffffffa03592e8>] gfs2_glock_get+0x618/0x770 [gfs2]
[ 87.921896] [<ffffffffa0358cd5>] ? gfs2_glock_get+0x5/0x770 [gfs2]
[ 87.928931] [<ffffffffa035b7b0>] gfs2_glock_nq_num+0x30/0xa0 [gfs2]
[ 87.936068] [<ffffffffa0367b7d>] fill_super+0x64d/0xe40 [gfs2]
[ 87.942720] [<ffffffff81358104>] ? snprintf+0x34/0x40
[ 87.948479] [<ffffffff816bc14d>] ? __mutex_unlock_slowpath+0xdd/0x180
[ 87.955796] [<ffffffffa03685f3>] gfs2_mount+0x283/0x2e0 [gfs2]
[ 87.962431] [<ffffffff811d0cc3>] mount_fs+0x43/0x1b0
[ 87.968091] [<ffffffff8118b2b0>] ? __alloc_percpu+0x10/0x20
[ 87.974433] [<ffffffff811eff13>] vfs_kern_mount+0x73/0x110
[ 87.980676] [<ffffffff811f26e6>] do_mount+0x216/0xa70
[ 87.986432] [<ffffffff8118501b>] ? memdup_user+0x4b/0x90
[ 87.992475] [<ffffffff811850bb>] ? strndup_user+0x5b/0x80
[ 87.998620] [<ffffffff811f2fce>] sys_mount+0x8e/0xe0
[ 88.004281] [<ffffffff816c8899>] system_call_fastpath+0x16/0x1b
[ 88.076900] GFS2: fsid=dm-14.0: jid=0, already locked for use
[ 88.083342] GFS2: fsid=dm-14.0: jid=0: Looking at journal...
[ 91.208829] GFS2: fsid=dm-14.0: jid=0: Done
[ 91.213559] GFS2: fsid=dm-14.0: first mount done, others may mount
This looks like something that needs fixing upstream. Notice that rcu_dereference_check() will check for rcu_read_lock() but that is all. So when we repeat the search for an existing glock under the spinlock, the warning will trigger. I think the bug is probably in rculist_bl.h, since we have:

    static inline struct hlist_bl_node *hlist_bl_first_rcu(struct hlist_bl_head *h)
    {
            return (struct hlist_bl_node *)
                    ((unsigned long)rcu_dereference(h->first) & ~LIST_BL_LOCKMASK);
    }

What we probably need to do is add a function hlist_bl_is_locked() and then have:

    static inline struct hlist_bl_node *hlist_bl_first_rcu(struct hlist_bl_head *h)
    {
            return (struct hlist_bl_node *)
                    ((unsigned long)rcu_dereference_check(h->first, hlist_bl_is_locked(h)) & ~LIST_BL_LOCKMASK);
    }

or something along those lines.
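To make the idea above concrete, here is a minimal userspace sketch of a list head whose bit 0 doubles as the lock bit, with an "is locked" predicate that an access check can accept as an alternative to being inside a read-side section. This is not kernel code: every model_* name is hypothetical, the in_rcu_read_section flag is only a stand-in for what lockdep tracks, and the lock here is a plain single-threaded bit flip rather than a real bit spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of LIST_BL_LOCKMASK: bit 0 of the head word. */
#define MODEL_LOCKMASK ((uintptr_t)1)

struct model_node { struct model_node *next; };
struct model_head { uintptr_t first; };  /* pointer and lock bit share one word */

/* Single-threaded stand-ins for taking and dropping the bucket lock. */
static inline void model_lock(struct model_head *h)   { h->first |= MODEL_LOCKMASK; }
static inline void model_unlock(struct model_head *h) { h->first &= ~MODEL_LOCKMASK; }

/* The predicate the comment proposes adding (hlist_bl_is_locked() there). */
static inline bool model_is_locked(const struct model_head *h)
{
    return (h->first & MODEL_LOCKMASK) != 0;
}

/* Read the first node, masking off the lock bit as hlist_bl_first_rcu() does. */
static inline struct model_node *model_first(const struct model_head *h)
{
    return (struct model_node *)(h->first & ~MODEL_LOCKMASK);
}

/* Insert at the head while preserving the current state of the lock bit. */
static inline void model_add_head(struct model_head *h, struct model_node *n)
{
    assert(((uintptr_t)n & MODEL_LOCKMASK) == 0);  /* bit 0 must be free */
    n->next = model_first(h);
    h->first = (uintptr_t)n | (h->first & MODEL_LOCKMASK);
}

/*
 * Stand-in for the condition passed to rcu_dereference_check(): the access
 * is acceptable inside a read-side section OR while the bucket lock is held.
 */
static inline bool model_access_ok(const struct model_head *h,
                                   bool in_rcu_read_section)
{
    return in_rcu_read_section || model_is_locked(h);
}
```

In this model, a lookup repeated under the bucket lock but outside a read-side section passes model_access_ok() only because of the model_is_locked() term, which is the case the original check does not cover.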
Created attachment 690282
Proposed fix

Does this fix the issue for you?
The above patch seems to fix the issue. I've seen the messages before on the first mount after a reboot, but I don't see them anymore with this patch.
I've posted the proposed fix upstream.
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19
Is this still a problem with 3.9 based F19 kernels?
I think so - patch was scheduled for the next merge window upstream last I saw
Fix in upstream kernel:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/include/linux/list_bl.h?id=49d0de082c31de34cc896c14eec5f1c2ade0415a

So it will be in the 3.10 kernel.