Description of problem:
While starting cluster services on all nodes in the cluster, one node produced the following warning message.

[root@dash-01 ~]# service cman start
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Setting network parameters... done
   Starting cman... done
   Starting daemons...
=============================================
[ INFO: possible recursive locking detected ]
2.6.29-0.53.rc2.git1.fc11.x86_64 #1
---------------------------------------------
dlm_controld/5751 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#12/2){--..}, at: [<ffffffffa01c6c48>] configfs_attach_group+0x4a/0x183 [configfs]

but task is already holding lock:
 (&sb->s_type->i_mutex_key#12/2){--..}, at: [<ffffffffa01c6c48>] configfs_attach_group+0x4a/0x183 [configfs]

other info that might help us debug this:
2 locks held by dlm_controld/5751:
 #0:  (&sb->s_type->i_mutex_key#11/1){--..}, at: [<ffffffff810e794b>] lookup_create+0x26/0x94
 #1:  (&sb->s_type->i_mutex_key#12/2){--..}, at: [<ffffffffa01c6c48>] configfs_attach_group+0x4a/0x183 [configfs]

stack backtrace:
Pid: 5751, comm: dlm_controld Not tainted 2.6.29-0.53.rc2.git1.fc11.x86_64 #1
Call Trace:
 [<ffffffff8106e715>] __lock_acquire+0x863/0xc41
 [<ffffffff8106eb80>] lock_acquire+0x8d/0xba
 [<ffffffffa01c6c48>] ? configfs_attach_group+0x4a/0x183 [configfs]
 [<ffffffff813818aa>] __mutex_lock_common+0x107/0x39c
 [<ffffffffa01c6c48>] ? configfs_attach_group+0x4a/0x183 [configfs]
 [<ffffffff8138308b>] ? _spin_unlock+0x26/0x2a
 [<ffffffffa01c6c48>] ? configfs_attach_group+0x4a/0x183 [configfs]
 [<ffffffff81381be8>] mutex_lock_nested+0x35/0x3a
 [<ffffffffa01c6c48>] configfs_attach_group+0x4a/0x183 [configfs]
 [<ffffffff8138308b>] ? _spin_unlock+0x26/0x2a
 [<ffffffffa01c6cf8>] configfs_attach_group+0xfa/0x183 [configfs]
 [<ffffffffa01c6fbc>] configfs_mkdir+0x23b/0x326 [configfs]
 [<ffffffff810e7c1e>] vfs_mkdir+0x6c/0xbb
 [<ffffffff810e9922>] sys_mkdirat+0xa2/0xf5
 [<ffffffff8101130a>] ? sysret_check+0x46/0x81
 [<ffffffff8106d719>] ? trace_hardirqs_on_caller+0x12f/0x153
 [<ffffffff810e9988>] sys_mkdir+0x13/0x15
 [<ffffffff810112ba>] system_call_fastpath+0x16/0x1b
done
   Starting fencing... done
[ OK ]

Version-Release number of selected component (if applicable):
kernel-2.6.29-0.53.rc2.git1.fc11.x86_64
cman-3.0.0-4.alpha3.fc11.x86_64

How reproducible:
Unknown

Steps to Reproduce:
1. service cman start

Actual results:
See above

Expected results:
[root@dash-01 ~]# service cman start
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Setting network parameters... done
   Starting cman... done
   Starting daemons... done
   Starting fencing... done
[ OK ]

Additional info:
We already know about this one. It should really be filed against configfs since that is where the problem lies. There is a long correspondence on lkml about it, and it might even get fixed shortly.
Looks like the "fix" for this has hit upstream now.
Reassigning to Fabio (he does the fc11 builds).
This is a kernel bug, and it has been fixed afaict; at least I can't reproduce it any longer on my test machines.