Description of problem: [ INFO: soft-safe -> soft-unsafe lock order detected ] 2.6.19-1.2891.fc7 #1 ------------------------------------------------------ cupsd/1884 [HC0[0]:SC0[1]:HE1:SE0] is trying to acquire: (&ssec->nlbl_lock){--..}, at: [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123 and this task is already holding: (af_callback_keys + sk->sk_family#3){-.-+}, at: [<c05daa1c>] inet_accept+0x70/0xb5 which would create a new lock dependency: (af_callback_keys + sk->sk_family#3){-.-+} -> (&ssec->nlbl_lock){--..} but this new dependency connects a soft-irq-safe lock: (af_callback_keys + sk->sk_family#3){-.-+} ... which became soft-irq-safe at: [<c043fff1>] __lock_acquire+0x37d/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbdb6>] _read_lock_bh+0x30/0x3d [<c04c687e>] selinux_socket_sock_rcv_skb+0xbd/0x252 [<c05d0645>] tcp_v4_rcv+0x37a/0x909 [<c05b7593>] ip_local_deliver+0x185/0x22e [<c05b73d6>] ip_rcv+0x418/0x450 [<c059ae9c>] netif_receive_skb+0x2db/0x35a [<c059c85f>] process_backlog+0x95/0xf6 [<c059ca46>] net_rx_action+0xa1/0x1a8 [<c042bf5a>] __do_softirq+0x6f/0xe2 [<c04063a1>] do_softirq+0x61/0xc7 [<ffffffff>] 0xffffffff to a soft-irq-unsafe lock: (&ssec->nlbl_lock){--..} ... which became soft-irq-unsafe at: ... [<c044007d>] __lock_acquire+0x409/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbc89>] _spin_lock+0x2b/0x38 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123 [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f [<c04c807b>] selinux_socket_post_create+0x156/0x15c [<c059213c>] __sock_create+0x179/0x1b2 [<c05921ae>] sock_create+0x1a/0x1f [<c0592435>] sys_socket+0x1b/0x3c [<c0592cba>] sys_socketcall+0x77/0x241 [<c0404050>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff other info that might help us debug this: 3 locks held by cupsd/1884: #0: (sk_lock-AF_INET){--..}, at: [<c05da9dd>] inet_accept+0x31/0xb5 #1: (af_callback_keys + sk->sk_family#3){-.-+}, at: [<c05daa1c>] inet_accept+0x70/0xb5 #2: (policy_rwlock){..-?}, at: [<c04cebd0>] selinux_netlbl_socket_setsid+0x54/0x123 the soft-irq-safe lock's dependencies: -> (af_callback_keys + sk->sk_family#3){-.-+} ops: 0 { initial-use at: [<c0440098>] __lock_acquire+0x424/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbd3b>] _write_lock_bh+0x30/0x3d [<c05c1871>] tcp_close+0x24b/0x52b [<c05da570>] inet_release+0x43/0x49 [<c0591f72>] sock_release+0x17/0x68 [<c05921e0>] sock_close+0x2d/0x33 [<c047959b>] __fput+0xbe/0x168 [<c047965c>] fput+0x17/0x19 [<c0476f5d>] filp_close+0x54/0x5c [<c0477ee1>] sys_close+0x78/0xb0 [<c0404050>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff hardirq-on-W at: [<c0440056>] __lock_acquire+0x3e2/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbd3b>] _write_lock_bh+0x30/0x3d [<c05c1871>] tcp_close+0x24b/0x52b [<c05da570>] inet_release+0x43/0x49 [<c0591f72>] sock_release+0x17/0x68 [<c05921e0>] sock_close+0x2d/0x33 [<c047959b>] __fput+0xbe/0x168 [<c047965c>] fput+0x17/0x19 [<c0476f5d>] filp_close+0x54/0x5c [<c0477ee1>] sys_close+0x78/0xb0 [<c0404050>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff in-softirq-R at: [<c043fff1>] __lock_acquire+0x37d/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbdb6>] _read_lock_bh+0x30/0x3d [<c04c687e>] selinux_socket_sock_rcv_skb+0xbd/0x252 [<c05d0645>] tcp_v4_rcv+0x37a/0x909 [<c05b7593>] ip_local_deliver+0x185/0x22e [<c05b73d6>] ip_rcv+0x418/0x450 [<c059ae9c>] netif_receive_skb+0x2db/0x35a [<c059c85f>] process_backlog+0x95/0xf6 [<c059ca46>] net_rx_action+0xa1/0x1a8 [<c042bf5a>] __do_softirq+0x6f/0xe2 [<c04063a1>] do_softirq+0x61/0xc7 [<ffffffff>] 0xffffffff hardirq-on-R at: [<c044001f>] __lock_acquire+0x3ab/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbdb6>] _read_lock_bh+0x30/0x3d [<c04c687e>] selinux_socket_sock_rcv_skb+0xbd/0x252 [<c05d0645>] tcp_v4_rcv+0x37a/0x909 [<c05b7593>] ip_local_deliver+0x185/0x22e [<c05b73d6>] ip_rcv+0x418/0x450 [<c059ae9c>] netif_receive_skb+0x2db/0x35a [<c059c85f>] process_backlog+0x95/0xf6 [<c059ca46>] net_rx_action+0xa1/0x1a8 [<c042bf5a>] __do_softirq+0x6f/0xe2 [<c04063a1>] do_softirq+0x61/0xc7 [<ffffffff>] 0xffffffff } ... key at: [<c09f6770>] af_callback_keys+0x10/0x100 the soft-irq-unsafe lock's dependencies: -> (&ssec->nlbl_lock){--..} ops: 0 { initial-use at: [<c0440098>] __lock_acquire+0x424/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbc89>] _spin_lock+0x2b/0x38 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123 [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f [<c04c807b>] selinux_socket_post_create+0x156/0x15c [<c059213c>] __sock_create+0x179/0x1b2 [<c05921ae>] sock_create+0x1a/0x1f [<c0592435>] sys_socket+0x1b/0x3c [<c0592cba>] sys_socketcall+0x77/0x241 [<c0404050>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff softirq-on-W at: [<c044007d>] __lock_acquire+0x409/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbc89>] _spin_lock+0x2b/0x38 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123 [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f [<c04c807b>] selinux_socket_post_create+0x156/0x15c [<c059213c>] __sock_create+0x179/0x1b2 [<c05921ae>] sock_create+0x1a/0x1f [<c0592435>] sys_socket+0x1b/0x3c [<c0592cba>] sys_socketcall+0x77/0x241 [<c0404050>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff hardirq-on-W at: [<c0440056>] __lock_acquire+0x3e2/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbc89>] _spin_lock+0x2b/0x38 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123 [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f [<c04c807b>] selinux_socket_post_create+0x156/0x15c [<c059213c>] __sock_create+0x179/0x1b2 [<c05921ae>] sock_create+0x1a/0x1f [<c0592435>] sys_socket+0x1b/0x3c [<c0592cba>] sys_socketcall+0x77/0x241 [<c0404050>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff } ... key at: [<c09ebb20>] __key.27522+0x0/0x8 stack backtrace: [<c04051c1>] show_trace_log_lvl+0x1a/0x2f [<c0405766>] show_trace+0x12/0x14 [<c04057ea>] dump_stack+0x16/0x18 [<c043fbe0>] check_usage+0x242/0x24c [<c04404f3>] __lock_acquire+0x87f/0x9f8 [<c044094d>] lock_acquire+0x56/0x6f [<c05fbc89>] _spin_lock+0x2b/0x38 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123 [<c04d0f40>] selinux_netlbl_sock_graft+0xc7/0xcf [<c04c4b74>] selinux_sock_graft+0x32/0x36 [<c05daa3b>] inet_accept+0x8f/0xb5 [<c0592b93>] sys_accept+0xd0/0x180 [<c0592d13>] sys_socketcall+0xd0/0x241 [<c0404050>] syscall_call+0x7/0xb Version-Release number of selected component (if applicable): 2.6.19-1.2891.fc7 How reproducible: This seems to happen every time.
Paul, Eric...any ideas? Are we missing some patches for 2.6.19 that went into netdev tree that might account for this problem? Thanks.
I'm currently chasing two other bugs right now, but I'll grab the latest FC/Rawhide kernel sources and take a look sometime this week. Thanks for capturing all of the debug information Miles.
I believe the following patch should correct this problem: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9883a13c72dbf8c518814b6091019643cdb34429 I'll try rebuilding the 2.6.19-1.2891.fc7 kernel with this patch applied and see what I get. In the meantime if Miles wants to give it a shot too that would be helpful.
there should be some test rawhide kernels with the patch in question at http://people.redhat.com/eparis/BZ220966 please let us know what you find with these kernels
I believe there patch I mentioned in comment #3 does not entirely solve this problem as there is still a softirq "unsafe" spinlock in selinux_netlbl_socket_setsid(). I am currently working on a patch for to use the correct locking calls, I will update the BZ when I have a patch.
Created attachment 144868 [details] Patch to David Miller's net-2.6 git tree This patch should address the problem described here, however, due to other problems with the 2.6.19-1.2891.fc7 I am having difficulty testing this to verify that it both solves the problem and does not introduce any new regressions. I am in the process of building a new kernel, I'll update once I know more.
I am currently on a business trip and will need to look into these patches when I get home. Thanks.
I was able to get a kernel working with this fix today and I have been playing with it for the past couple of hours and it appears to fix the problem for me. I'll be pushing this patch upstream later today.
my people page now has fedora kernels with both of these fixes. will wait for reporters comments before claiming it is 'fixed upstream'
this bug seems to duplicate 220726
*** Bug 220726 has been marked as a duplicate of this bug. ***
(In reply to comment #10) > this bug seems to duplicate 220726 Hi dex, Can you verify that Eric's test kernels solve the lockdep problem you are seeing? Thanks.
uname -a Linux dexterFC5t1 2.6.19-1.2904.2.2.fc7.2.BZ220966.TESTING.ONLY #0 SMP Fri Jan 5 14:02:45 EST 2007 i686 athlon i386 GNU/Linux testing ... so far so good a couple of hours up all's quiet
Great, thanks for testing the fix. If you see anything strange be sure to post an update.
from #13 uptime 08:41:39 up 14:32, 0 users, load average: 0.30, 0.34, 0.28 no sign of bug here under normal daily load well done.
*** Bug 223649 has been marked as a duplicate of this bug. ***
Moving back to waiting on me to check the upstream and FC devel kernel status of this fix.
closing as this is fixed in the shipping FC6 kernels.