Bug 220966 - NetLabel: 2.6.19-1.2891.fc7 - INFO: soft-safe -> soft-unsafe lock order detected
NetLabel: 2.6.19-1.2891.fc7 - INFO: soft-safe -> soft-unsafe lock order detected
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
rawhide
i386 Linux
medium Severity medium
: ---
: ---
Assigned To: Eric Paris
Brian Brock
:
: 220726 223649 (view as bug list)
Depends On:
Blocks: 221504
  Show dependency treegraph
 
Reported: 2006-12-29 13:01 EST by Miles Lane
Modified: 2007-11-30 17:11 EST (History)
9 users (show)

See Also:
Fixed In Version: FC6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2007-03-13 15:48:34 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Patch to David Miller's net-2.6 git tree (1.04 KB, patch)
2007-01-04 18:54 EST, Paul Moore
no flags Details | Diff

  None (edit)
Description Miles Lane 2006-12-29 13:01:50 EST
Description of problem:

[ INFO: soft-safe -> soft-unsafe lock order detected ]
2.6.19-1.2891.fc7 #1
------------------------------------------------------
cupsd/1884 [HC0[0]:SC0[1]:HE1:SE0] is trying to acquire:
 (&ssec->nlbl_lock){--..}, at: [<c04cec37>]
selinux_netlbl_socket_setsid+0xbb/0x123

and this task is already holding:
 (af_callback_keys + sk->sk_family#3){-.-+}, at: [<c05daa1c>]
inet_accept+0x70/0xb5
which would create a new lock dependency:
 (af_callback_keys + sk->sk_family#3){-.-+} -> (&ssec->nlbl_lock){--..}

but this new dependency connects a soft-irq-safe lock:
 (af_callback_keys + sk->sk_family#3){-.-+}
... which became soft-irq-safe at:
 [<c043fff1>] __lock_acquire+0x37d/0x9f8
 [<c044094d>] lock_acquire+0x56/0x6f
 [<c05fbdb6>] _read_lock_bh+0x30/0x3d
 [<c04c687e>] selinux_socket_sock_rcv_skb+0xbd/0x252
 [<c05d0645>] tcp_v4_rcv+0x37a/0x909
 [<c05b7593>] ip_local_deliver+0x185/0x22e
 [<c05b73d6>] ip_rcv+0x418/0x450
 [<c059ae9c>] netif_receive_skb+0x2db/0x35a
 [<c059c85f>] process_backlog+0x95/0xf6
 [<c059ca46>] net_rx_action+0xa1/0x1a8
 [<c042bf5a>] __do_softirq+0x6f/0xe2
 [<c04063a1>] do_softirq+0x61/0xc7
 [<ffffffff>] 0xffffffff

to a soft-irq-unsafe lock:
 (&ssec->nlbl_lock){--..}
... which became soft-irq-unsafe at:
...  [<c044007d>] __lock_acquire+0x409/0x9f8
 [<c044094d>] lock_acquire+0x56/0x6f
 [<c05fbc89>] _spin_lock+0x2b/0x38
 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123
 [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f
 [<c04c807b>] selinux_socket_post_create+0x156/0x15c
 [<c059213c>] __sock_create+0x179/0x1b2
 [<c05921ae>] sock_create+0x1a/0x1f
 [<c0592435>] sys_socket+0x1b/0x3c
 [<c0592cba>] sys_socketcall+0x77/0x241
 [<c0404050>] syscall_call+0x7/0xb
 [<ffffffff>] 0xffffffff

other info that might help us debug this:

3 locks held by cupsd/1884:
 #0:  (sk_lock-AF_INET){--..}, at: [<c05da9dd>] inet_accept+0x31/0xb5
 #1:  (af_callback_keys + sk->sk_family#3){-.-+}, at: [<c05daa1c>]
inet_accept+0x70/0xb5
 #2:  (policy_rwlock){..-?}, at: [<c04cebd0>]
selinux_netlbl_socket_setsid+0x54/0x123

the soft-irq-safe lock's dependencies:
-> (af_callback_keys + sk->sk_family#3){-.-+} ops: 0 {
  initial-use  at:
                       [<c0440098>] __lock_acquire+0x424/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbd3b>] _write_lock_bh+0x30/0x3d
                       [<c05c1871>] tcp_close+0x24b/0x52b
                       [<c05da570>] inet_release+0x43/0x49
                       [<c0591f72>] sock_release+0x17/0x68
                       [<c05921e0>] sock_close+0x2d/0x33
                       [<c047959b>] __fput+0xbe/0x168
                       [<c047965c>] fput+0x17/0x19
                       [<c0476f5d>] filp_close+0x54/0x5c
                       [<c0477ee1>] sys_close+0x78/0xb0
                       [<c0404050>] syscall_call+0x7/0xb
                       [<ffffffff>] 0xffffffff
  hardirq-on-W at:
                       [<c0440056>] __lock_acquire+0x3e2/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbd3b>] _write_lock_bh+0x30/0x3d
                       [<c05c1871>] tcp_close+0x24b/0x52b
                       [<c05da570>] inet_release+0x43/0x49
                       [<c0591f72>] sock_release+0x17/0x68
                       [<c05921e0>] sock_close+0x2d/0x33
                       [<c047959b>] __fput+0xbe/0x168
                       [<c047965c>] fput+0x17/0x19
                       [<c0476f5d>] filp_close+0x54/0x5c
                       [<c0477ee1>] sys_close+0x78/0xb0
                       [<c0404050>] syscall_call+0x7/0xb
                       [<ffffffff>] 0xffffffff
  in-softirq-R at:
                       [<c043fff1>] __lock_acquire+0x37d/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbdb6>] _read_lock_bh+0x30/0x3d
                       [<c04c687e>] selinux_socket_sock_rcv_skb+0xbd/0x252
                       [<c05d0645>] tcp_v4_rcv+0x37a/0x909
                       [<c05b7593>] ip_local_deliver+0x185/0x22e
                       [<c05b73d6>] ip_rcv+0x418/0x450
                       [<c059ae9c>] netif_receive_skb+0x2db/0x35a
                       [<c059c85f>] process_backlog+0x95/0xf6
                       [<c059ca46>] net_rx_action+0xa1/0x1a8
                       [<c042bf5a>] __do_softirq+0x6f/0xe2
                       [<c04063a1>] do_softirq+0x61/0xc7
                       [<ffffffff>] 0xffffffff
  hardirq-on-R at:
                       [<c044001f>] __lock_acquire+0x3ab/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbdb6>] _read_lock_bh+0x30/0x3d
                       [<c04c687e>] selinux_socket_sock_rcv_skb+0xbd/0x252
                       [<c05d0645>] tcp_v4_rcv+0x37a/0x909
                       [<c05b7593>] ip_local_deliver+0x185/0x22e
                       [<c05b73d6>] ip_rcv+0x418/0x450
                       [<c059ae9c>] netif_receive_skb+0x2db/0x35a
                       [<c059c85f>] process_backlog+0x95/0xf6
                       [<c059ca46>] net_rx_action+0xa1/0x1a8
                       [<c042bf5a>] __do_softirq+0x6f/0xe2
                       [<c04063a1>] do_softirq+0x61/0xc7
                       [<ffffffff>] 0xffffffff
 }
 ... key      at: [<c09f6770>] af_callback_keys+0x10/0x100

the soft-irq-unsafe lock's dependencies:
-> (&ssec->nlbl_lock){--..} ops: 0 {
  initial-use  at:
                       [<c0440098>] __lock_acquire+0x424/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbc89>] _spin_lock+0x2b/0x38
                       [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123
                       [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f
                       [<c04c807b>] selinux_socket_post_create+0x156/0x15c
                       [<c059213c>] __sock_create+0x179/0x1b2
                       [<c05921ae>] sock_create+0x1a/0x1f
                       [<c0592435>] sys_socket+0x1b/0x3c
                       [<c0592cba>] sys_socketcall+0x77/0x241
                       [<c0404050>] syscall_call+0x7/0xb
                       [<ffffffff>] 0xffffffff
  softirq-on-W at:
                       [<c044007d>] __lock_acquire+0x409/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbc89>] _spin_lock+0x2b/0x38
                       [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123
                       [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f
                       [<c04c807b>] selinux_socket_post_create+0x156/0x15c
                       [<c059213c>] __sock_create+0x179/0x1b2
                       [<c05921ae>] sock_create+0x1a/0x1f
                       [<c0592435>] sys_socket+0x1b/0x3c
                       [<c0592cba>] sys_socketcall+0x77/0x241
                       [<c0404050>] syscall_call+0x7/0xb
                       [<ffffffff>] 0xffffffff
  hardirq-on-W at:
                       [<c0440056>] __lock_acquire+0x3e2/0x9f8
                       [<c044094d>] lock_acquire+0x56/0x6f
                       [<c05fbc89>] _spin_lock+0x2b/0x38
                       [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123
                       [<c04d0c92>] selinux_netlbl_socket_post_create+0x2d/0x2f
                       [<c04c807b>] selinux_socket_post_create+0x156/0x15c
                       [<c059213c>] __sock_create+0x179/0x1b2
                       [<c05921ae>] sock_create+0x1a/0x1f
                       [<c0592435>] sys_socket+0x1b/0x3c
                       [<c0592cba>] sys_socketcall+0x77/0x241
                       [<c0404050>] syscall_call+0x7/0xb
                       [<ffffffff>] 0xffffffff
 }
 ... key      at: [<c09ebb20>] __key.27522+0x0/0x8

stack backtrace:
 [<c04051c1>] show_trace_log_lvl+0x1a/0x2f
 [<c0405766>] show_trace+0x12/0x14
 [<c04057ea>] dump_stack+0x16/0x18
 [<c043fbe0>] check_usage+0x242/0x24c
 [<c04404f3>] __lock_acquire+0x87f/0x9f8
 [<c044094d>] lock_acquire+0x56/0x6f
 [<c05fbc89>] _spin_lock+0x2b/0x38
 [<c04cec37>] selinux_netlbl_socket_setsid+0xbb/0x123
 [<c04d0f40>] selinux_netlbl_sock_graft+0xc7/0xcf
 [<c04c4b74>] selinux_sock_graft+0x32/0x36
 [<c05daa3b>] inet_accept+0x8f/0xb5
 [<c0592b93>] sys_accept+0xd0/0x180
 [<c0592d13>] sys_socketcall+0xd0/0x241
 [<c0404050>] syscall_call+0x7/0xb

Version-Release number of selected component (if applicable):
2.6.19-1.2891.fc7

How reproducible:
This seems to happen every time.
Comment 1 Steve Grubb 2006-12-30 07:47:29 EST
Paul, Eric...any ideas? Are we missing some patches for 2.6.19  that went into
netdev tree that might account for this problem? Thanks.
Comment 2 Paul Moore 2007-01-02 16:31:17 EST
I'm currently chasing two other bugs right now, but I'll grab the latest 
FC/Rawhide kernel sources and take a look sometime this week.  Thanks for 
capturing all of the debug information Miles.
Comment 3 Paul Moore 2007-01-03 15:48:58 EST
I believe the following patch should correct this problem:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9883a13c72dbf8c518814b6091019643cdb34429

I'll try rebuilding the 2.6.19-1.2891.fc7 kernel with this patch applied and 
see what I get.  In the meantime if Miles wants to give it a shot too that 
would be helpful.
Comment 4 Eric Paris 2007-01-04 12:22:27 EST
there should be some test rawhide kernels with the patch in question at
http://people.redhat.com/eparis/BZ220966

please let us know what you find with these kernels
Comment 5 Paul Moore 2007-01-04 17:27:30 EST
I believe there patch I mentioned in comment #3 does not entirely solve this 
problem as there is still a softirq "unsafe" spinlock in 
selinux_netlbl_socket_setsid().  I am currently working on a patch for to use 
the correct locking calls, I will update the BZ when I have a patch.
Comment 6 Paul Moore 2007-01-04 18:54:08 EST
Created attachment 144868 [details]
Patch to David Miller's net-2.6 git tree

This patch should address the problem described here, however, due to other
problems with the 2.6.19-1.2891.fc7 I am having difficulty testing this to
verify that it both solves the problem and does not introduce any new
regressions.

I am in the process of building a new kernel, I'll update once I know more.
Comment 7 Miles Lane 2007-01-04 22:55:38 EST
I am currently on a business trip and will need to look into these patches when
I get home.  Thanks.
Comment 8 Paul Moore 2007-01-05 14:56:54 EST
I was able to get a kernel working with this fix today and I have been playing 
with it for the past couple of hours and it appears to fix the problem for me.  
I'll be pushing this patch upstream later today.
Comment 9 Eric Paris 2007-01-08 14:30:54 EST
my people page now has fedora kernels with both of these fixes.  will wait for
reporters comments before claiming it is 'fixed upstream'
Comment 10 dex 2007-01-09 05:44:35 EST
this bug seems to duplicate 220726
Comment 11 Eric Paris 2007-01-09 09:40:10 EST
*** Bug 220726 has been marked as a duplicate of this bug. ***
Comment 12 Paul Moore 2007-01-09 11:40:19 EST
(In reply to comment #10)
> this bug seems to duplicate 220726

Hi dex,

Can you verify that Eric's test kernels solve the lockdep problem you are 
seeing?

Thanks.
Comment 13 dex 2007-01-09 14:07:49 EST
uname -a
Linux dexterFC5t1 2.6.19-1.2904.2.2.fc7.2.BZ220966.TESTING.ONLY #0 SMP Fri Jan 5
14:02:45 EST 2007 i686 athlon i386 GNU/Linux

testing ... so far so good a couple of hours up all's quiet 
Comment 14 Paul Moore 2007-01-09 14:10:58 EST
Great, thanks for testing the fix.  If you see anything strange be sure to 
post an update.
Comment 15 dex 2007-01-10 03:46:15 EST
from #13

uptime
 08:41:39 up 14:32,  0 users,  load average: 0.30, 0.34, 0.28

no sign of bug here under normal daily load well done.
Comment 16 Eric Paris 2007-02-14 10:41:19 EST
*** Bug 223649 has been marked as a duplicate of this bug. ***
Comment 17 Eric Paris 2007-02-14 10:44:59 EST
Moving back to waiting on me to check the upstream and FC devel kernel status of
this fix.
Comment 18 Eric Paris 2007-03-13 15:48:34 EDT
closing as this is fixed in the shipping FC6 kernels.

Note You need to log in before you can comment on or make changes to this bug.