Bug 168421 - kernel: BUG: rwlock recursion on CPU#0, ntpd/2120
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: David Miller
QA Contact: Brian Brock

Reported: 2005-09-15 16:58 EDT by Tom London
Modified: 2007-11-30 17:11 EST (History)

Doc Type: Bug Fix
Last Closed: 2005-09-28 05:31:34 EDT

Attachments
Upstream patch for event cache locking bug (7.07 KB, patch)
2005-09-27 18:35 EDT, David Miller

Description Tom London 2005-09-15 16:58:41 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b3) Gecko/20050827 Fedora/1.1-0.2.8.deerpark.alpha2 Firefox/1.0+

Description of problem:
Sep 15 13:01:26 localhost kernel: BUG: rwlock recursion on CPU#0, ntpd/2120, f8adbbe0 (Tainted: P     )
Sep 15 13:01:26 localhost kernel:  [<c01d6bbc>] _raw_write_lock+0x56/0x58
Sep 15 13:01:26 localhost kernel:  [<f8ad0f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
Sep 15 13:01:26 localhost kernel:  [<f8ad0b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
Sep 15 13:01:26 localhost kernel:  [<f8ad2418>] ip_ct_refresh_acct+0xf3/0x104 [ip_conntrack]
Sep 15 13:01:26 localhost kernel:  [<f8ad3c56>] udp_packet+0x23/0x81 [ip_conntrack]
Sep 15 13:01:26 localhost kernel:  [<f8ad1a52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
Sep 15 13:01:26 localhost kernel:  [<c030bded>] nf_iterate+0x60/0x84
Sep 15 13:01:26 localhost kernel:  [<c02db4bc>] dst_output+0x0/0x18
Sep 15 13:01:26 localhost kernel:  [<c030be5e>] nf_hook_slow+0x4d/0xf2
Sep 15 13:01:26 localhost kernel:  [<c02db4bc>] dst_output+0x0/0x18
Sep 15 13:01:26 localhost kernel:  [<c02ddcbf>] ip_push_pending_frames+0x2c8/0x445
Sep 15 13:01:26 localhost kernel:  [<c02db4bc>] dst_output+0x0/0x18
Sep 15 13:01:26 localhost kernel:  [<c02f68a8>] udp_push_pending_frames+0x131/0x26b
Sep 15 13:01:26 localhost kernel:  [<c02f6dc6>] udp_sendmsg+0x3a8/0x708
Sep 15 13:01:26 localhost kernel:  [<c02fdabe>] inet_sendmsg+0x2b/0x49
Sep 15 13:01:26 localhost kernel:  [<c02b4b5f>] sock_sendmsg+0xe4/0xff
Sep 15 13:01:26 localhost kernel:  [<c01045b4>] do_IRQ+0x51/0x82
Sep 15 13:01:26 localhost kernel:  [<c01463e5>] check_poison_obj+0x24/0x17e
Sep 15 13:01:26 localhost kernel:  [<c012dd26>] autoremove_wake_function+0x0/0x37
Sep 15 13:01:26 localhost kernel:  [<c0145b6c>] dbg_redzone1+0xe/0x1f
Sep 15 13:01:26 localhost kernel:  [<c0147d39>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 15 13:01:26 localhost kernel:  [<c0148201>] __kmalloc+0x90/0xd9
Sep 15 13:01:26 localhost kernel:  [<c013dc1d>] audit_sockaddr+0x39/0x78
Sep 15 13:01:26 localhost kernel:  [<c013dc1d>] audit_sockaddr+0x39/0x78
Sep 15 13:01:26 localhost kernel:  [<c02b5ecb>] sys_sendto+0xf5/0x116
Sep 15 13:01:26 localhost kernel:  [<c031c371>] _spin_unlock_irq+0x5/0x7
Sep 15 13:01:26 localhost kernel:  [<c0102c47>] handle_signal+0xeb/0x136
Sep 15 13:01:26 localhost kernel:  [<c01463e5>] check_poison_obj+0x24/0x17e
Sep 15 13:01:26 localhost kernel:  [<c01461e4>] poison_obj+0x20/0x3d
Sep 15 13:01:26 localhost kernel:  [<c0145b6c>] dbg_redzone1+0xe/0x1f
Sep 15 13:01:26 localhost kernel:  [<c0145b6c>] dbg_redzone1+0xe/0x1f
Sep 15 13:01:26 localhost kernel:  [<c0147d39>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 15 13:01:26 localhost kernel:  [<c02b67c9>] sys_socketcall+0x1b9/0x292
Sep 15 13:01:26 localhost kernel:  [<c0102ecd>] syscall_call+0x7/0xb



Version-Release number of selected component (if applicable):
kernel-2.6.13-1.1555_FC5

How reproducible:
Sometimes

Steps to Reproduce:
1. ?

Additional info:
Comment 1 Tom London 2005-09-21 11:57:08 EDT
Got this again with the latest rawhide kernel, kernel-2.6.13-1.1565_FC5.
The trace looks the same as the one above, except that the reported process
is vmnet-natd (it was ntpd above).

Sep 21 08:39:26 localhost kernel: BUG: rwlock recursion on CPU#0, vmnet-natd/5081, f8b52bc0 (Tainted: P     )
Sep 21 08:39:26 localhost kernel:  [<c01d701c>] _raw_write_lock+0x56/0x58
Sep 21 08:39:26 localhost kernel:  [<f8b47f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
Sep 21 08:39:26 localhost kernel:  [<f8b47b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
Sep 21 08:39:26 localhost kernel:  [<f8b4941c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]
Sep 21 08:39:26 localhost kernel:  [<f8b4ac56>] udp_packet+0x23/0x81 [ip_conntrack]
Sep 21 08:39:26 localhost kernel:  [<f8b48a52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
Sep 21 08:39:26 localhost kernel:  [<c030c32d>] nf_iterate+0x60/0x84
Sep 21 08:39:26 localhost kernel:  [<c02db96c>] dst_output+0x0/0x18
Sep 21 08:39:26 localhost kernel:  [<c030c39e>] nf_hook_slow+0x4d/0xf2
Sep 21 08:39:26 localhost kernel:  [<c02db96c>] dst_output+0x0/0x18
Sep 21 08:39:26 localhost kernel:  [<c02de16f>] ip_push_pending_frames+0x2c8/0x445
Sep 21 08:39:26 localhost kernel:  [<c02db96c>] dst_output+0x0/0x18
Sep 21 08:39:26 localhost kernel:  [<c02f6dd8>] udp_push_pending_frames+0x131/0x26b
Sep 21 08:39:26 localhost kernel:  [<c01b7833>] avc_has_perm_noaudit+0x26/0xd1
Sep 21 08:39:26 localhost kernel:  [<c02f72f6>] udp_sendmsg+0x3a8/0x708
Sep 21 08:39:26 localhost kernel:  [<c01bb85b>] socket_has_perm+0x50/0x58
Sep 21 08:39:26 localhost kernel:  [<c02fdfee>] inet_sendmsg+0x2b/0x49
Sep 21 08:39:26 localhost kernel:  [<c02b500f>] sock_sendmsg+0xe4/0xff
Sep 21 08:39:26 localhost kernel:  [<c02d7593>] __ip_route_output_key+0xa9/0xc2
Sep 21 08:39:26 localhost kernel:  [<c012dd16>] autoremove_wake_function+0x0/0x37
Sep 21 08:39:26 localhost kernel:  [<c02f5180>] ip4_datagram_connect+0x210/0x340
Sep 21 08:39:26 localhost kernel:  [<c02b637b>] sys_sendto+0xf5/0x116
Sep 21 08:39:26 localhost kernel:  [<c031c8ce>] _read_unlock_irq+0x5/0x7
Sep 21 08:39:26 localhost kernel:  [<c013f1ad>] find_get_page+0x36/0x41
Sep 21 08:39:26 localhost kernel:  [<c01401dd>] filemap_nopage+0x2d4/0x384
Sep 21 08:39:26 localhost kernel:  [<c02fdc4b>] inet_dgram_connect+0x2b/0x57
Sep 21 08:39:26 localhost kernel:  [<c01463d5>] check_poison_obj+0x24/0x17e
Sep 21 08:39:26 localhost kernel:  [<c014788d>] cache_free_debugcheck+0x11f/0x1fe
Sep 21 08:39:26 localhost kernel:  [<c01461d4>] poison_obj+0x20/0x3d
Sep 21 08:39:26 localhost kernel:  [<c02b63d3>] sys_send+0x37/0x3b
Sep 21 08:39:26 localhost kernel:  [<c02b6c23>] sys_socketcall+0x163/0x292
Sep 21 08:39:26 localhost kernel:  [<c0102ecd>] syscall_call+0x7/0xb
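
Because module and kernel addresses differ between builds, traces like the two above are easiest to compare by symbol name alone. The following is a minimal sketch in C, assuming each frame has the form `[<addr>] symbol+0xOFF/0xLEN [module]` as seen in this report. The helper names `trace_symbol` and `same_frame` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the symbol name from one trace line into out.
 * Assumed line shape: "... [<ADDR>] symbol+0xOFF/0xLEN [module]".
 * Returns 1 on success, 0 if the line has no frame or out is too small.
 */
int trace_symbol(const char *line, char *out, size_t outlen)
{
    const char *p = strstr(line, "[<");
    size_t n;

    if (p == NULL)
        return 0;
    p = strstr(p, ">]");
    if (p == NULL)
        return 0;
    p += 2;
    p += strspn(p, " ");        /* skip padding after the address */
    n = strcspn(p, "+ \n");     /* symbol ends at '+', space or newline */
    if (n == 0 || n >= outlen)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}

/* Return 1 if both lines name the same symbol, ignoring addresses. */
int same_frame(const char *a, const char *b)
{
    char sa[128], sb[128];

    return trace_symbol(a, sa, sizeof sa) &&
           trace_symbol(b, sb, sizeof sb) &&
           strcmp(sa, sb) == 0;
}
```

Applied frame by frame, `same_frame` treats the `destroy_conntrack` entries from the ntpd and vmnet-natd traces as equal even though their addresses differ.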
Comment 2 Dave Jones 2005-09-21 22:43:10 EDT
Reassigning to davem, though this may be a problem in your vmnet modules that
you'll have to bring up with VMware.
Comment 3 David Miller 2005-09-21 23:26:57 EDT
No, this looks like a legitimate bug. I will report it to the netfilter
folks so that it gets fixed upstream.
Comment 4 Todd Mokros 2005-09-25 09:52:03 EDT
Here's an additional report, with a freshly booted kernel-2.6.13-1.1576_FC5.i686:

BUG: rwlock recursion on CPU#0, nscd/3528, e8b41bc0 (Not tainted)
 [<c01dc39c>] _raw_write_lock+0x56/0x58
 [<e8b36f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
 [<e8b36b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
 [<e8b3841c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]
 [<e8b39c56>] udp_packet+0x23/0x81 [ip_conntrack]
 [<e8b37a52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
 [<c0312ecd>] nf_iterate+0x60/0x84
 [<c02e250c>] dst_output+0x0/0x18
 [<c0312f3e>] nf_hook_slow+0x4d/0xf2
 [<c02e250c>] dst_output+0x0/0x18
 [<c02e4d0f>] ip_push_pending_frames+0x2c8/0x445
 [<c02e250c>] dst_output+0x0/0x18
 [<c02fd978>] udp_push_pending_frames+0x131/0x26b
 [<c02fde96>] udp_sendmsg+0x3a8/0x708
 [<c032422e>] do_page_fault+0x25e/0x640
 [<c0304b8e>] inet_sendmsg+0x2b/0x49
 [<c02bbbaf>] sock_sendmsg+0xe4/0xff
 [<c02de133>] __ip_route_output_key+0xa9/0xc2
 [<c0132df6>] autoremove_wake_function+0x0/0x37
 [<c02fbd20>] ip4_datagram_connect+0x210/0x340
 [<c02bcf1b>] sys_sendto+0xf5/0x116
 [<c03047eb>] inet_dgram_connect+0x2b/0x57
 [<c02bcc9b>] sys_connect+0x6e/0x9e
 [<c014b2f4>] poison_obj+0x20/0x3d
 [<c014b4f5>] check_poison_obj+0x24/0x17e
 [<c014b2f4>] poison_obj+0x20/0x3d
 [<c02bcf73>] sys_send+0x37/0x3b
 [<c02bd7c3>] sys_socketcall+0x163/0x292
 [<c0102ecd>] syscall_call+0x7/0xb
Comment 5 Tom London 2005-09-27 10:14:43 EDT
I still get this in 2.6.13-1.1578_FC5:

Sep 27 07:09:11 localhost kernel: BUG: rwlock recursion on CPU#0, ntpd/2359, f8be4bc0 (Tainted: P     )
Sep 27 07:09:11 localhost kernel:  [<c01dc39c>] _raw_write_lock+0x56/0x58
Sep 27 07:09:11 localhost kernel:  [<f8bd9f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
Sep 27 07:09:11 localhost kernel:  [<f8bd9b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
Sep 27 07:09:11 localhost kernel:  [<f8bdb41c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]
Sep 27 07:09:11 localhost kernel:  [<f8bdcc56>] udp_packet+0x23/0x81 [ip_conntrack]
Sep 27 07:09:11 localhost kernel:  [<f8bdaa52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
Sep 27 07:09:11 localhost kernel:  [<c0312ecd>] nf_iterate+0x60/0x84
Sep 27 07:09:11 localhost kernel:  [<c02e250c>] dst_output+0x0/0x18
Sep 27 07:09:11 localhost kernel:  [<c0312f3e>] nf_hook_slow+0x4d/0xf2
Sep 27 07:09:11 localhost kernel:  [<c02e250c>] dst_output+0x0/0x18
Sep 27 07:09:11 localhost kernel:  [<c02e4d0f>] ip_push_pending_frames+0x2c8/0x445
Sep 27 07:09:11 localhost kernel:  [<c02e250c>] dst_output+0x0/0x18
Sep 27 07:09:11 localhost kernel:  [<c02fd978>] udp_push_pending_frames+0x131/0x26b
Sep 27 07:09:11 localhost kernel:  [<c02fde96>] udp_sendmsg+0x3a8/0x708
Sep 27 07:09:11 localhost kernel:  [<c0304b8e>] inet_sendmsg+0x2b/0x49
Sep 27 07:09:11 localhost kernel:  [<c02bbbaf>] sock_sendmsg+0xe4/0xff
Sep 27 07:09:11 localhost kernel:  [<c0132df6>] autoremove_wake_function+0x0/0x37
Sep 27 07:09:11 localhost kernel:  [<c014b4f5>] check_poison_obj+0x24/0x17e
Sep 27 07:09:11 localhost kernel:  [<c0132df6>] autoremove_wake_function+0x0/0x37
Sep 27 07:09:11 localhost kernel:  [<c014ac7c>] dbg_redzone1+0xe/0x1f
Sep 27 07:09:11 localhost kernel:  [<c014ce49>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 27 07:09:11 localhost kernel:  [<c014d311>] __kmalloc+0x90/0xd9
Sep 27 07:09:11 localhost kernel:  [<c0142ced>] audit_sockaddr+0x39/0x78
Sep 27 07:09:11 localhost kernel:  [<c0142ced>] audit_sockaddr+0x39/0x78
Sep 27 07:09:11 localhost kernel:  [<c02bcf1b>] sys_sendto+0xf5/0x116
Sep 27 07:09:11 localhost kernel:  [<c0323451>] _spin_unlock_irq+0x5/0x7
Sep 27 07:09:11 localhost kernel:  [<c0102c47>] handle_signal+0xeb/0x136
Sep 27 07:09:11 localhost kernel:  [<c014b4f5>] check_poison_obj+0x24/0x17e
Sep 27 07:09:11 localhost kernel:  [<c014b2f4>] poison_obj+0x20/0x3d
Sep 27 07:09:11 localhost kernel:  [<c014ac7c>] dbg_redzone1+0xe/0x1f
Sep 27 07:09:11 localhost kernel:  [<c014ce49>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 27 07:09:11 localhost kernel:  [<c02bd819>] sys_socketcall+0x1b9/0x292
Sep 27 07:09:11 localhost kernel:  [<c0102ecd>] syscall_call+0x7/0xb
Comment 6 David Miller 2005-09-27 14:27:17 EDT
The current upstream 2.6.x Linus tree has the fix for this, so you'll get it
as soon as that tree is merged into the -devel kernel.
Comment 7 Dave Jones 2005-09-27 18:11:17 EDT
2.6.13-1.1578 is based on upstream 2.6.14-rc2-git6, which is current, so
something still isn't right.
Comment 8 David Miller 2005-09-27 18:33:44 EDT
The fix went in three nights ago, and the code no longer takes the write
lock on the event cache code paths, so the backtraces we're getting now
are impossible with the upstream tree.

I'll attach the patch to this bug report, so that it is clear exactly
what change should be fixing this bug.
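
The recursion these backtraces point at can be illustrated in user space. The sketch below is neither the netfilter code nor the attached patch. It is a single-threaded toy in C with hypothetical names (`dbg_rwlock`, `toy_put`, `toy_refresh_buggy`, `toy_refresh_fixed`), and it assumes the shape the traces suggest: a refresh path holds the write lock, drops the last reference to a cached entry, and the destroy path then tries to take the same lock again. The "fixed" variant shows one possible way to avoid that and may differ from what the upstream change does.

```c
#include <assert.h>
#include <stdio.h>

#define NO_OWNER (-1)

/* Toy write lock that records its owner so re-acquisition can be
 * detected. Loosely inspired by lock debugging; not kernel code. */
struct dbg_rwlock {
    int owner;       /* task id holding the write lock, or NO_OWNER */
    int recursions;  /* times an owner tried to take the lock again */
};

struct toy_conn {
    int refcount;
};

void dbg_rwlock_init(struct dbg_rwlock *l)
{
    l->owner = NO_OWNER;
    l->recursions = 0;
}

/* Returns 0 on success, -1 if the caller already holds the lock.
 * A real write lock would spin forever in the recursive case. */
int dbg_write_lock(struct dbg_rwlock *l, int task)
{
    if (l->owner == task) {
        l->recursions++;
        fprintf(stderr, "BUG: rwlock recursion, task %d\n", task);
        return -1;
    }
    assert(l->owner == NO_OWNER);  /* single-threaded toy: no contention */
    l->owner = task;
    return 0;
}

void dbg_write_unlock(struct dbg_rwlock *l, int task)
{
    assert(l->owner == task);
    l->owner = NO_OWNER;
}

/* Stand-in for a destroy path that takes the write lock itself. */
int toy_destroy(struct dbg_rwlock *l, int task)
{
    if (dbg_write_lock(l, task) != 0)
        return -1;
    /* ... unlink the entry from shared tables here ... */
    dbg_write_unlock(l, task);
    return 0;
}

/* Drop one reference; the last put triggers the destroy path. */
int toy_put(struct toy_conn *c, struct dbg_rwlock *l, int task)
{
    if (--c->refcount == 0)
        return toy_destroy(l, task);
    return 0;
}

/* Buggy shape: the put happens while the write lock is held. */
int toy_refresh_buggy(struct toy_conn *cached, struct dbg_rwlock *l, int task)
{
    int ret;

    dbg_write_lock(l, task);
    ret = toy_put(cached, l, task);  /* may re-enter the same lock */
    dbg_write_unlock(l, task);
    return ret;
}

/* One possible fix: finish the locked work first, then drop the reference. */
int toy_refresh_fixed(struct toy_conn *cached, struct dbg_rwlock *l, int task)
{
    dbg_write_lock(l, task);
    /* ... update timers and counters under the lock ... */
    dbg_write_unlock(l, task);
    return toy_put(cached, l, task);
}
```

With a `toy_conn` whose `refcount` is 1, `toy_refresh_buggy` returns -1 and increments `recursions`, while `toy_refresh_fixed` returns 0 and leaves the lock unowned. A real write lock would hang instead of returning an error, which is why a debug check that reports the recursion is useful.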
Comment 9 David Miller 2005-09-27 18:35:35 EDT
Created attachment 119333
Upstream patch for event cache locking bug

2.6.14 GIT upstream netfilter fix for this bug
Comment 10 Dave Jones 2005-09-28 05:31:34 EDT
A dumb thinko in the kernel spec file meant the -git patches weren't being
applied. The fix will be in tomorrow's rawhide.
