From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b3) Gecko/20050827 Fedora/1.1-0.2.8.deerpark.alpha2 Firefox/1.0+

Description of problem:
Sep 15 13:01:26 localhost kernel: BUG: rwlock recursion on CPU#0, ntpd/2120, f8adbbe0 (Tainted: P )
Sep 15 13:01:26 localhost kernel: [<c01d6bbc>] _raw_write_lock+0x56/0x58
Sep 15 13:01:26 localhost kernel: [<f8ad0f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
Sep 15 13:01:26 localhost kernel: [<f8ad0b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
Sep 15 13:01:26 localhost kernel: [<f8ad2418>] ip_ct_refresh_acct+0xf3/0x104 [ip_conntrack]
Sep 15 13:01:26 localhost kernel: [<f8ad3c56>] udp_packet+0x23/0x81 [ip_conntrack]
Sep 15 13:01:26 localhost kernel: [<f8ad1a52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
Sep 15 13:01:26 localhost kernel: [<c030bded>] nf_iterate+0x60/0x84
Sep 15 13:01:26 localhost kernel: [<c02db4bc>] dst_output+0x0/0x18
Sep 15 13:01:26 localhost kernel: [<c030be5e>] nf_hook_slow+0x4d/0xf2
Sep 15 13:01:26 localhost kernel: [<c02db4bc>] dst_output+0x0/0x18
Sep 15 13:01:26 localhost kernel: [<c02ddcbf>] ip_push_pending_frames+0x2c8/0x445
Sep 15 13:01:26 localhost kernel: [<c02db4bc>] dst_output+0x0/0x18
Sep 15 13:01:26 localhost kernel: [<c02f68a8>] udp_push_pending_frames+0x131/0x26b
Sep 15 13:01:26 localhost kernel: [<c02f6dc6>] udp_sendmsg+0x3a8/0x708
Sep 15 13:01:26 localhost kernel: [<c02fdabe>] inet_sendmsg+0x2b/0x49
Sep 15 13:01:26 localhost kernel: [<c02b4b5f>] sock_sendmsg+0xe4/0xff
Sep 15 13:01:26 localhost kernel: [<c01045b4>] do_IRQ+0x51/0x82
Sep 15 13:01:26 localhost kernel: [<c01463e5>] check_poison_obj+0x24/0x17e
Sep 15 13:01:26 localhost kernel: [<c012dd26>] autoremove_wake_function+0x0/0x37
Sep 15 13:01:26 localhost kernel: [<c0145b6c>] dbg_redzone1+0xe/0x1f
Sep 15 13:01:26 localhost kernel: [<c0147d39>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 15 13:01:26 localhost kernel: [<c0148201>] __kmalloc+0x90/0xd9
Sep 15 13:01:26 localhost kernel: [<c013dc1d>] audit_sockaddr+0x39/0x78
Sep 15 13:01:26 localhost kernel: [<c013dc1d>] audit_sockaddr+0x39/0x78
Sep 15 13:01:26 localhost kernel: [<c02b5ecb>] sys_sendto+0xf5/0x116
Sep 15 13:01:26 localhost kernel: [<c031c371>] _spin_unlock_irq+0x5/0x7
Sep 15 13:01:26 localhost kernel: [<c0102c47>] handle_signal+0xeb/0x136
Sep 15 13:01:26 localhost kernel: [<c01463e5>] check_poison_obj+0x24/0x17e
Sep 15 13:01:26 localhost kernel: [<c01461e4>] poison_obj+0x20/0x3d
Sep 15 13:01:26 localhost kernel: [<c0145b6c>] dbg_redzone1+0xe/0x1f
Sep 15 13:01:26 localhost kernel: [<c0145b6c>] dbg_redzone1+0xe/0x1f
Sep 15 13:01:26 localhost kernel: [<c0147d39>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 15 13:01:26 localhost kernel: [<c02b67c9>] sys_socketcall+0x1b9/0x292
Sep 15 13:01:26 localhost kernel: [<c0102ecd>] syscall_call+0x7/0xb

Version-Release number of selected component (if applicable):
kernel-2.6.13-1.1555_FC5

How reproducible:
Sometimes

Steps to Reproduce:
1. ?
2.
3.

Additional info:
Got this with the latest rawhide kernel: kernel-2.6.13-1.1565_FC5. Looks the same as the one above, except the process reported is vmnet-natd (was ntpd in the above).

Sep 21 08:39:26 localhost kernel: BUG: rwlock recursion on CPU#0, vmnet-natd/5081, f8b52bc0 (Tainted: P )
Sep 21 08:39:26 localhost kernel: [<c01d701c>] _raw_write_lock+0x56/0x58
Sep 21 08:39:26 localhost kernel: [<f8b47f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
Sep 21 08:39:26 localhost kernel: [<f8b47b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
Sep 21 08:39:26 localhost kernel: [<f8b4941c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]
Sep 21 08:39:26 localhost kernel: [<f8b4ac56>] udp_packet+0x23/0x81 [ip_conntrack]
Sep 21 08:39:26 localhost kernel: [<f8b48a52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
Sep 21 08:39:26 localhost kernel: [<c030c32d>] nf_iterate+0x60/0x84
Sep 21 08:39:26 localhost kernel: [<c02db96c>] dst_output+0x0/0x18
Sep 21 08:39:26 localhost kernel: [<c030c39e>] nf_hook_slow+0x4d/0xf2
Sep 21 08:39:26 localhost kernel: [<c02db96c>] dst_output+0x0/0x18
Sep 21 08:39:26 localhost kernel: [<c02de16f>] ip_push_pending_frames+0x2c8/0x445
Sep 21 08:39:26 localhost kernel: [<c02db96c>] dst_output+0x0/0x18
Sep 21 08:39:26 localhost kernel: [<c02f6dd8>] udp_push_pending_frames+0x131/0x26b
Sep 21 08:39:26 localhost kernel: [<c01b7833>] avc_has_perm_noaudit+0x26/0xd1
Sep 21 08:39:26 localhost kernel: [<c02f72f6>] udp_sendmsg+0x3a8/0x708
Sep 21 08:39:26 localhost kernel: [<c01bb85b>] socket_has_perm+0x50/0x58
Sep 21 08:39:26 localhost kernel: [<c02fdfee>] inet_sendmsg+0x2b/0x49
Sep 21 08:39:26 localhost kernel: [<c02b500f>] sock_sendmsg+0xe4/0xff
Sep 21 08:39:26 localhost kernel: [<c02d7593>] __ip_route_output_key+0xa9/0xc2
Sep 21 08:39:26 localhost kernel: [<c012dd16>] autoremove_wake_function+0x0/0x37
Sep 21 08:39:26 localhost kernel: [<c02f5180>] ip4_datagram_connect+0x210/0x340
Sep 21 08:39:26 localhost kernel: [<c02b637b>] sys_sendto+0xf5/0x116
Sep 21 08:39:26 localhost kernel: [<c031c8ce>] _read_unlock_irq+0x5/0x7
Sep 21 08:39:26 localhost kernel: [<c013f1ad>] find_get_page+0x36/0x41
Sep 21 08:39:26 localhost kernel: [<c01401dd>] filemap_nopage+0x2d4/0x384
Sep 21 08:39:26 localhost kernel: [<c02fdc4b>] inet_dgram_connect+0x2b/0x57
Sep 21 08:39:26 localhost kernel: [<c01463d5>] check_poison_obj+0x24/0x17e
Sep 21 08:39:26 localhost kernel: [<c014788d>] cache_free_debugcheck+0x11f/0x1fe
Sep 21 08:39:26 localhost kernel: [<c01461d4>] poison_obj+0x20/0x3d
Sep 21 08:39:26 localhost kernel: [<c02b63d3>] sys_send+0x37/0x3b
Sep 21 08:39:26 localhost kernel: [<c02b6c23>] sys_socketcall+0x163/0x292
Sep 21 08:39:26 localhost kernel: [<c0102ecd>] syscall_call+0x7/0xb
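To check the "looks the same" claim mechanically, a small script can pull the symbol names out of two traces and compare the top frames. This is only a sketch: it assumes frames have the `[<address>] symbol+0xoffset/0xsize [module]` shape seen in the logs above, and the helper names `frames` and `same_top` are made up for illustration. Offsets are ignored on purpose, since they shift between kernel builds (for example `ip_ct_refresh_acct+0xf3/0x104` versus `+0xf7/0x108`).

```python
import re

# One stack frame as printed in the logs above:
#   [<f8ad0f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
# The trailing "[module]" part is optional (absent for built-in symbols).
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+"          # return address
    r"([A-Za-z_]\w*)"              # symbol name
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"   # offset/size, ignored when comparing
    r"(?:\s+\[(\w+)\])?"           # optional module name
)


def frames(text):
    """Return (symbol, module_or_None) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in FRAME_RE.finditer(text)]


def same_top(trace_a, trace_b, depth=6):
    """True if the first `depth` frames name the same symbols and modules."""
    return frames(trace_a)[:depth] == frames(trace_b)[:depth]


if __name__ == "__main__":
    first = (
        "kernel: [<c01d6bbc>] _raw_write_lock+0x56/0x58 "
        "kernel: [<f8ad0f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack] "
        "kernel: [<f8ad2418>] ip_ct_refresh_acct+0xf3/0x104 [ip_conntrack]"
    )
    second = (
        "[<c01d701c>] _raw_write_lock+0x56/0x58 "
        "[<f8b47f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack] "
        "[<f8b4941c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]"
    )
    print(frames(first))
    print(same_top(first, second))
```

Because it uses `finditer` over the whole text, it works whether the log has one frame per line or has been pasted as a single run.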
Reassigning to davem, though this may be a problem in your vmnet modules, which you would have to bring up with VMware.
No, this looks like a legitimate bug. I will report it to the netfilter folks so that it gets fixed upstream.
Here's an additional report, with a freshly booted kernel-2.6.13-1.1576_FC5.i686:

BUG: rwlock recursion on CPU#0, nscd/3528, e8b41bc0 (Not tainted)
[<c01dc39c>] _raw_write_lock+0x56/0x58
[<e8b36f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
[<e8b36b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
[<e8b3841c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]
[<e8b39c56>] udp_packet+0x23/0x81 [ip_conntrack]
[<e8b37a52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
[<c0312ecd>] nf_iterate+0x60/0x84
[<c02e250c>] dst_output+0x0/0x18
[<c0312f3e>] nf_hook_slow+0x4d/0xf2
[<c02e250c>] dst_output+0x0/0x18
[<c02e4d0f>] ip_push_pending_frames+0x2c8/0x445
[<c02e250c>] dst_output+0x0/0x18
[<c02fd978>] udp_push_pending_frames+0x131/0x26b
[<c02fde96>] udp_sendmsg+0x3a8/0x708
[<c032422e>] do_page_fault+0x25e/0x640
[<c0304b8e>] inet_sendmsg+0x2b/0x49
[<c02bbbaf>] sock_sendmsg+0xe4/0xff
[<c02de133>] __ip_route_output_key+0xa9/0xc2
[<c0132df6>] autoremove_wake_function+0x0/0x37
[<c02fbd20>] ip4_datagram_connect+0x210/0x340
[<c02bcf1b>] sys_sendto+0xf5/0x116
[<c03047eb>] inet_dgram_connect+0x2b/0x57
[<c02bcc9b>] sys_connect+0x6e/0x9e
[<c014b2f4>] poison_obj+0x20/0x3d
[<c014b4f5>] check_poison_obj+0x24/0x17e
[<c014b2f4>] poison_obj+0x20/0x3d
[<c02bcf73>] sys_send+0x37/0x3b
[<c02bd7c3>] sys_socketcall+0x163/0x292
[<c0102ecd>] syscall_call+0x7/0xb
Still get this in 2.6.13-1.1578_FC5:

Sep 27 07:09:11 localhost kernel: BUG: rwlock recursion on CPU#0, ntpd/2359, f8be4bc0 (Tainted: P )
Sep 27 07:09:11 localhost kernel: [<c01dc39c>] _raw_write_lock+0x56/0x58
Sep 27 07:09:11 localhost kernel: [<f8bd9f64>] destroy_conntrack+0x4e/0xfa [ip_conntrack]
Sep 27 07:09:11 localhost kernel: [<f8bd9b4b>] __ip_ct_event_cache_init+0x66/0x72 [ip_conntrack]
Sep 27 07:09:11 localhost kernel: [<f8bdb41c>] ip_ct_refresh_acct+0xf7/0x108 [ip_conntrack]
Sep 27 07:09:11 localhost kernel: [<f8bdcc56>] udp_packet+0x23/0x81 [ip_conntrack]
Sep 27 07:09:11 localhost kernel: [<f8bdaa52>] ip_conntrack_in+0x158/0x260 [ip_conntrack]
Sep 27 07:09:11 localhost kernel: [<c0312ecd>] nf_iterate+0x60/0x84
Sep 27 07:09:11 localhost kernel: [<c02e250c>] dst_output+0x0/0x18
Sep 27 07:09:11 localhost kernel: [<c0312f3e>] nf_hook_slow+0x4d/0xf2
Sep 27 07:09:11 localhost kernel: [<c02e250c>] dst_output+0x0/0x18
Sep 27 07:09:11 localhost kernel: [<c02e4d0f>] ip_push_pending_frames+0x2c8/0x445
Sep 27 07:09:11 localhost kernel: [<c02e250c>] dst_output+0x0/0x18
Sep 27 07:09:11 localhost kernel: [<c02fd978>] udp_push_pending_frames+0x131/0x26b
Sep 27 07:09:11 localhost kernel: [<c02fde96>] udp_sendmsg+0x3a8/0x708
Sep 27 07:09:11 localhost kernel: [<c0304b8e>] inet_sendmsg+0x2b/0x49
Sep 27 07:09:11 localhost kernel: [<c02bbbaf>] sock_sendmsg+0xe4/0xff
Sep 27 07:09:11 localhost kernel: [<c0132df6>] autoremove_wake_function+0x0/0x37
Sep 27 07:09:11 localhost kernel: [<c014b4f5>] check_poison_obj+0x24/0x17e
Sep 27 07:09:11 localhost kernel: [<c0132df6>] autoremove_wake_function+0x0/0x37
Sep 27 07:09:11 localhost kernel: [<c014ac7c>] dbg_redzone1+0xe/0x1f
Sep 27 07:09:11 localhost kernel: [<c014ce49>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 27 07:09:11 localhost kernel: [<c014d311>] __kmalloc+0x90/0xd9
Sep 27 07:09:11 localhost kernel: [<c0142ced>] audit_sockaddr+0x39/0x78
Sep 27 07:09:11 localhost kernel: [<c0142ced>] audit_sockaddr+0x39/0x78
Sep 27 07:09:11 localhost kernel: [<c02bcf1b>] sys_sendto+0xf5/0x116
Sep 27 07:09:11 localhost kernel: [<c0323451>] _spin_unlock_irq+0x5/0x7
Sep 27 07:09:11 localhost kernel: [<c0102c47>] handle_signal+0xeb/0x136
Sep 27 07:09:11 localhost kernel: [<c014b4f5>] check_poison_obj+0x24/0x17e
Sep 27 07:09:11 localhost kernel: [<c014b2f4>] poison_obj+0x20/0x3d
Sep 27 07:09:11 localhost kernel: [<c014ac7c>] dbg_redzone1+0xe/0x1f
Sep 27 07:09:11 localhost kernel: [<c014ce49>] cache_alloc_debugcheck_after+0x2e/0x11a
Sep 27 07:09:11 localhost kernel: [<c02bd819>] sys_socketcall+0x1b9/0x292
Sep 27 07:09:11 localhost kernel: [<c0102ecd>] syscall_call+0x7/0xb
Current upstream 2.6.x Linus tree has the fix for this, so whenever that gets into the -devel kernel you'll get the fix.
2.6.13-1.1578 is based on upstream 2.6.14-rc2-git6, which is current, so something still isn't right.
The fix went in three nights ago. The code no longer takes the write lock on the event cache code paths, so the backtraces we're getting now are impossible with the upstream tree.

I'll attach the patch to this bug report, so that it is clear exactly what change should be fixing this bug.
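For readers unfamiliar with what "rwlock recursion" means in the traces, here is a toy single-threaded model of it. It is not the kernel code and not the upstream patch: the class and function names are hypothetical, and it assumes (as the backtraces suggest) that the refresh path already holds the write lock when the event cache path ends up in the destroy path, which takes the same lock again. The "fixed" variant simply shows one way to keep the event cache step outside the write lock.

```python
class RecursionBug(Exception):
    """Raised when the current owner tries to take the write lock again."""


class DebugRWLock:
    """Toy non-recursive write lock with an owner check (illustration only)."""

    def __init__(self):
        self.owner = None

    def write_lock(self, who):
        if self.owner == who:
            # Roughly what a lock-debugging check would complain about.
            raise RecursionBug(f"rwlock recursion, owner={who}")
        assert self.owner is None, "toy model is single-threaded"
        self.owner = who

    def write_unlock(self, who):
        assert self.owner == who
        self.owner = None


def destroy(lock, who):
    """Hypothetical teardown path: needs the write lock to unlink an entry."""
    lock.write_lock(who)
    try:
        pass  # unlink the entry here
    finally:
        lock.write_unlock(who)


def cache_event(lock, who, drops_last_ref):
    """Hypothetical event cache step: may drop the last reference to an entry."""
    if drops_last_ref:
        destroy(lock, who)


def refresh_buggy(lock, who):
    """Calls the event cache step while still holding the write lock."""
    lock.write_lock(who)
    try:
        cache_event(lock, who, drops_last_ref=True)  # re-enters write_lock
    finally:
        lock.write_unlock(who)


def refresh_fixed(lock, who):
    """Runs the event cache step only after the write lock is released."""
    lock.write_lock(who)
    try:
        pass  # update timeout and counters here
    finally:
        lock.write_unlock(who)
    cache_event(lock, who, drops_last_ref=True)


if __name__ == "__main__":
    lock = DebugRWLock()
    try:
        refresh_buggy(lock, "CPU#0")
    except RecursionBug as exc:
        print("buggy path:", exc)
    refresh_fixed(lock, "CPU#0")
    print("fixed path completed")
```

On real hardware without lock debugging, the second `write_lock` would spin forever instead of raising, which is why the debug check matters.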
Created attachment 119333
Upstream patch for event cache locking bug

2.6.14 GIT upstream netfilter fix for this bug
Dumb thinko in the kernel spec file: it wasn't applying the -git patches. The fix will be in tomorrow's rawhide.