Bug 507535 - GPF in nf_conntrack_alloc() with KVM guests (2.6.31)
Summary: GPF in nf_conntrack_alloc() with KVM guests (2.6.31)
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F12VirtTarget 513460
Reported: 2009-06-23 08:39 UTC by Saikat Guha
Modified: 2009-08-11 17:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-08-11 17:36:34 UTC
Type: ---
Embargoed:



Description Saikat Guha 2009-06-23 08:39:45 UTC
I get a GPF when routing packets from the host to a second box and back into a guest through an iptables DNAT rule, on http://www.smolts.org/client/show/pub_fb7fc672-cdea-44da-9462-668f9497c120 running F11 + kernel-2.6.30-6.fc12.x86_64


general protection fault: 0000 [#1] SMP 
last sysfs file: /sys/module/nf_nat/initstate
CPU 1 
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat tun bridge stp llc sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm i2c_i801 pcspkr i2c_core iTCO_wdt iTCO_vendor_support shpchp tg3 ata_generic pata_acpi [last unloaded: nf_nat]
Pid: 1700, comm: dnsmasq Not tainted 2.6.30-6.fc12.x86_64 #1 System Product Name
RIP: 0010:[<ffffffff811198fa>]  [<ffffffff811198fa>] kmem_cache_alloc+0x9f/0x17b
RSP: 0018:ffff8801044b96a8  EFLAGS: 00010002
RAX: 000000000000002c RBX: 0000000000008020 RCX: 0000000000000064
RDX: 048b48650000441f RSI: ffffffff811198d8 RDI: ffffffff81087974
RBP: ffff8801044b96f8 R08: ffff88010d2843b8 R09: 0000000000000246
R10: 00000000a2b0263c R11: 000000007e9c8bad R12: ffff88010d2a4090
R13: 0000000000000158 R14: 0000000000000246 R15: ffffffff81425152
FS:  00007fd9ea44b6f0(0000) GS:ffff88002eddb000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1223d0e000 CR3: 0000000105d8b000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dnsmasq (pid: 1700, threadinfo ffff8801044b8000, task ffff8801044cc7c0)
Stack:
 ffffffff8104898e 0000000000000246 ffff8801044b9718 00000000a2b0263c
 0000000081400760 ffffffff825cdd30 ffff8801044b97c8 ffff8801044b9798
 0000000000000020 ffffffff8177f280 ffff8801044b9748 ffffffff81425152
Call Trace:
 [<ffffffff8104898e>] ? cpuacct_charge+0x30/0xbd
 [<ffffffff81425152>] nf_conntrack_alloc+0xd2/0x192
 [<ffffffff81425275>] ? nf_ct_invert_tuple+0x63/0x82
 [<ffffffff814256ba>] nf_conntrack_in+0x2d2/0x830
 [<ffffffff8143730c>] ? dst_output+0x0/0x39
 [<ffffffff81472fb8>] ipv4_conntrack_local+0x53/0x70
 [<ffffffff81421a79>] nf_iterate+0x5c/0xb3
 [<ffffffff8143730c>] ? dst_output+0x0/0x39
 [<ffffffff81421b76>] nf_hook_slow+0xa6/0x136
 [<ffffffff8143730c>] ? dst_output+0x0/0x39
 [<ffffffff8143899b>] nf_hook_thresh.clone.0+0x50/0x6d
 [<ffffffff81438cbd>] __ip_local_out+0x91/0xa7
 [<ffffffff81438cf8>] ip_local_out+0x25/0x4d
 [<ffffffff81438fe9>] ip_push_pending_frames+0x2c9/0x357
 [<ffffffff81439baa>] ? ip_append_data+0x664/0x9a8
 [<ffffffff814588f8>] udp_push_pending_frames+0x2db/0x34a
 [<ffffffff81459b9f>] udp_sendmsg+0x59b/0x6c1
 [<ffffffff814613d6>] inet_sendmsg+0x63/0x80
 [<ffffffff813ec5f1>] __sock_sendmsg+0x70/0x8f
 [<ffffffff813ecfbd>] sock_sendmsg+0xdb/0x108
 [<ffffffff813ece3f>] ? sock_recvmsg+0xde/0x10b
 [<ffffffff8107584b>] ? autoremove_wake_function+0x0/0x5f
 [<ffffffff810f8cd8>] ? might_fault+0x71/0xd9
 [<ffffffff810f8cd8>] ? might_fault+0x71/0xd9
 [<ffffffff813f80ec>] ? verify_iovec+0x60/0xb4
 [<ffffffff813ed20b>] sys_sendmsg+0x221/0x2a5
 [<ffffffff81126ed8>] ? __fput+0x1a3/0x1c6
 [<ffffffff8112f4a7>] ? path_put+0x31/0x4c
 [<ffffffff810b4c33>] ? audit_syscall_entry+0x12d/0x16d
 [<ffffffff814b96ec>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81013002>] system_call_fastpath+0x16/0x1b
Code: 1f 44 00 00 e8 7c e0 f6 ff 65 8b 04 25 88 cd 00 00 48 98 4d 8b 84 c4 28 11 00 00 49 8b 10 45 8b 68 18 48 85 d2 74 0d 41 8b 40 14 <48> 8b 04 c2 49 89 00 eb 13 83 ca ff 4c 89 f9 89 de 4c 89 e7 e8 
RIP  [<ffffffff811198fa>] kmem_cache_alloc+0x9f/0x17b
 RSP <ffff8801044b96a8>
---[ end trace b9fc7cf17ce2a67c ]---
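
Entries marked `?` in a kernel call trace are stack words that happen to look like return addresses; the unmarked entries are the frames the unwinder considers reliable. A minimal shell sketch for reducing a pasted trace to those frames (the function name `reliable_frames` is made up for illustration, and the sample is a few lines copied from the trace above):

```shell
# Print only the frames not marked '?' from call-trace lines read on stdin.
# Input lines look like: " [<ffffffff81425152>] nf_conntrack_alloc+0xd2/0x192"
reliable_frames() {
    sed -n 's/^ *\[<[0-9a-f]*>] \([^?].*\)$/\1/p'
}

# A few lines copied from the trace above.
reliable_frames <<'EOF'
 [<ffffffff8104898e>] ? cpuacct_charge+0x30/0xbd
 [<ffffffff81425152>] nf_conntrack_alloc+0xd2/0x192
 [<ffffffff81425275>] ? nf_ct_invert_tuple+0x63/0x82
 [<ffffffff814256ba>] nf_conntrack_in+0x2d2/0x830
EOF
```

Applied to the whole trace, this leaves the path from sys_sendmsg down through udp_sendmsg, ip_local_out and nf_conntrack_in to nf_conntrack_alloc.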

Comment 1 Saikat Guha 2009-06-23 09:40:33 UTC
Confirmed for kernel-2.6.31-0.24.rc0.git18.fc12.x86_64 as well.



general protection fault: 0000 [#1] SMP
last sysfs file: /sys/module/nf_nat/initstate
CPU 1
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat tun bridge stp llc sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 kvm_intel kvm iTCO_wdt iTCO_vendor_support i2c_i801 pcspkr i2c_core ]
Pid: 1980, comm: dnsmasq Not tainted 2.6.31-0.24.rc0.git18.fc12.x86_64 #1 System Product Name
RIP: 0010:[<ffffffff81130352>]  [<ffffffff81130352>] kmem_cache_alloc+0xb0/0x18a
RSP: 0018:ffff88010059b738  EFLAGS: 00010086
RAX: 0000000000000034 RBX: ffff8801093c8000 RCX: ffff88010059b728
RDX: 894c000000c0978b RSI: ffffffff81130330 RDI: ffffffff81092d18
RBP: ffff88010059b788 R08: ffff88010937e3b8 R09: 0000000077b2ae3e
R10: 00000000f1d5b6ba R11: 00000000b4fd5d9b R12: 0000000000008020
R13: 0000000000008020 R14: 0000000000000246 R15: 0000000000000198
FS:  00007fd0fea7d6f0(0000) GS:ffff88002f1de000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff31f478000 CR3: 0000000108cd2000 CR4: 00000000000026f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dnsmasq (pid: 1980, threadinfo ffff88010059a000, task ffff8801091f24a0)
Stack:
 0000000000000000 ffffffff81458d24 ffff88010059b848 0000000077b2ae3e
<0> ffff88010059b778 ffffffff8262eee0 ffff88010059b848 ffff88010059b818
<0> 0000000000000020 ffffffff817f45d0 ffff88010059b7c8 ffffffff81458d24
Call Trace:
 [<ffffffff81458d24>] ? nf_conntrack_alloc+0xd3/0x1ae
 [<ffffffff81458d24>] nf_conntrack_alloc+0xd3/0x1ae
 [<ffffffff814590d1>] nf_conntrack_in+0x2d2/0x871
 [<ffffffff8146b024>] ? dst_output+0x0/0x39
 [<ffffffff814a6cfe>] ipv4_conntrack_local+0x53/0x70
 [<ffffffff814552b1>] nf_iterate+0x5c/0xb3
 [<ffffffff8146b024>] ? dst_output+0x0/0x39
 [<ffffffff81455395>] nf_hook_slow+0x8d/0x109
 [<ffffffff8146b024>] ? dst_output+0x0/0x39
 [<ffffffff8146c6b9>] nf_hook_thresh.clone.0+0x50/0x6d
 [<ffffffff8142a195>] ? memcpy_fromiovecend+0x61/0xa2
 [<ffffffff8146c9e1>] __ip_local_out+0x91/0xa7
 [<ffffffff8146ca1c>] ip_local_out+0x25/0x4d
 [<ffffffff8146cd0d>] ip_push_pending_frames+0x2c9/0x357
 [<ffffffff8146d8ce>] ? ip_append_data+0x664/0x9b2
 [<ffffffff8148c6ec>] udp_push_pending_frames+0x2db/0x34a
 [<ffffffff8148d994>] udp_sendmsg+0x59b/0x6c1
 [<ffffffff81495117>] inet_sendmsg+0x63/0x80
 [<ffffffff8141e9b1>] __sock_sendmsg+0x70/0x8f
 [<ffffffff8148e04c>] ? udp_lib_get_port+0x272/0x29b
 [<ffffffff8141f37d>] sock_sendmsg+0xdb/0x108
 [<ffffffff8106a465>] ? _local_bh_enable_ip+0xe7/0x109
 [<ffffffff8107f56f>] ? autoremove_wake_function+0x0/0x5f
 [<ffffffff81093b66>] ? trace_hardirqs_on_caller+0x32/0x175
 [<ffffffff81422c52>] ? release_sock+0xf4/0x113
 [<ffffffff81093cc9>] ? trace_hardirqs_on+0x20/0x36
 [<ffffffff8106a465>] ? _local_bh_enable_ip+0xe7/0x109
 [<ffffffff81420189>] ? move_addr_to_kernel+0x5b/0x78
 [<ffffffff814202b6>] sys_sendto+0x110/0x152
 [<ffffffff81145d0a>] ? path_put+0x31/0x4c
 [<ffffffff810c12ca>] ? audit_syscall_entry+0x12d/0x16d
 [<ffffffff814edfde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
Code: 1f 44 00 00 e8 c8 29 f6 ff 65 8b 04 25 58 e3 00 00 48 98 4c 8b 84 c3 28 11 00 00 49 8b 10 45 8b 78 18 48 85 d2 74 0d 41 8b 40 14 <48> 8b 04 c2 49 89 00 eb 15 48 8b 4d b8 83 ca ff 44 89 e6 48 89 
RIP  [<ffffffff81130352>] kmem_cache_alloc+0xb0/0x18a
 RSP <ffff88010059b738>
---[ end trace e8de3b50dbc777bd ]---

Comment 2 Mark McLoughlin 2009-06-23 15:58:13 UTC
Ouch. Any ideas herbert?

Saikat: could you give us more information on how you're configuring the guests? How you're launching them, your bridge configuration, routing config etc.?

Comment 3 Herbert Xu 2009-06-23 16:26:14 UTC
Looks like memory corruption.  Does this happen under the upstream kernel too?
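
One observation consistent with the corruption theory (a reading of the register dumps, not something stated in the thread): the faulting instruction bytes `<48> 8b 04 c2` decode to `mov (%rdx,%rax,8),%rax`, and RDX holds 048b48650000441f in the first oops and 894c000000c0978b in the second. Neither is a canonical x86-64 address, and a non-canonical access raises a general protection fault rather than a page fault. A small shell sketch of that check, assuming 48-bit virtual addressing (the function name `is_canonical` is illustrative):

```shell
# Return 0 if a 16-hex-digit x86-64 virtual address is canonical,
# i.e. bits 63..47 are all zeros or all ones; return 1 otherwise.
# Assumption: 48-bit virtual addressing.
is_canonical() {
    addr=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case "$addr" in
        0000[0-7]???????????)    return 0 ;;  # lower half: bits 63..47 clear
        ffff[89a-f]???????????)  return 0 ;;  # upper half: bits 63..47 set
        *)                       return 1 ;;
    esac
}

# RSP from the first oops, then RDX from the first and second oops.
for reg in ffff8801044b96a8 048b48650000441f 894c000000c0978b; do
    if is_canonical "$reg"; then
        echo "$reg canonical"
    else
        echo "$reg non-canonical"
    fi
done
```

The stack pointer passes the check while both RDX values fail it, which fits a pointer having been overwritten with unrelated data before kmem_cache_alloc dereferenced it.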

Comment 4 Saikat Guha 2009-06-23 17:19:15 UTC
I haven't tried upstream kernels. kernel-2.6.31-0.24.rc0.git18.fc12.x86_64 was the latest on Koji. I experienced the problem both with the latest 2.6.30 and the latest 2.6.29 on koji.

As for config, I am doing everything with libvirt:

For guests, I am using qemu+kvm and --network=network:default. That, IIRC, brings up guests on a local private bridge, virbr0. libvirt automatically configures iptables on the host to NAT the guests to the external interface eth0. An additional DNAT rule forwards specific host ports (on eth0) to specific guest/guest-port pairs (on virbr0). Finally, dnsmasq, which provides DHCP and DNS on virbr0, is what triggers the GPF.

I can log into the guest (through the port forwarding or console) and use yum, which does DNS lookups, and nothing bad happens. The specific command below kills it deterministically.

H: Host
G1: Guest 1 (virbr0 address)
  G1P: Guest 1 port forward (H's eth0 address + some port)
G2: Guest 2
  G2P: Guest 2 port forward (H's eth0 address + some port) 
E: Another physical computer on eth0

From inside H (ssh'ed in from E):
# ssh -t E ssh -t G1P ssh -t G2 ssh -t H ...

So:
H -> E -> G1 -> G2 -> H
(originally meant as a stress test, but I can imagine getting into this situation IRL. Will look for a simpler test case, but this kills the host every time.)

On 2.6.29, it would OOPS on the E->G1 ssh.
On 2.6.30 and 2.6.31-rc, it gets to G2 but OOPSes in the DNS lookup for H.

I have memtested the host extensively, but haven't found any memory errors.

If there are specific configs you'd like me to attach, please let me know which.

Comment 5 Saikat Guha 2009-06-24 00:17:39 UTC
I have a much simpler command that deterministically crashes the host:

root@host# /etc/init.d/libvirtd restart
root@host# /etc/init.d/iptables restart
root@host# while true; do echo Moo | nc -w 0 -u 192.168.122.168 22; done

Where 192.168.122.168 is the guest IP on virbr0

Comment 6 Saikat Guha 2009-06-24 03:12:34 UTC
Maybe related to BUG in 
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg265131.html

In that bug:
> iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

I have a post routing rule

> PC is at kmem_cache_alloc+0x2c/0x54

RIP: ... kmem_cache_alloc+0xb0/0x18a

If it is the same bug, it may be related to netfilter rather than KVM.

Comment 7 Mark McLoughlin 2009-06-24 12:42:41 UTC
What NIC model are you using for the guest?

(In reply to comment #5)
> I have a much simpler command that deterministically crashes the host:
> 
> root@host# /etc/init.d/libvirtd restart
> root@host# /etc/init.d/iptables restart

At this point, the nat post-routing rule is gone, right?

> root@host# while true; do echo Moo | nc -w 0 -u 192.168.122.168 22; done

Hmm, I can't reproduce with this
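
One way to answer the question about whether the nat post-routing rule survives the restart is to inspect the nat table dump before and after running it. A minimal sketch, assuming the rule of interest is a POSTROUTING rule that jumps to MASQUERADE; the function name `has_masquerade_rule` is illustrative, and the sample dump (including the 192.168.122.0/24 subnet, inferred from the guest address 192.168.122.168) is hypothetical rather than captured from the reporter's host:

```shell
# Read `iptables-save -t nat` style output on stdin and succeed if any
# POSTROUTING rule jumps to MASQUERADE.
has_masquerade_rule() {
    grep -q -e '^-A POSTROUTING .*-j MASQUERADE'
}

# Hypothetical sample standing in for real iptables-save output.
sample='*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A POSTROUTING -s 192.168.122.0/24 -j MASQUERADE
COMMIT'

if printf '%s\n' "$sample" | has_masquerade_rule; then
    echo "MASQUERADE rule present"
else
    echo "MASQUERADE rule missing"
fi
```

On a live host the same function could be fed from `iptables-save -t nat` (as root), once after libvirtd starts and once after the iptables restart, to see whether the rule set differs between the two states.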

Comment 8 Saikat Guha 2009-06-24 12:52:24 UTC
I narrowed it down to the iptables restart.

If I leave iptables untouched after it starts up on boot, everything works perfectly. (with or without postrouting).

If ever I run /etc/init.d/iptables restart, the host crashes soon afterwards.

Comment 9 Mark McLoughlin 2009-08-11 15:06:02 UTC
Does this still happen with 2.6.31 kernels?

Comment 10 Saikat Guha 2009-08-11 16:19:31 UTC
Sorry, I no longer have access to the machine to test this issue any further.

Comment 11 Saikat Guha 2009-08-11 16:21:09 UTC
FWIW, the latest kernel I tried in Comment #1 (kernel-2.6.31-0.24.rc0.git18.fc12.x86_64) had the bug.

Comment 12 Mark McLoughlin 2009-08-11 17:36:34 UTC
Realistically, if no-one can reproduce this, it's not going to get fixed

Closing as INSUFFICIENT_DATA, but if anyone else can reproduce, please do re-open

Many thanks for the report, though

