Red Hat Bugzilla – Full Text Bug Listing
|Summary:||Masqueraded tcp connections from guest get stuck after syn/ack - checksum problem?|
|Product:||[Fedora] Fedora||Reporter:||Robin Green <greenrd>|
|Component:||xen||Assignee:||Herbert Xu <herbert.xu>|
|Status:||CLOSED DUPLICATE||QA Contact:||Brian Brock <bbrock>|
|Version:||5||CC:||ehabkost, katzj, mcepl, xen-maint|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2007-04-24 18:50:41 EDT||Type:||---|
Description Robin Green 2006-03-21 20:11:11 EST
Description of problem:
Masqueraded TCP connections from a Xen guest get stuck after the first packet following the SYN/ACK. This did not occur on an FC4 host with stock Xen and the Xen kernel from XenSource. It does occur with Fedora xen and the XenSource kernel-xenU. Unfortunately, I can no longer easily retest with the stock Xen and stock Xen kernels, because the stock xen0 kernel is incompatible with udev in FC5.

ethereal on the host shows that the SYN/ACK has a correct checksum but the next packet does not, whether it is seen coming in on vif or going out on eth0. Networking between guest and host works fine. ethereal on the host shows that some of the checksums are bogus there too, but I suspect nothing is checking them, perhaps because it is a virtual interface. I'm aware that ethereal and tcpdump can report bogus checksum errors for outgoing packets, but this affects incoming packets as well.

Version-Release number of selected component (if applicable):
xen-3.0.1-4

How reproducible:
Always

Steps to Reproduce:
1. Create a Xen guest that uses NAT networking.
2. Boot the guest and start networking.
3. In the host, run ethereal and filter on port 80.
4. From the guest, run links http://www.google.ie/

Actual results:
Browser hangs forever waiting for a reply.

Expected results:
Quick reply.
Comment 1 Robin Green 2006-03-21 20:13:03 EST
Oh, I should have made clear - this occurs whether you use the xenU kernel from xensource or the Fedora xenU kernel.
Comment 2 Robin Green 2006-03-21 21:35:28 EST
I connected to another machine here on campus (secure.ucd.ie), from the xen guest, and took a tcpdump -v on secure.ucd.ie. The packets are getting to the server, but after the syn/ack handshaking, the server isn't replying.

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
01:50:58.313460 IP (tos 0x0, ttl 63, id 7455, offset 0, flags [DF], proto 6, length: 60) greenrd.ucd.ie.57464 > secure.ucd.ie.http: S [tcp sum ok] 3318021448:3318021448(0) win 5840 <mss 1460,sackOK,timestamp 4294959191 0,nop,wscale 2>
01:50:58.316452 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 6, length: 60) secure.ucd.ie.http > greenrd.ucd.ie.57464: S [tcp sum ok] 3457191582:3457191582(0) ack 3318021449 win 5792 <mss 1460,sackOK,timestamp 19629476 4294959191,nop,wscale 2>
01:50:58.313689 IP (tos 0x0, ttl 63, id 7456, offset 0, flags [DF], proto 6, length: 52) greenrd.ucd.ie.57464 > secure.ucd.ie.http: . [tcp sum ok] ack 1 win 1460 <nop,nop,timestamp 4294959192 19629476>
01:50:58.314112 IP (tos 0x0, ttl 63, id 7457, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959193 19629476>
01:50:58.518081 IP (tos 0x0, ttl 63, id 7458, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959244 19629476>
01:50:58.926064 IP (tos 0x0, ttl 63, id 7459, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959346 19629476>
01:50:59.742032 IP (tos 0x0, ttl 63, id 7460, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959550 19629476>
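In the trace above, the same 175-byte segment (1:176) appears four times, which is what a guest retransmitting an unacknowledged request would look like. The following is only a rough sketch for pulling the gaps between those repeats out of tcpdump text; the helper name retransmit_gaps and the regular expression are my own and assume the old-style tcpdump -v line format shown in this comment.

```python
import re
from collections import defaultdict

# The four data-bearing lines copied from the trace in this comment.
TRACE = [
    "01:50:58.314112 IP (tos 0x0, ttl 63, id 7457, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959193 19629476>",
    "01:50:58.518081 IP (tos 0x0, ttl 63, id 7458, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959244 19629476>",
    "01:50:58.926064 IP (tos 0x0, ttl 63, id 7459, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959346 19629476>",
    "01:50:59.742032 IP (tos 0x0, ttl 63, id 7460, offset 0, flags [DF], proto 6, length: 227) greenrd.ucd.ie.57464 > secure.ucd.ie.http: P 1:176(175) ack 1 win 1460 <nop,nop,timestamp 4294959550 19629476>",
]

# Assumed line shape: "HH:MM:SS.ffffff ... : P first:last(len) ..."
PUSH_RE = re.compile(r"^(\d\d):(\d\d):(\d\d\.\d+) .*: P (\d+:\d+)\(\d+\)")


def retransmit_gaps(lines):
    """Map each repeated sequence range to the gaps (seconds) between sightings."""
    seen = defaultdict(list)
    for line in lines:
        match = PUSH_RE.match(line)
        if match:
            hours, minutes, seconds, seq_range = match.groups()
            seen[seq_range].append(int(hours) * 3600 + int(minutes) * 60 + float(seconds))
    return {
        seq_range: [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]
        for seq_range, times in seen.items()
        if len(times) > 1
    }


if __name__ == "__main__":
    print(retransmit_gaps(TRACE))
```

For this trace the gaps come out at roughly 0.2, 0.4 and 0.8 seconds, doubling each time, which is consistent with TCP retransmission backoff on the guest side while the server stays silent.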
Comment 3 Robin Green 2006-03-22 11:52:26 EST
The bad checksum is indeed generated in the guest; masquerading on the host then modifies it but does not correct it. Workaround: run modprobe iptable_nat in the _guest_. I found this workaround at http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=495, which may or may not be the same bug. However, it looks like recent patches in Xen CVS may fix problems related to checksumming.
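To illustrate why a checksum can be "modified but not corrected", here is a small sketch of the TCP checksum and of the incremental update a NAT can apply when it rewrites the source address (RFC 1624, equation 3). It is not taken from the Xen or netfilter code; the function names and the addresses (10.0.0.2, 192.0.2.10, 198.51.100.80) are made up for the example. The point it shows is that an incremental update only carries the old checksum forward: if the guest's checksum was right, the result is right, and if it was wrong, the result is still wrong.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Full TCP checksum; segment must have its checksum field set to zero."""
    pseudo_header = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return ~ones_complement_sum(pseudo_header + segment) & 0xFFFF


def nat_incremental_update(old_csum: int, old_ip: str, new_ip: str) -> int:
    """RFC 1624 eqn. 3, HC' = ~(~HC + ~m + m'), applied to both address words."""
    total = ~old_csum & 0xFFFF
    old_words = struct.unpack("!HH", socket.inet_aton(old_ip))
    new_words = struct.unpack("!HH", socket.inet_aton(new_ip))
    for old_word, new_word in zip(old_words, new_words):
        total += (~old_word & 0xFFFF) + new_word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    # Hypothetical addresses: guest, masquerading host, remote server.
    guest, host, server = "10.0.0.2", "192.0.2.10", "198.51.100.80"
    # Minimal 20-byte TCP header (checksum field zero) plus a short payload.
    header = struct.pack("!HHIIBBHHH", 57464, 80, 1, 1, 5 << 4, 0x18, 1460, 0, 0)
    segment = header + b"GET / HTTP/1.0\r\n\r\n"

    good = tcp_checksum(guest, server, segment)
    expected = tcp_checksum(host, server, segment)
    print("correct in guest  ->", nat_incremental_update(good, guest, host) == expected)

    bad = good ^ 0x1234  # stand-in for whatever wrong value the guest emitted
    print("wrong in guest    ->", nat_incremental_update(bad, guest, host) == expected)
```

Whether the host in this bug actually uses an incremental update at this point is an assumption on my part; the sketch only shows that such an update cannot repair a checksum that was already bad when it left the guest.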
Comment 4 Herbert Straub 2007-01-21 17:57:57 EST
I can confirm this bug on Fedora Core 6 with kernel-xen-2.6.19-1.2895.fc6. I can work around it with ethtool -K eth0 tx off in the Xen guest domain. I found these references on the net:
http://wiki.xensource.com/xenwiki/XenFaq#head-4ce9767df34fe1c9cf4f85f7e07cb10110eae9b7 (section 3.5, "TCP and UDP checksum errors, ping but nothing else, ipsec tunnels don't form, DNAT translation doesn't work")
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=447
http://www.redhat.com/archives/fedora-xen/2006-June/msg00020.html
Comment 5 Matěj Cepl 2007-04-24 08:42:42 EDT
Fully reproducible with kernel-xen-2.6.20-1.2944.fc6.x86_64 and RHEL4 as a guest.