Bug 189112
Summary: | XenU guest kernel reports "Received packet is 10 bytes before head." | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | lannet |
Component: | xen | Assignee: | Herbert Xu <herbert.xu> |
Status: | CLOSED WONTFIX | QA Contact: | Martin Jenner <mjenner> |
Severity: | high | Priority: | medium |
Version: | 5 | CC: | bstein, herbert.xu, hps, katzj, russell, xen-maint |
Hardware: | i386 | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2008-02-26 23:04:35 UTC |
Description
lannet
2006-04-16 12:18:02 UTC
I have a similar setup: I have 2 DSL connections (Both PPPoE connections), one that is on a separate router box (running FC4) that I connect to over ethernet (eth0) and the other that is on xen0.

Kernel: 2.6.16-1.2080_FC5xen0

From the 2.6.16-1.2080_FC5xenU:
- If I route via the separate router box over the ethernet port, I don't get the error
- If I route via the PPPoE connection I get the error.

Connection to the network also seems extremely slow, likely as the packets causing the error are being tossed. I am assuming the issue is specific to bridging code when relaying packets coming from the PPP connection.

eth0: RealTek RTL-8029 found at 0xcc00, IRQ 17, 00:60:67:4E:01:46.
eth1: VIA Rhine II at 0xfebff400, 00:15:f2:6f:37:7e, IRQ 18.

I just tested with the latest updates-released versions of the xen0 and xenU, and can confirm I see the same problem.

2.6.16-1.2096_FC5xen0 #1 SMP Wed Apr 19 05:49:52 EDT 2006 i686 athlon i386 GNU/Linux
2.6.16-1.2096_FC5xenU #1 SMP Wed Apr 19 06:07:11 EDT 2006 i686 athlon i386 GNU/Linux

...
Received packet is 10 bytes before head.
printk: 17 messages suppressed.
Received packet is 10 bytes before head.

I also changed eth1 cards (eth0 is from the ASUS A8V-MX motherboard) to see if that would make a difference. I expect this is a problem unique to the combination of PPPoE and VIF drivers.

eth1: RealTek RTL8139 at 0xf486ec00, 00:50:ba:50:62:53, IRQ 17
eth1: Identified 8139 chip type 'RTL-8139C'

Is there any estimate of when this bug will be looked at?

There is active work in this area now, so the bz is queued. Can you confirm this is the case with the later xen and kernel-xen packages for FC5 (or rawhide)?

Yes this is a bug in the way Xen tries to avoid crossing a page boundary when going from dom0 to domU. As part of my work in adding scatter and gather support this problem should go away.

I think, this problem is related to iptables / firewall.
I have the following setup:

+---------------------------------+
| Host box                        |
|                                 |
|                  ppp0           | - Internet Uplink
| Unused Bridge -- eth1           | - private subnet (192.168.2.0/24)
|                  eth3           | - Internet Subnet (a.b.c.d/28)
|                   |             |
|   +---------+     |             |   eth3 has an IP address from the
|   | Virtual |   Bridge          |   a.b.c.d/28 range
|   | System  |     |             |
|   |         |     |             |
|   | eth0 ---------+             |   eth0 has another IP address from the
|   +---------+                   |   a.b.c.d/28 range, Default route for virtual
|                                 |   points to the address on host/eth3
+---------------------------------+

              Host/eth1   Host/eth3   Host/ppp0   Virtual/eth0
Host/eth1     -           No NAT      NAT         No NAT
Host/eth3     No NAT      -           No NAT      No NAT
Host/ppp0     NAT         No NAT      -           No NAT
Virtual/eth0  No NAT      No NAT      No NAT      No NAT

The virtual host must route all its traffic through the host system (which should act as firewall for the virtual host). So its default route points to the IP address on Host/eth3 (connected to the internal bridge).

When I have no iptables rules loaded in the Host system, no error messages are logged. As soon as I load my firewall rules (and also the NAT rules for the eth1 interface, which does have a bridge connected but this is not used in the virtual host), the logging starts:

Received packet is 10 bytes before head.
Received packet is 10 bytes before head.
[ad infinitum]

This happens _only_ for traffic from the virtual system to the internet (which ironically enough isn't currently affected by any firewall rule). It does not happen for traffic from the virtual host to systems connected to eth1 (in the private subnet).

Both Host and Virtual server have all current Fedora Core 5 updates installed (running Kernel 2.6.17-1.2145_FC5).

The problem is simply that Xen is assuming that all packets passing from dom0 to domU have a 16-byte headroom, which simply isn't the case. My SG patches for dom0=>domU remove this assumption.

Now that we know where the problem lies, can we assume that we get an updated Kernel / xen version for FC5 soon?
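To make the headroom explanation above concrete, here is a minimal Python sketch of the arithmetic. It is a toy model, not the Xen netfront/netback code: the function names are hypothetical, and the 6-byte headroom is an assumed value picked only so the result matches the "10 bytes" in the log. The thread does not say how much headroom the affected packets actually had.

```python
ASSUMED_HEADROOM = 16  # bytes the receive path is said to assume (figure from the thread)


def bytes_before_head(actual_headroom: int, assumed: int = ASSUMED_HEADROOM) -> int:
    """Return how far before the buffer head the data pointer would land.

    A result of 0 means the packet had at least the assumed headroom.
    """
    return max(0, assumed - actual_headroom)


def describe(actual_headroom: int) -> str:
    """Format the shortfall the way the guest kernel log reports it."""
    shortfall = bytes_before_head(actual_headroom)
    if shortfall == 0:
        return "ok"
    return f"Received packet is {shortfall} bytes before head."


# Hypothetical values: 16 bytes of headroom passes, 6 bytes falls 10 short.
print(describe(16))
print(describe(6))
```

Under these assumptions, any packet arriving with less headroom than the receiver expects triggers the message, which would fit the reports that only traffic on certain paths (PPPoE, NAT) is affected.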
Basically the printk output limits the rate of packets that a virtual server can deliver. And it messes with my log files. :-)

2.6.17-1.2157_FC5 does _NOT_ change this problem for me.

Herbert, is your SG patch in 2.6.17-1.2157_FC5?

I have been tracking a different problem which might be related, although because it is so intermittent it is hard to test. See comment #1 on https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=199944#c1

Shortform: Sockets get into a state where programs which are listening on a port no longer answer. Other ports are working at that point. Restarting the program temporarily fixes the problem. Everything was working until a recent upgrade. I've tried moving back to 2.6.17-1.2145, but don't yet know for certain if this avoids the problem.

The SG support for dom0 => domU has been merged upstream so hopefully we won't see this bug anymore. Well at least we won't see it in its current form since I deleted that printk :)

Russell, this may be related in the sense that the SG patches may have fixed latent bugs in the networking code. So once the patches have filtered through I encourage you to test it and see if you can still reproduce that problem.

Herbert, do you have a sense of when the SG patches will make it into a kernel I can receive via Yum?

Just so we have it somewhere, I created a separate bug for that listen() problem. https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=203122

This issue has still not been fixed with 2.6.17-1.2174_FC5.

Sorry it's taking such a long time for these fixes to filter through. Part of the reason is we want to make very sure that the new code does not end up causing bigger problems than the old bugs :) Hopefully the kernels might be ready this week.

The 2.6.18 (2189) kernel in FC5 testing should fix this. Please confirm this is fixed in the current release.

Would love to, however #211090 blocks me currently (no xen at all ATM).

change QA contact

This report targets FC5, which is now end-of-life.
Please re-test against Fedora 7 or later, and if the issue persists, open a new bug. Thanks
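For anyone re-testing on a newer release, a rough way to gauge how often the message fires is to tally it in the guest's kernel log together with the rate-limit notices. The sketch below is only an illustration: the function name `tally` is made up here, and the two line formats are taken from the log excerpt quoted in this report. The suppressed count is an upper bound, since a "messages suppressed" notice covers any rate-limited printk, not just this one.

```python
import re

# Line formats as quoted in this report.
BEFORE_HEAD = re.compile(r"Received packet is (\d+) bytes before head\.")
SUPPRESSED = re.compile(r"printk: (\d+) messages suppressed\.")


def tally(lines):
    """Return (messages logged, messages reported as suppressed)."""
    logged = 0
    suppressed = 0
    for line in lines:
        if BEFORE_HEAD.search(line):
            logged += 1
        match = SUPPRESSED.search(line)
        if match:
            suppressed += int(match.group(1))
    return logged, suppressed


sample = [
    "Received packet is 10 bytes before head.",
    "printk: 17 messages suppressed.",
    "Received packet is 10 bytes before head.",
]
print(tally(sample))  # (2, 17)
```

In practice the `lines` argument could be an open log file, since iterating over a file object yields one line at a time.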