Bug 430722
Summary: | [RHEL5 U2] e1000e network issues while running kernel-xen variant
---|---
Product: | Red Hat Enterprise Linux 5
Reporter: | Jeff Burke <jburke>
Component: | kernel
Assignee: | Andy Gospodarek <agospoda>
Status: | CLOSED ERRATA
QA Contact: | Martin Jenner <mjenner>
Severity: | high
Priority: | high
Version: | 5.2
CC: | auke-jan.h.kok, dzickus, jesse.brandeburg, peterm
Target Milestone: | rc
Hardware: | All
OS: | Linux
Fixed In Version: | RHBA-2008-0314
Doc Type: | Bug Fix
Last Closed: | 2008-05-21 15:08:17 UTC
Description
Jeff Burke
2008-01-29 16:22:45 UTC
Created attachment 293315: e1000e-crc-revert.patch

Herbert deserves the credit on this one since he seems to have figured out what the problem was. If we revert this patch, we are back to stripping the CRC in software.

Running the commands in comment #1 yields the following output:

    13:42:17.075335 00:16:e6:8c:55:1e > 00:d0:01:25:30:0a, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: ICMP (1), length: 1500)
        10.12.4.139 > 10.13.255.101: ICMP echo request, id 10767, seq 16, length 1480
    13:42:17.083581 00:d0:01:25:30:0a > 00:16:e6:8c:55:1e, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 254, id 31830, offset 0, flags [DF], proto: ICMP (1), length: 1500)
        10.13.255.101 > 10.12.4.139: ICMP echo reply, id 10767, seq 16, length 1480
    1480 bytes from ntap-storage0-b.boston.redhat.com (10.13.255.101): icmp_seq=16 ttl=254 time=8.25 ms
    13:42:18.075355 00:16:e6:8c:55:1e > 00:d0:01:25:30:0a, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: ICMP (1), length: 1500)
        10.12.4.139 > 10.13.255.101: ICMP echo request, id 10767, seq 17, length 1480
    13:42:18.081458 00:d0:01:25:30:0a > 00:16:e6:8c:55:1e, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 254, id 32086, offset 0, flags [DF], proto: ICMP (1), length: 1500)
        10.13.255.101 > 10.12.4.139: ICMP echo reply, id 10767, seq 17, length 1480
    1480 bytes from ntap-storage0-b.boston.redhat.com (10.13.255.101): icmp_seq=17 ttl=254 time=6.11 ms

Auke, we are considering reverting this patch for e1000e since it doesn't play well with Xen bridging:

    commit 140a74802894e9db57e5cd77ccff77e590ece5f3
    Author: Auke Kok <auke-jan.h.kok>
    Date: Thu Oct 25 13:57:58 2007 -0700

    e1000e: Re-enable SECRC - crc stripping

    This workaround code performed software stripping instead of the hardware which can do it much faster. None of the e1000e target hardware has issues with this feature and should work fine.

    This gives us some performance back on receive, and removes some kludging stripping the 4 bytes.

    Signed-off-by: Auke Kok <auke-jan.h.kok>
    Signed-off-by: Jeff Garzik <jeff>

From the description it seems this will only affect performance, not specific functionality. Do you agree with that statement?

Correct, however I wonder why this breaks Xen. It sounds like a similar problem we had a while ago. Jesse, do you remember?

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

It seems very odd to me that this would make a difference at all, but when receiving frames on the bridge without this patch they come through 4 bytes bigger. Is there any way that the stripping done by the e1000 hardware strips the FCS (makes it all zeros), but doesn't change the received length? Maybe the subtraction of the length is still needed? Or maybe this just happens on our particular hardware?

    07:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
    07:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)

Our older drivers used SECRC with no apparent issues. This functionality should work. However, there really is no harm in doing it in software if you're looking for a quick fix. Here is an excerpt from the manual:

    The SECRC bit controls whether the hardware strips the Ethernet CRC from the received packet. This stripping occurs prior to any checksum calculations. The stripped CRC is not DMA’d to host memory and is not included in the length reported in the descriptor.
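For anyone following the length discussion above, here is a minimal sketch of the arithmetic, written in Python purely for illustration. It is not the driver's code. The function names are hypothetical, and the CRC check in `strip_fcs` is only there to make the example self-checking; the comments above describe software stripping only as removing the 4 bytes. It assumes an untagged Ethernet frame (14-byte header) and the standard 4-byte CRC-32 FCS.

```python
import struct
import zlib

ETH_HEADER_LEN = 14  # destination MAC + source MAC + EtherType (no VLAN tag)
ETH_FCS_LEN = 4      # trailing CRC-32 frame check sequence


def expected_frame_len(ip_total_len, fcs_stripped=True):
    """Frame length a capture should report for an untagged IPv4 packet."""
    length = ETH_HEADER_LEN + ip_total_len
    return length if fcs_stripped else length + ETH_FCS_LEN


def append_fcs(frame):
    """Append the Ethernet FCS (CRC-32, least significant byte first)."""
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def strip_fcs(frame_with_fcs):
    """Software stripping: drop the last 4 bytes, checking them first."""
    body = frame_with_fcs[:-ETH_FCS_LEN]
    fcs = frame_with_fcs[-ETH_FCS_LEN:]
    if struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) != fcs:
        raise ValueError("FCS mismatch")
    return body


if __name__ == "__main__":
    print(expected_frame_len(1500))                      # 1514
    print(expected_frame_len(1500, fcs_stripped=False))  # 1518
    frame = bytes(ETH_HEADER_LEN + 1500)
    print(len(strip_fcs(append_fcs(frame))))             # 1514
```

With the 1500-byte IP packets from the ping test, a frame whose FCS has been stripped is 1514 bytes, which matches the tcpdump output quoted above. A frame that still carries its FCS would be 1518 bytes, which is the "4 bytes bigger" symptom described for the bridge.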
Can someone explain what the actual expected result is (besides the obvious "install over nfs should work") vs what was observed? I'm referring specifically to the tcpdump/ping command output as it seems everything is fine.

It was also seen with the e1000e driver using this hardware:

    01:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)

My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel5

Please test them and report back your results.

in 2.6.18-77.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html