
Bug 438227

Summary: net offload keeps some PV guests from full network activity with Xen
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Target Milestone: rc
Reporter: Mark Wagner <mwagner>
Assignee: Andy Gospodarek <agospoda>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: benlu, clalance, dash, fleitner, herbert.xu, peterm, tao, xen-maint
Doc Type: Bug Fix
Last Closed: 2009-07-06 07:51:46 UTC
Attachments:
  igb: Correctly get protocol information (flags: none)

Description Mark Wagner 2008-03-19 19:27:54 UTC
Description of problem:

When running a 5.1 or 5.2 version of the Xen kernel, some networking operations
do not work correctly with PV guests.  For instance, I can ping and ssh into
a RHEL 5.1 PV guest, but a netperf test with TCP fails to complete.  UDP traffic
is similarly affected, with netperf reporting that the guest saw no UDP traffic.

This problem is observable on a system with the igb driver and the 82575EB
chipset.  This problem has not been observed on a different system with the tg3
driver and the BCM5715 chipset.

I have also been able to reproduce this issue with a Windows guest with PV drivers
that enable me to toggle the offload bits. 

In the PV guest, the TX offload is enabled. If I disable the tx offload in the
guest, things work as expected. 

with the tx offload disabled in the guest:
[root@specclient2 np2.4]# ./netperf -P 0 -l 20 -H 192.168.1.68 -t UDP_STREAM
262144   65507   20.00       36706      0     961.77
126976           20.00       36706            961.77

with tx offload enabled in the guest:
[root@specclient2 np2.4]# ./netperf -P 0 -l 20 -H 192.168.1.68 -t UDP_STREAM
netperf: receive_response: no response received. errno 0 counter 0

The lack of full network functionality is impacting the ability to perform
performance evaluations. 

Version-Release number of selected component (if applicable):
 Host
-----
[root@perf10 np2.4]# uname -a
Linux perf10.lab.boston.redhat.com 2.6.18-85.el5xen #1 SMP Tue Mar 11 19:07:05
EDT 2008 x86_64 x86_64 x86_64 GNU/Linux


[root@perf10 np2.4]# ethtool -i peth1
driver: igb
version: 1.0.8-k2
firmware-version: 1.11-5
bus-info: 0000:08:00.1

[root@perf10 np2.4]# ethtool -k peth1
Offload parameters for peth1:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
[root@perf10 np2.4]# 


How reproducible:
Every time

Steps to Reproduce:
1. (assumes proper system constraints)
2. Create a paravirtual guest
3. run netperf from an external box to the guest
  
Actual results:


Expected results:


Additional info:

Comment 1 Andy Gospodarek 2008-03-20 18:31:25 UTC
I wonder if the igb driver has the same packet split issue we saw in e1000e....

/me goes to check


Comment 2 Andy Gospodarek 2008-03-20 19:23:47 UTC
There could be a problem since igb disables hardware crc stripping but doesn't
ever decrement the len of the frame.  Maybe not though.  If we actually had some
igb hardware that we could use for testing you can probably reproduce this issue
by just putting igb in a bridge and making the bridge the endpoint for traffic
-- but I'm sure Herbert knows that. :-)

Comment 3 Herbert Xu 2008-03-24 12:15:58 UTC
Is tx offload enabled on the tg3? Do we see this with domU to domU? Thanks!

Comment 4 Mark Wagner 2008-03-24 15:59:28 UTC
On the system with the issue the Dom0 is running an igb driver with tx offload
enabled by default
[root@perf10 np2.4]# ethtool -i peth1
driver: igb
version: 1.0.1
firmware-version: 1.11-5
bus-info: 0000:08:00.1
[root@perf10 np2.4]# ethtool -k peth1
Offload parameters for peth1:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
[root@perf10 np2.4]# ethtool -k eth1
Offload parameters for eth1:
Cannot get device rx csum settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off


On the working system (with the tg3 driver)
[root@specclient3 ~]# ethtool -i peth1
driver: tg3
version: 3.86
firmware-version: 5715-v3.28
bus-info: 0000:09:04.1
[root@specclient3 ~]# ethtool -k peth1
Offload parameters for peth1:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
[root@specclient3 ~]# ethtool -k eth1
Offload parameters for eth1:
Cannot get device rx csum settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off


------------------

Two rhel51_pv guests on the same Dom0 can communicate "just fine" on the system
with the igb driver. 



---------------

At this point in time, the issue is with an external system trying to run the
tests against the guests.  A quick matrix (where ext stands for an external box)

domU <-> domU      Pass
dom0 <-> domU      Pass
ext  <-> dom0      Pass
ext  <-> PV guest  Fails
ext  <-> FV Guest  Pass

By default the FV guest has the tx offload disabled; enabling it does not seem
to have a negative impact.   

Turning off the tx offload on the PV guest allows things to work.

Comment 5 Herbert Xu 2008-03-26 09:01:55 UTC
My testing indicates a problem with the driver.  dom0 doesn't show it presumably
because of its NAT netfilter rule which causes the checksum to be computed in
software.

I've talked to Auke from Intel and he says he has some patches which may help. 
I'll check with him again tomorrow.
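
For readers unfamiliar with what "computed in software" means here: with tx checksum offload the stack leaves the checksum field for the NIC to fill in, so a driver that programs the hardware incorrectly can send frames with a bad checksum. When the checksum is computed in software instead, the driver's offload path is not involved. The following is only a minimal sketch of the standard Internet checksum (RFC 1071) in user-space C; the function name inet_checksum is an illustrative choice, not code from the kernel or the igb driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement Internet checksum (RFC 1071) over a byte buffer.
 * Illustrative helper only; not taken from the kernel or from igb.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);

    /* Add the buffer as a sequence of big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver runs the same sum over the data with the checksum field filled in and expects zero. TCP and UDP use the same arithmetic, applied to a pseudo-header plus the segment.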

Comment 6 Herbert Xu 2008-04-01 08:08:46 UTC
I was told by Auke that this patch should help:

commit 44b0cda37534093fd9fefacd64d5fbb589c50795
Author: Mitch Williams <mitch.a.williams>
Date:   Fri Mar 7 10:32:13 2008 -0800

    igb: Correctly get protocol information

    We can't look at the socket to get protocol information. We should
    instead look directly at the packet, and hope there are no IPv6
    option headers.

    Signed-off-by: Mitch Williams <mitch.a.williams>
    Signed-off-by: Auke Kok <auke-jan.h.kok>
    Signed-off-by: Jeff Garzik <jeff>

Please let me know if it doesn't and I'll ask him for more :)

Comment 7 Mark Wagner 2008-04-01 14:13:37 UTC
Um, sorry but I have no idea what to do with a commit number at this point. 

some additional pointers / direction are needed. 

Comment 8 Andy Gospodarek 2008-04-01 18:36:14 UTC
The fix listed in comment #6 was based on a patch upstream in February that I
guess morphed into the fix that was finally committed in March.

http://marc.info/?l=linux-netdev&m=120278022027115&w=2

We can take the upstream fix without too much trouble, but my original fix was
designed to prevent a panic when skb->sk was NULL since traffic on the bridge
wasn't sent on a socket.

It looks like the only thing this change actually does functionally as compared
to my patch is add this flag E1000_ADVTXD_TUCMD_L4T_TCP if the traffic is
ipv6/tcp traffic.
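
The functional difference described above can be sketched roughly as follows. This is a simplified user-space illustration, not the driver code: the SKETCH_TUCMD_* bit values, the l3_type enum, and the function name tucmd_for are placeholders invented for this example, and the real flag definitions live in the igb driver headers.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for illustration only (NOT the real igb values). */
#define SKETCH_TUCMD_IPV4     0x1u
#define SKETCH_TUCMD_L4T_TCP  0x2u

#define SKETCH_PROTO_TCP 6   /* IANA protocol number for TCP */
#define SKETCH_PROTO_UDP 17  /* IANA protocol number for UDP */

enum l3_type { L3_IPV4, L3_IPV6 };

/*
 * Choose checksum-offload command bits from what the packet headers say.
 * The TCP bit is set for TCP over IPv6 as well as TCP over IPv4, which is
 * the case the upstream change is described as adding.
 */
static uint32_t tucmd_for(enum l3_type l3, uint8_t l4_proto)
{
    uint32_t tucmd = 0;

    assert(l3 == L3_IPV4 || l3 == L3_IPV6);

    if (l3 == L3_IPV4)
        tucmd |= SKETCH_TUCMD_IPV4;
    if (l4_proto == SKETCH_PROTO_TCP)
        tucmd |= SKETCH_TUCMD_L4T_TCP;

    return tucmd;
}
```

In this sketch an IPv6/TCP packet yields only the TCP bit, an IPv4/UDP packet only the IPv4 bit, and an IPv4/TCP packet both.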



Comment 9 Herbert Xu 2008-04-03 13:42:42 UTC
Either patch should do the trick.  The problem is that the driver is currently
looking up the socket to get the protocol.  But of course for Xen there is no
socket since the packet is from the guest.
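
To make the distinction concrete, here is a hedged user-space sketch contrasting the two lookups. The sketch_sock and sketch_pkt structures are hypothetical stand-ins for the kernel's socket and skb, and both function names are invented for this example; only the header offsets (EtherType at bytes 12-13, IPv4 Protocol at offset 9, IPv6 Next Header at offset 6) are standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HLEN   14
#define SKETCH_ETH_P_IP   0x0800
#define SKETCH_ETH_P_IPV6 0x86DD

/* Hypothetical stand-ins for the kernel's socket and skb structures. */
struct sketch_sock { uint8_t protocol; };
struct sketch_pkt {
    const uint8_t *frame;          /* starts at the Ethernet header */
    size_t len;
    const struct sketch_sock *sk;  /* NULL for bridged guest traffic */
};

/* Socket-based lookup: has no answer when the packet has no local socket. */
static int l4_proto_from_socket(const struct sketch_pkt *p)
{
    assert(p != NULL);
    return p->sk ? p->sk->protocol : -1;
}

/* Packet-based lookup: read the protocol from the L3 header itself. */
static int l4_proto_from_packet(const struct sketch_pkt *p)
{
    assert(p != NULL);
    if (p->frame == NULL || p->len < SKETCH_ETH_HLEN)
        return -1;

    uint16_t ethertype = (uint16_t)((p->frame[12] << 8) | p->frame[13]);
    const uint8_t *l3 = p->frame + SKETCH_ETH_HLEN;
    size_t l3len = p->len - SKETCH_ETH_HLEN;

    if (ethertype == SKETCH_ETH_P_IP && l3len >= 20)
        return l3[9];   /* IPv4 Protocol field */
    if (ethertype == SKETCH_ETH_P_IPV6 && l3len >= 40)
        return l3[6];   /* IPv6 Next Header; assumes no extension headers */
    return -1;
}
```

The IPv6 branch carries the same caveat as the commit message in comment #6: it reads only the first Next Header value, so it is wrong if extension headers are present.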

Comment 10 Herbert Xu 2008-04-03 13:43:30 UTC
Created attachment 300247 [details]
igb: Correctly get protocol information

Here is the patch that should fix the problem.

Comment 11 Andy Gospodarek 2008-04-10 20:45:09 UTC
My test kernels have been updated to include a patch for this bugzilla.

http://people.redhat.com/agospoda/#rhel5

Please test them and report back your results.

Comment 12 Mark Wagner 2008-04-12 03:08:58 UTC
I used Andy's test kernel 2.6.18-89.el5.gtest.46xen and tested with a rhel51 PV
guest. I ran some 60 sec runs of netperf using both tcp and udp (ipv4 only).  I
did not observe the previously mentioned issues.

I retested the original issue reported by rebooting the host with the stock
2.6.18-88 kernel and retesting.  In this situation the problem did occur as the
driver did not have the patch.  

Comment 13 Andy Gospodarek 2008-04-14 13:39:24 UTC
Looking back at the patch I see that the original one did have a NULL pointer
check, so I understand why this would help (of course, now I know it does help too!)
and not just prevent a panic.  Score another one for Herbert. :-)



Comment 17 RHEL Program Management 2009-02-16 15:35:04 UTC
Updating PM score.