Bug 699490

Summary: e1000e: TX bytes stats are over-reported when NAT is enabled and used
Product: Red Hat Enterprise Linux 6
Reporter: James M. Leddy <james.leddy>
Component: kernel
Assignee: Flavio Leitner <fleitner>
Status: CLOSED NOTABUG
QA Contact: Network QE <network-qe>
Severity: high
Priority: high
Version: 6.1
CC: agospoda, arozansk, fleitner, gborsuk, james.leddy, kzhang
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Clone Of: 693430
Last Closed: 2011-06-17 23:03:14 UTC
Bug Depends On: 693430

Comment 1 James M. Leddy 2011-04-25 19:35:39 UTC
Description of problem:

A NAT rule might use skb_copy(), which returns a linear skb.  However, if the
original skb was non-linear, the GSO metadata carried over to the copy confuses
the e1000e driver, which then reports the wrong number of TX bytes.

The driver does:
[...]
        segs = skb_shinfo(skb)->gso_segs ? : 1;
        /* multiply data chunks by size of headers */
        bytecount = ((segs - 1) * skb_headlen(skb)) + skb->len;

For a linear skb, skb_headlen() returns the full packet length, not just the
length of the packet headers.  Because the skb was non-linear before the copy,
gso_segs is still non-zero, so the driver adds the total packet length once for
every extra segment.  In effect it multiplies the total packet length by the
number of segs, which is clearly incorrect.
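
To make the arithmetic concrete, here is a minimal user-space sketch of the same
calculation.  The struct and helper names (fake_skb, fake_headlen, tx_bytecount)
are made up for illustration and are not kernel APIs; only the formula is taken
from the driver snippet above.

```c
/* Simplified stand-in for the few sk_buff fields the calculation reads.
 * This is NOT the kernel structure; every name here is hypothetical. */
struct fake_skb {
    unsigned int len;       /* total bytes: headers plus payload */
    unsigned int data_len;  /* bytes held in paged fragments, 0 if linear */
    unsigned int gso_segs;  /* segment count recorded for GSO */
};

/* Mirrors what skb_headlen() computes: bytes in the linear area. */
static unsigned int fake_headlen(const struct fake_skb *skb)
{
    return skb->len - skb->data_len;
}

/* Same arithmetic as the driver snippet quoted above. */
static unsigned int tx_bytecount(const struct fake_skb *skb)
{
    unsigned int segs = skb->gso_segs ? skb->gso_segs : 1;

    /* (segs - 1) extra header copies plus the original length */
    return ((segs - 1) * fake_headlen(skb)) + skb->len;
}
```

Assuming, purely as an example, 66 bytes of headers and 6 segments of 1448
bytes each (len = 8754): a non-linear skb with data_len = 8688 gives
5 * 66 + 8754 = 9084 bytes, while a linear copy of the same packet
(data_len = 0) gives 5 * 8754 + 8754 = 52524 bytes, six times the real length.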

Version-Release number of selected component (if applicable):
2.6.18-254.el5

How reproducible:
Always

Steps to Reproduce:
1. on the server, add the iptables rule and run the tcp_send.pl reproducer
2. on the client, run tcp_receive.pl
3. compare the statistics when the client connects to the original port with
those when it connects to the redirected port.

Server:
# iptables -t nat -A PREROUTING -p tcp --tcp-flags FIN,SYN,RST,ACK SYN --dport 8888 -j REDIRECT --to-ports 8877

# ./tcp_send.pl -p 8877 &

client connecting directly:
$ ./tcp_receive.pl  -s server_ip:8877  10m

or
client connecting to the redirected port:
$ ./tcp_receive.pl  -s server_ip:8888  10m

Actual results:
This is when connecting directly:
# sar -n DEV 2 2 | egrep 'IFACE|eth2'
Average:        IFACE   rxpck/s   txpck/s   rxbyt/s    txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
Average:         eth2    415.25    912.25  27406.50 1312921.50      0.00      0.00      0.00

This is when connecting to the redirected port:
Average:         eth2    415.50    914.50  27421.50 8838014.00      0.00      0.00      0.00


Expected results:
The same amount to be reported.
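
A quick sanity check on the two sar samples is to divide txbyt/s by txpck/s.
The helper below is only a sketch (the function name is made up) and takes the
figures straight from the output above.

```c
/* Average bytes per transmitted packet, from sar's txbyt/s and txpck/s.
 * Hypothetical helper for illustration; returns 0 when no packets were sent. */
static double avg_tx_bytes_per_packet(double txbyt_per_s, double txpck_per_s)
{
    return txpck_per_s > 0.0 ? txbyt_per_s / txpck_per_s : 0.0;
}
```

With the numbers above, the direct connection averages roughly 1439 bytes per
packet, while the redirected one averages roughly 9664 bytes per packet at
almost the same packet rate.  Assuming a standard 1500-byte MTU (the report
does not state the MTU), the second figure cannot be real wire traffic.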

Comment 3 RHEL Program Management 2011-04-26 06:00:57 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 4 Andy Gospodarek 2011-04-29 17:23:56 UTC
Flavio, I seem to remember you had a patch for this already.  The brew builds referenced are gone, so please post the patch here.  Thanks.

Comment 5 Andy Gospodarek 2011-04-29 17:25:33 UTC
If I recall correctly, the patch was based on this upstream commit:

commit 67fd4fcb78a7ced369a6bd8a131ec8c65ebd2bbb
Author: Jeff Kirsher <jeffrey.t.kirsher>
Date:   Fri Jan 7 05:12:09 2011 +0000

    e1000e: convert to stats64

Comment 9 Flavio Leitner 2011-06-17 23:03:14 UTC
Actually, I was reviewing the RHEL-6 sources again and remembered that, before we cloned this ticket, I had pointed out that in RHEL-6 skb_make_writable() doesn't use skb_copy() to copy skbs, so the non-linear to linear skb issue no longer exists there.
(ref: sf#00318121 #32 Created By: Leitner, Flavio (6/18/2010))

I just confirmed that by running the reproducer using kernel 2.6.32-131.0.15.el6.x86_64 and all the numbers are okay with e1000e.

$ ethtool -i eth0
driver: e1000e
version: 1.2.20-k2
firmware-version: 1.8-5
bus-info: 0000:00:19.0

<sar>
07:40:43 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
07:40:45 PM      eth0    224.50    912.50     14.47   1282.28      0.00      0.00      0.00
07:40:45 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
07:40:47 PM      eth0    224.00    913.00     14.44   1282.31      0.00      0.00      0.00
</sar>

So the only benefit of backporting the upstream patch would be the conversion to stats64, which RHEL-6 isn't ready for yet, so I'll close this as NotABug.

Although I've been unable to spot any other way to reproduce this on RHEL-6, feel free to reopen if needed.

Cheers!
fbl