Bug 135273

Summary: tg3 transmitting bad UDP checksums...
Product: Red Hat Enterprise Linux 3
Reporter: Daniel J Blueman <daniel.blueman>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 3.0
CC: petrides, riel
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-10-12 17:35:56 UTC

Description Daniel J Blueman 2004-10-11 15:59:15 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7)
Gecko/20040808 Firefox/0.9.3

Description of problem:
The previous tg3 driver v3.1 was producing fragmented UDP packets
(from NFS traffic) with bad checksums on my hardware [1].

To verify that v3.6RH fixed this, I captured a tcpdump session, but
still saw some UDP packets with bad checksums [2].

--- [1]

eth0: Tigon3 [partno(BCM95704) rev 2002 PHY(5704)]
(PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:00:1a:19:ba:a4

--- [2]

16:44:10.850000 quorum1.32770 > stegosaurus.domain: [bad udp cksum
4767!]  36543+ PTR? 1.1.1.239.in-addr.arpa. (40) (DF) (ttl 64, id
12316, len 68)
16:44:15.070000 quorum1.32770 > stegosaurus.domain: [bad udp cksum
67c1!]  13470+ PTR? 2.1.1.239.in-addr.arpa. (40) (DF) (ttl 64, id
12738, len 68)



Version-Release number of selected component (if applicable):
2.4.21-20.EL - tg3.c:v3.6RH (June 12, 2004)

How reproducible:
Always

Steps to Reproduce:
1. produce certain kinds of UDP traffic - here, DNS queries
2. run # tcpdump -vvv udp | grep 'cksum'
3. create some traffic and lean back
4. observe the network possibly silently dropping packets
    

Actual Results:  UDP packets transmitted with bad UDP checksum

Expected Results:  good UDP checksum

Additional info:

Comment 1 John W. Linville 2004-10-12 17:35:56 UTC
This is actually a feature of the TG3 hardware/driver.  Many other
cards behave similarly.

Many cards are capable of generating IP, TCP, and/or UDP checksums in
the hardware.  The OS passes frames w/o checksums to the hardware and
indicates in the transmit descriptor that the hardware should checksum
the frame.  The hardware obliges, and the frame is sent on the wire w/
the proper checksum.

The frames seen by tcpdump on the transmitting host have not yet
passed through the hardware, so their checksum field has not been
filled in.  Therefore, the chance of tcpdump thinking the UDP checksum
is correct would be very low (approximately 1 in 65536)... :-)
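
To illustrate the arithmetic behind that estimate, here is a minimal
Python sketch of the standard UDP checksum over the IPv4 pseudo-header.
It is not taken from the tg3 driver or from tcpdump; the addresses,
ports, payload, and helper names are all made up for the example. It
builds one datagram, computes the correct checksum, and then counts how
many of the 65536 possible field values would pass verification.

```python
import socket
import struct

UDP_PROTO = 17  # IP protocol number for UDP


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header that the UDP checksum also covers."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, UDP_PROTO, udp_len))


def build_udp(src_port: int, dst_port: int, payload: bytes,
              checksum: int = 0) -> bytes:
    """UDP header plus payload, with the given checksum field value."""
    header = struct.pack("!HHHH", src_port, dst_port,
                         8 + len(payload), checksum)
    return header + payload


def udp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum for a segment whose checksum field is currently zero."""
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return (~total & 0xFFFF) or 0xFFFF  # a computed 0 is sent as 0xFFFF


def checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Arithmetic check only: a valid segment sums to 0xFFFF."""
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF


# Hypothetical example values, not taken from the captures in this bug.
SRC, DST = "192.0.2.1", "192.0.2.53"
QUERY = b"hypothetical DNS query"

correct = udp_checksum(SRC, DST, build_udp(32770, 53, QUERY))
on_wire = build_udp(32770, 53, QUERY, correct)

# Try every possible value of the 16-bit checksum field.
matches = sum(checksum_ok(SRC, DST, build_udp(32770, 53, QUERY, value))
              for value in range(0x10000))

print("checksum the hardware would insert: %04x" % correct)
print("frame with that checksum verifies:", checksum_ok(SRC, DST, on_wire))
print("field values that verify: %d of 65536" % matches)
```

Whatever value happens to sit in the checksum field before the hardware
fills it in will almost never be one of the matching values, which is
why a capture taken before the frame reaches the card reports bad
checksums.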

The problem w/ your test is that you appear to be running tcpdump on
the transmitting host.  I'm sure you would see different results if
you ran tcpdump on the receiving host (in your example, the DNS
server).

Comment 2 Daniel J Blueman 2004-10-12 22:20:29 UTC
John, I agree with your observations; however, I do not see this
behaviour [of bad checksums] at all on other networking hardware with
the same TSO (i.e. TX checksum at least) features.

If what you said held, then all TX-checksummed packets would have
bad checksums, but this is not true, even with fragmented UDP packets.

One easy way to check is to run:

tg3-sys# tcpdump udp

And see how many UDP packets do have good checksums, and:

e1000-sys# tcpdump udp

And find all packets have good checksums.

Can you take a look and reopen? BTW, any decent switch (e.g. Cisco VLAN
ones) will drop packets w/ bad checksums, so you simply can't measure
them anywhere else.

Comment 3 John W. Linville 2004-10-13 19:07:40 UTC
Actually, I see the same thing w/ FC2 on my box w/ e1000:

[root@savage root]# cat /etc/modprobe.conf
alias eth0 e1000

[root@savage root]# tcpdump -vvv udp | grep 'cksum'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size
96 bytes
14:28:10.346009 IP (tos 0x0, ttl  64, id 22900, offset 0, flags [DF],
proto 17, length: 72) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 248d!]  18162+ PTR?
255.59.16.172.in-addr.arpa. (44)
14:28:10.347369 IP (tos 0x0, ttl  64, id 22902, offset 0, flags [DF],
proto 17, length: 72) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 2391!]  18163+ PTR?
245.56.16.172.in-addr.arpa. (44)
14:28:10.348753 IP (tos 0x0, ttl  64, id 22903, offset 0, flags [DF],
proto 17, length: 70) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 5ec5!]  18164+ PTR?
1.58.16.172.in-addr.arpa. (42)
14:28:10.349994 IP (tos 0x0, ttl  64, id 22904, offset 0, flags [DF],
proto 17, length: 71) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum ba2a!]  18165+ PTR?
46.59.16.172.in-addr.arpa. (43)
14:28:10.351242 IP (tos 0x0, ttl  64, id 22905, offset 0, flags [DF],
proto 17, length: 71) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum c228!]  18166+ PTR?
28.52.16.172.in-addr.arpa. (43)
14:28:21.199013 IP (tos 0x0, ttl  64, id 33755, offset 0, flags [DF],
proto 17, length: 58) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 442d!]  577+ AAAA?
slashdot.org. (30)
14:28:21.199864 IP (tos 0x0, ttl  64, id 33756, offset 0, flags [DF],
proto 17, length: 75) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 441e!]  578+ AAAA?
slashdot.org.devel.redhat.com. (47)
14:28:21.200718 IP (tos 0x0, ttl  64, id 33756, offset 0, flags [DF],
proto 17, length: 74) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 833e!]  579+ AAAA?
slashdot.org.corp.redhat.com. (46)
14:28:21.201577 IP (tos 0x0, ttl  64, id 33757, offset 0, flags [DF],
proto 17, length: 74) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 9631!]  580+ AAAA?
slashdot.org.perf.redhat.com. (46)
14:28:21.202426 IP (tos 0x0, ttl  64, id 33758, offset 0, flags [DF],
proto 17, length: 69) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 2365!]  581+ AAAA?
slashdot.org.redhat.com. (41)
14:28:21.203305 IP (tos 0x0, ttl  64, id 33759, offset 0, flags [DF],
proto 17, length: 58) savage.devel.redhat.com.43161 >
ns1.rdu.redhat.com.domain: [bad udp cksum 5a2d!]  582+ A?
slashdot.org. (30)

Re: fragmented UDP packets -- the UDP checksum covers the
un-fragmented UDP packet.  The card only sees ethernet frames.  If the
UDP packet is too big to fit in a single ethernet frame (i.e. it has
to be fragmented), then the kernel has to generate the UDP checksum
itself.  Therefore, for fragmented UDP packets, tcpdump on the sending
host will see only good UDP checksums.
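
As a rough illustration of why the card cannot checksum these on its
own, the Python sketch below splits one large UDP datagram into IP
fragments. It assumes a 1500-byte MTU and a 20-byte IP header without
options; the port numbers, payload size, and the fragment() helper are
hypothetical and chosen only for the example.

```python
import struct

MTU = 1500          # assumed ethernet MTU
IP_HEADER_LEN = 20  # assumed IPv4 header without options


def fragment(udp_segment: bytes, mtu: int = MTU):
    """Split a UDP segment (header + payload) into (offset, data) pieces.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the IP fragment offset is expressed in 8-byte units.
    """
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    return [(offset, udp_segment[offset:offset + max_data])
            for offset in range(0, len(udp_segment), max_data)]


# Hypothetical 4000-byte payload, e.g. one large NFS-over-UDP request.
payload = bytes(4000)
# The checksum field is left at 0 here; the point is the layout only.
segment = struct.pack("!HHHH", 2049, 2049, 8 + len(payload), 0) + payload

frags = fragment(segment)
for offset, data in frags:
    carries_header = (offset == 0)  # only the first fragment has the UDP header
    print("offset %4d  length %4d  UDP header: %s"
          % (offset, len(data), carries_header))
```

Only the first fragment carries the UDP header, and with it the
checksum field, yet the checksum has to cover the data in every
fragment. A card that handles one ethernet frame at a time never has
the whole datagram in hand, so the value must already be in place when
the kernel hands over the first fragment.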

AFAIK switches/bridges will only drop frames with bad _ethernet_ CRCs,
at least partly for the reasoning in the above paragraph.
Regardless, the original suggestion to run tcpdump on the receiving
host will still hold provided you attach the sender and the receiver
with a sufficiently dumb network (e.g. a crossover cable)... :-)