Bug 644438
| Summary: | bnx2: Out of order arrival of UDP packets in application | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Flavio Leitner <fleitner> |
| Component: | kernel | Assignee: | John Feeney <jfeeney> |
| Status: | CLOSED ERRATA | QA Contact: | Hangbin Liu <haliu> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 5.5 | CC: | benlu, edwardn, enarvaez, gideonn, haliu, jarod, jfeeney, jtorrice, kzhang, mchan, peterm |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-01-13 21:57:31 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
I have confirmed the behavior of fragments and non-fragments being placed into different rx rings. This is not supposed to happen in theory. UDP packets (fragments or not) should be hashed based on source and destination IP addresses only, and so they should all be placed in the same ring. I am working with our firmware team to debug this issue. Thanks.

Hi Michael, Do you have any news regarding this issue? Is there anything that I could do to help you? thanks!

Yes, it's a firmware issue. The new upstream firmware already has the fix. This new firmware is currently not in RHEL5.6. Should we update the firmware in RHEL5.6 to fix this issue?

The x86_64 built rpm with this new firmware can be found on my people page at http://people.redhat.com/jfeeney/.rhel-5.6. Note: Even though the rpm does not include this bz in its name, the fix for this bz (644438) is in there. Given a pressing deadline, a quick turnaround on testing would be very much appreciated.

We have good feedback from the customer confirming that the out-of-order issue is fixed in your test kernel. thanks, fbl

With the test kernel, the UDP out-of-order issue is not reproducible.

That was a pretty quick turnaround. Thanks, Ben.

in kernel-2.6.18-236.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHSA-2011-0017.html
Description of problem:
The bnx2 NIC changes the order of received packets.

This is a simple reproducer script:

    #!/usr/bin/perl -w
    use strict;
    use IO::Socket::Multicast;

    # create a new UDP socket ready to read datagrams on port 1100
    my $s = IO::Socket::Multicast->new(LocalPort=>1100);

    # Add a multicast group
    $s->mcast_add('225.0.1.1');

    # Set outgoing interface to eth0
    $s->mcast_if('eth0');

    # Multicast a message to group 225.0.0.1
    my $data1 = "1" x 5000;
    my $data2 = "1" x 200;
    $s->mcast_send($data1, '225.0.0.1:1200');
    $s->mcast_send($data2, '225.0.0.1:1200');

Run it on an HP box with an e1000g driver loaded (to send the data) and an IBM x3650m2 with a bnx2 driver loaded (non-bonded) to receive it. If you run the script a few times, the receiving side sometimes reorders the UDP packets.

The problem doesn't happen when MSI is disabled (which disables multiqueue too).

Version-Release number of selected component (if applicable):
kernel-2.6.18-194.11.4.el5

How reproducible:
Frequently, just run the script a few times while monitoring the traffic dump.

Although the UDP protocol doesn't guarantee packet ordering, the NIC shouldn't change the order of received packets.

This is the piece of code from netdev-2.6 enabling the RSS hashing:

    static void
    bnx2_init_all_rings(struct bnx2 *bp)
    {
            int i;
            u32 val;
    ...
                    val = BNX2_RLUP_RSS_CONFIG_IPV4_RSS_TYPE_ALL_XI |
                          BNX2_RLUP_RSS_CONFIG_IPV6_RSS_TYPE_ALL_XI;

                    REG_WR(bp, BNX2_RLUP_RSS_CONFIG, val);

            }
    }

If I understand that correctly, all protocols running on top of IPv4 and IPv6 (e.g. UDP) will be hashed and queued in multiple queues.

However, I see in the reproducer script the following:

    ...
    # Multicast a message to group 225.0.0.1
    my $data1 = "1" x 5000;
    my $data2 = "1" x 200;
    ...

which would cause the packet to be fragmented.
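To watch for the reordering from the application side instead of from a traffic dump, a receiver can check that datagram sizes arrive in the order the reproducer sends them (5000 bytes, then 200 bytes). The following is only a minimal Python sketch: the group 225.0.0.1 and port 1200 are taken from the script above, while the function names `count_out_of_order` and `receive_sizes` are hypothetical, and the pairing logic assumes no datagram is lost.

```python
import socket
import struct

GROUP = "225.0.0.1"      # destination group used by the reproducer
PORT = 1200              # destination port used by the reproducer
EXPECTED = (5000, 200)   # payload sizes in the order the script sends them


def count_out_of_order(sizes, expected=EXPECTED):
    """Count consecutive pairs in `sizes` that differ from the `expected` order.

    Assumes no loss: datagrams are paired as (1st, 2nd), (3rd, 4th), and so on.
    """
    pairs = zip(sizes[0::2], sizes[1::2])
    return sum(1 for pair in pairs if pair != tuple(expected))


def receive_sizes(count, timeout=5.0):
    """Join GROUP, receive up to `count` datagrams, and return their sizes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the multicast group on the default interface (INADDR_ANY).
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(timeout)
    sizes = []
    try:
        while len(sizes) < count:
            data, _ = sock.recvfrom(65535)
            sizes.append(len(data))
    except socket.timeout:
        pass  # stop early if the sender goes quiet
    finally:
        sock.close()
    return sizes
```

On the receiving host, something like `print(count_out_of_order(receive_sizes(20)))` while the reproducer runs ten times would report how many pairs arrived swapped.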
The fragmentation seems to be causing the network flow to be broken into two streams, one for fragmented and one for non-fragmented packets, and this in turn is causing the out-of-order issues. Thus, this looks similar to bz#613780 (igb).
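The split into two streams can be illustrated with a small model. This is a sketch under stated assumptions, not the bnx2 firmware's algorithm: it assumes a 1500-byte MTU and a 20-byte IPv4 header, substitutes CRC32 for the real RSS hash, and uses a hypothetical source address (192.0.2.10) and ring count. It compares a classifier that hashes addresses only with one that also hashes ports whenever a packet is not a fragment.

```python
import zlib
from dataclasses import dataclass

MTU = 1500        # assumed Ethernet MTU
IP_HEADER = 20    # IPv4 header without options
UDP_HEADER = 8
NUM_RINGS = 4     # hypothetical number of rx rings


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    is_fragment: bool


def fragment(payload_len, src_ip, dst_ip, src_port, dst_port):
    """Return the IPv4 packets needed to carry one UDP datagram."""
    ip_payload = payload_len + UDP_HEADER
    per_packet = (MTU - IP_HEADER) // 8 * 8   # fragment data is a multiple of 8 bytes
    count = -(-ip_payload // per_packet)      # ceiling division
    return [Packet(src_ip, dst_ip, src_port, dst_port, count > 1)
            for _ in range(count)]


def ring_ip_only(pkt):
    """Pick a ring from source and destination addresses only."""
    key = f"{pkt.src_ip}|{pkt.dst_ip}"
    return zlib.crc32(key.encode()) % NUM_RINGS


def ring_mixed(pkt):
    """Pick a ring from addresses and ports, falling back to addresses for fragments."""
    if pkt.is_fragment:
        return ring_ip_only(pkt)
    key = f"{pkt.src_ip}|{pkt.dst_ip}|{pkt.src_port}|{pkt.dst_port}"
    return zlib.crc32(key.encode()) % NUM_RINGS


if __name__ == "__main__":
    big = fragment(5000, "192.0.2.10", "225.0.0.1", 1100, 1200)
    small = fragment(200, "192.0.2.10", "225.0.0.1", 1100, 1200)
    print("packets:", len(big), "+", len(small))
    print("ip-only rings:", [ring_ip_only(p) for p in big + small])
    print("mixed rings:  ", [ring_mixed(p) for p in big + small])
```

With address-only hashing every packet of the flow lands in one ring, so arrival order is preserved. With the mixed classifier the unfragmented 200-byte datagram is hashed on a different key than the fragments of the 5000-byte datagram, so the two can end up in different rings that are serviced independently.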