Bug 545530
| Summary: | mlx4_en excessive TCP retransmissions | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Issue Tracker <tao> |
| Component: | kernel | Assignee: | Jay Fenlason <fenlason> |
| Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.4 | CC: | cww, fenlason, jeder, jfeeney, ltroan, ogerlitz, peterm, samuel, tao |
| Target Milestone: | beta | | |
| Target Release: | 5.8 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2011-10-17 03:16:40 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 502912, 600363, 680163, 719172 | | |
Description
Issue Tracker
2009-12-08 19:49:01 UTC
Event posted on 12-08-2009 02:37pm EST by woodard
From: Trent D'Hooge <tdhooge>
Subject: mlx4_en driver issue
Date: December 8, 2009 1:22:54 PM CST
To: Ben Coyote Woodard <woodard>, woodard9, Ira Weiny <weiny2>
Sending this to you first before opening a ticket so that we are on the same
page. Then we should open a ticket with RH.
RHEL5.3 used mtnic, RHEL5.4 uses the mlx4_en driver
Mellanox firmware version on the 10GigE card is 2.7.0. When using the mtnic
driver we were at firmware version 2.5.914.
First problem seen:
The mlx4_en driver seems to be losing enough packets to cause a number of TCP
connections to fail, timeout, and then eventually get connected. Lustre does
not like this and gets upset. (Even if Lustre was not upset this could cause
major performance issues...)
First problem found by Ira:
The 10GigE card was not using MSI interrupts. He fixed this, but we are still
having problems.
from e-mails going around:
First our conclusion is the unified driver is BROKEN... As I say at the
bottom of this email. The only thing which has changed is the software. We
are using the unified driver from RHEL 5.4. The only modification has been
the patch I just applied to get MSI to work...
Now for the gory details...
After enabling MSI we still see connections getting into SYN_RECV and causing
problems.
ifconfig and ethtool show only a few errors on the RX side. I don't know how
running these nodes back to back is going to present the problem. Right now
running 2 nodes against each other through the switch results in no errors. I
believe there is something more complex going on because of the large number
of TCP connections which Lustre establishes.
We still see a large number of retransmissions in TCP.
# hype139 /sys/module/mlx4_core/parameters > netstat -s | grep retrans
1059776 segments retransmited
569233 fast retransmits
467332 forward retransmits
2536 retransmits in slow start
628 sack retransmits failed
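The retransmission counters above can be pulled apart programmatically for tracking over time. A minimal sketch (the counter labels are taken verbatim from the `netstat -s` output quoted above, including netstat's own "retransmited" spelling; everything else is illustrative):

```python
import re

def parse_retrans(netstat_s: str) -> dict:
    """Extract retransmission counters from `netstat -s` output."""
    counters = {}
    for line in netstat_s.splitlines():
        # lines look like: "  1059776 segments retransmited"
        m = re.match(r"\s*(\d+)\s+(.*retransmit.*)", line)
        if m:
            counters[m.group(2).strip()] = int(m.group(1))
    return counters

# Sample copied from the output above.
sample = """\
1059776 segments retransmited
569233 fast retransmits
467332 forward retransmits
2536 retransmits in slow start
628 sack retransmits failed
"""
stats = parse_retrans(sample)
total = stats["segments retransmited"]  # netstat's own spelling
print(f"{total} total, {stats['fast retransmits'] / total:.0%} fast retransmits")
```

Comparing snapshots of these counters before and after a Lustre workload would show whether the retransmissions correlate with the SYN_RECV pile-ups.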
# hype139 /sys/module/mlx4_core/parameters > ifconfig eth2
eth2 Link encap:Ethernet HWaddr 00:02:C9:04:6E:88
inet addr:172.16.1.201 Bcast:172.16.7.255 Mask:255.255.248.0
inet6 addr: fe80::202:c9ff:fe04:6e88/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:7500 Metric:1
RX packets:102570804 errors:20 dropped:23 overruns:23 frame:43
TX packets:211462289 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:83809240941 (78.0 GiB) TX bytes:1162296790558 (1.0 TiB)
# hype139 /sys/module/mlx4_core/parameters > ifconfig eth3
eth3 Link encap:Ethernet HWaddr 00:02:C9:04:6E:89
inet addr:172.16.9.203 Bcast:172.16.15.255 Mask:255.255.248.0
inet6 addr: fe80::202:c9ff:fe04:6e89/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:7500 Metric:1
RX packets:114022111 errors:0 dropped:29 overruns:29 frame:29
TX packets:241317415 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:90188106711 (83.9 GiB) TX bytes:1345718307382 (1.2 TiB)
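For scale, the NIC-level error and drop counts above are tiny relative to the traffic, which is why the million-plus TCP retransmissions are hard to explain from the interface counters alone. A quick back-of-the-envelope check (figures copied from the ifconfig output above; treating errors plus drops as lost packets is a simplifying assumption):

```python
# RX counters copied from the ifconfig output for eth2/eth3.
ifaces = {
    "eth2": {"rx_packets": 102_570_804, "rx_errors": 20, "rx_dropped": 23},
    "eth3": {"rx_packets": 114_022_111, "rx_errors": 0, "rx_dropped": 29},
}

rates = {}
for name, c in ifaces.items():
    bad = c["rx_errors"] + c["rx_dropped"]
    rates[name] = bad / c["rx_packets"]
    print(f"{name}: {bad} bad RX events, loss rate {rates[name]:.2e}")
```

Both interfaces come out well under one loss per million packets, far too low to account for over a million retransmitted segments, suggesting the loss is happening somewhere the interface counters do not see.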
I attempted to turn on debugging in the driver but nothing is being printed to
the console.
# hype139 /sys/module/mlx4_core/parameters > cat /sys/module/mlx4_core/parameters/debug_level
1
here are the core settings.
/sys/module/mlx4_core/parameters/block_loopback
1
/sys/module/mlx4_core/parameters/debug_level
1
/sys/module/mlx4_core/parameters/enable_qos
N
/sys/module/mlx4_core/parameters/internal_err_reset
1
/sys/module/mlx4_core/parameters/log_mtts_per_seg
3
/sys/module/mlx4_core/parameters/log_num_cq
0
/sys/module/mlx4_core/parameters/log_num_mac
2
/sys/module/mlx4_core/parameters/log_num_mcg
0
/sys/module/mlx4_core/parameters/log_num_mpt
0
/sys/module/mlx4_core/parameters/log_num_mtt
0
/sys/module/mlx4_core/parameters/log_num_qp
0
/sys/module/mlx4_core/parameters/log_num_srq
0
/sys/module/mlx4_core/parameters/log_num_vlan
0
/sys/module/mlx4_core/parameters/log_rdmarc_per_qp
0
/sys/module/mlx4_core/parameters/msi_x
1
/sys/module/mlx4_core/parameters/panic_on_catas
0
/sys/module/mlx4_core/parameters/set_4k_mtu
0
/sys/module/mlx4_core/parameters/use_prio
N
And the ethernet driver settings:
# hype139 /sys/module/mlx4_core/parameters > for file in /sys/module/mlx4_en/parameters/*; do echo $file; cat $file; done
/sys/module/mlx4_en/parameters/inline_thold
104
/sys/module/mlx4_en/parameters/ip_reasm
1
/sys/module/mlx4_en/parameters/num_lro
0
/sys/module/mlx4_en/parameters/pfcrx
0
/sys/module/mlx4_en/parameters/pfctx
0
/sys/module/mlx4_en/parameters/rss_mask
5
/sys/module/mlx4_en/parameters/rss_xor
0
We are still looking for errors anywhere else in the system (i.e. on the
switches or other network cards), but we have NOT FOUND ANY. We ran for 3
days with Myricom cards over the weekend without any issues. The mtnic driver
we were using previously worked (after much pain!). So we are highly
suspicious of the new unified driver. Once again, ONLY THE SOFTWARE HAS
CHANGED here... :-(
Is there perhaps a FW upgrade which needs to be done with the unified driver?
# hype139 /sys/module/mlx4_core/parameters > mstflint -d 02:00.0 q
Image type: ConnectX
FW Version: 2.7.0
Device ID: 26428
Chip Revision: A0
Description: Node Port1 Port2 Sys image
GUIDs: 0002c9030004e948 0002c9030004e949 0002c9030004e94a
0002c9030004e94b
MACs: 000000000000 000000000001
Board ID: (MT_0C40110009)
VSD:
PSID: MT_0C40110009
# hype139 /sys/module/mlx4_core/parameters > mstflint -d 85:00.0 q
Image type: ConnectX
FW Version: 2.7.0
Device ID: 25448
Chip Revision: A0
Description: Port1 Port2
MACs: 0002c9046e88 0002c9046e89
Board ID: (MT_0BD0110004)
VSD:
PSID: MT_0BD0110004
# hype139 /sys/module/mlx4_core/parameters > mstflint -d 86:00.0 q
Image type: ConnectX
FW Version: 2.7.0
Device ID: 26428
Chip Revision: A0
Description: Node Port1 Port2 Sys image
GUIDs: 0002c9030004e928 0002c9030004e929 0002c9030004e92a
0002c9030004e92b
MACs: 000000000000 000000000001
Board ID: (MT_0C40110009)
VSD:
PSID: MT_0C40110009
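Checking that every card reports the same firmware version is easy to automate by scraping the `mstflint -d <dev> q` output. A hypothetical helper (the "FW Version:" field format is taken from the output above; device addresses and outputs below are abbreviated samples):

```python
import re

def fw_versions(mstflint_outputs: dict) -> dict:
    """Pull the 'FW Version' field out of `mstflint -d <dev> q`
    output, keyed by PCI address."""
    versions = {}
    for dev, text in mstflint_outputs.items():
        m = re.search(r"FW Version:\s*(\S+)", text)
        versions[dev] = m.group(1) if m else None
    return versions

outputs = {
    "02:00.0": "Image type: ConnectX\nFW Version: 2.7.0\nDevice ID: 26428",
    "85:00.0": "Image type: ConnectX\nFW Version: 2.7.0\nDevice ID: 25448",
}
versions = fw_versions(outputs)
print(versions)
```

If `len(set(versions.values())) != 1`, some card is on a different firmware and should be flashed before drawing conclusions about the driver.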
I don't know what else to try. We will continue to look for a smaller scale
reproducer but nothing we have done so far is working.
Ira
Begin forwarded message:
Date: Tue, 8 Dec 2009 09:00:53 -0800
From: Jim Garlick <garlick>
To: weiny2, behlendorf1, morrone2
Subject: SYN_RECV connections are back on hype
Uh oh, looks like the old problem is back.
Jim
ehype139: Active Internet connections (w/o servers)
ehype139: Proto Recv-Q Send-Q Local Address Foreign Address State
ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss2-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc pigs7-lnet0:edvrpftpd SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc levi3-eth2:edvrpftpd SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss10-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus12-eth2:1021 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc tycho12-lnet0:1021 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus2-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss13-eth2:1020 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc levi4-eth2:1021 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus5-eth2:1020 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc pigs4-lnet0:1021 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss12-eth2:1021 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc pigs2-lnet0:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc tycho4-lnet0:1021 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss1-eth2:1020 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus14-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc tycho7-lnet0:edvrpftpd SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus9-eth2:1020 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss6-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc pigs14-lnet0:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc tycho10-lnet0:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus6-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc levi6-eth2:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc pigs15-lnet0:edvrpftpd SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc tycho6-lnet0:1023 SYN_RECV
ehype139: tcp 0 0 hype139-lnet0:lustresvc momus10-eth2:1023 SYN_RECV
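Stuck SYN_RECV connections like those above can be counted per remote host from the netstat output, which helps show whether the failures cluster on particular peers or are spread evenly. A minimal sketch (field positions follow the netstat listing above; the `ehype139:` pdsh prefix is tolerated because fields are indexed from the end):

```python
from collections import Counter

def syn_recv_peers(netstat_lines: list) -> Counter:
    """Count connections stuck in SYN_RECV, keyed by remote host."""
    peers = Counter()
    for line in netstat_lines:
        if "SYN_RECV" in line:
            # last two fields are: <foreign address> <state>
            foreign = line.split()[-2]
            peers[foreign.split(":")[0]] += 1
    return peers

# Sample lines copied from the listing above.
sample = [
    "ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss2-eth2:1023 SYN_RECV",
    "ehype139: tcp 0 0 hype139-lnet0:lustresvc pigs7-lnet0:edvrpftpd SYN_RECV",
    "ehype139: tcp 0 0 hype139-lnet0:lustresvc strauss10-eth2:1023 SYN_RECV",
]
counts = syn_recv_peers(sample)
print(counts.most_common())
```

In the full listing above the stuck peers span the strauss, pigs, levi, momus, and tycho clusters, which points at the server-side node (or its driver) rather than any one client.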
This event sent from IssueTracker by kbaxley [LLNL (HPC)]
issue 373976
Event posted on 2009-12-16 13:23 PST by woodard

Thanks Doug,
Regarding the FW version, Mellanox did not change much. They sent me "2.7.0" versions which reduced the number of outstanding PCI transactions on the bus from 16 to 12, to 8, and to 4. I tried the 12, 8, and 4 versions. They thought there was evidence of a PCI issue, but none of these helped. From our point of view we did not think this was the issue, but we tried the FW just to make sure.
Ira
This event sent from IssueTracker by woodard
issue 373976

Did you try the driver Doug attached? Did it behave any differently than the 5.4 one?

I've just added a bug of our own (relating to our IBM HPC gear at VLSCI at the University of Melbourne) where we are seeing packets arriving on the physical eth1 10Gb/s interface being delivered incorrectly by the driver to eth0.
https://bugzilla.redhat.com/show_bug.cgi?id=649623
We have replicated this using 3 cards (Mellanox ConnectX2 MT26448) in 2 different servers, so we're pretty confident it's not a hardware problem. This is with RHEL 5.5.
We've found that using the mlx4_en driver from the Mellanox site does seem to fix it, though, so it might be worth investigating yourselves. Red Hat have told us this bug won't get fixed in 5.7, but they will look at whether or not they will fix it in 5.8. :-( It does appear that RHEL 6.1 might have a newer version of the driver without this problem, though.

Closing this as not a bug. The original customer report was closed indicating that the problem was due to faulty hardware. If you disagree with this, please open a support case with Red Hat support at access.redhat.com.