Bug 1468996
Summary: | ib_write_bw failed over ConnectX-4 Lx/ROCE | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | zguo <zguo> |
Component: | perftest | Assignee: | Jarod Wilson <jarod> |
Status: | CLOSED NOTABUG | QA Contact: | Infiniband QE <infiniband-qe> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.4 | CC: | abeausol, bhu, ddutile, dledford, h.roudbari, kheib, mstowell, rdma-dev-team, salmy |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Last Closed: | 2017-10-26 13:50:53 UTC | Type: | Bug |
Description
zguo
2017-07-10 08:23:44 UTC
I cannot reproduce this on the same hosts, same perftest, same kernel, and same firmware. Closing as NOTABUG.

[root@rdma-virt-02 ~]$ ib_write_bw -c RC -d mlx5_2

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_2
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 2
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x035b PSN 0xb4be04 RKey 0x044fb2 VAddr 0x002ae0d7ed3000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:45:92
 remote address: LID 0000 QPN 0x035a PSN 0x313d6d RKey 0x04194a VAddr 0x002aed4b283000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:45:93
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
 65536      5000           1163.91            1163.91              0.018623
---------------------------------------------------------------------------------------

[root@rdma-virt-03 ~]$ timeout 3m ib_write_bw 172.31.40.92 -c RC -d mlx5_2
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_2
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : Ethernet
 GID index       : 2
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0000 QPN 0x035a PSN 0x313d6d RKey 0x04194a VAddr 0x002aed4b283000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:45:93
 remote address: LID 0000 QPN 0x035b PSN 0xb4be04 RKey 0x044fb2 VAddr 0x002ae0d7ed3000
 GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:31:45:92
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
 65536      5000           1163.91            1163.91              0.018623
---------------------------------------------------------------------------------------

Info:

[root@rdma-virt-02 ~]$ ethtool -i mlx5_roce
driver: mlx5_core
version: 3.0-1 (January 2015)
firmware-version: 14.18.1000
expansion-rom-version:
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

[root@rdma-virt-02 ~]$ ibstat mlx5_2
CA 'mlx5_2'
    CA type: MT4117
    Number of ports: 1
    Firmware version: 14.18.1000
    Hardware version: 0
    Node GUID: 0xe41d2d0300fda72a
    System image GUID: 0xe41d2d0300fda72a
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x04010000
        Port GUID: 0xe61d2dfffefda72a
        Link layer: Ethernet

[root@rdma-virt-02 ~]$ lspci | grep Mell
04:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
04:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
05:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
05:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

[root@rdma-virt-02 ~]$ rpm -q perftest
perftest-3.4-1.el7.x86_64

[root@rdma-virt-02 ~]$ uname -r
3.10.0-693.el7.x86_64

Hello @zguo,

I'd like to ask if you could please share the solution which you came up with. I've encountered the exact same error during this test.

I'm using two ConnectX-4 Lx cards (installed on separate machines in the same LAN).

Any tips, advice would be much appreciated!

Best regards,
Hamed

(In reply to Hamed from comment #5)
> Hello @zguo,
>
> I'd like to ask if you could please share the solution which you came up
> with. I've encountered the exact same error during this test.
>
> I'm using two ConnectX-4 Lx cards (installed on separate machines in the
> same LAN).
>
> Any tips, advice would be much appreciated!
>
> Best regards,
> Hamed

zguo offline atm.
Try updating perftest to the latest release.
That's all we did to not see the error any longer, and thus, closed the bz.

(In reply to Don Dutile (Red Hat) from comment #6)
> (In reply to Hamed from comment #5)
> > Hello @zguo,
> >
> > I'd like to ask if you could please share the solution which you came up
> > with. I've encountered the exact same error during this test.
> >
> > I'm using two ConnectX-4 Lx cards (installed on separate machines in the
> > same LAN).
> >
> > Any tips, advice would be much appreciated!
> >
> > Best regards,
> > Hamed
>
> zguo offline atm.
> Try updating perftest to the latest release.
> That's all we did to not see the error any longer, and thus, closed the bz.

Thanks Don.

Hi Hamed,

What I can tell is to make sure
1) server ConnectX-4 Lx can ping client ConnectX-4 Lx successfully
2) use the latest perftest
3) the command parameters are correct

Hi Don, zguo,

Your advice is incredibly helpful and appreciated. Thanks for your prompt replies!
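
For anyone landing here with the same symptom, the three checks above can be sketched as a small shell helper. This is only an illustration, not something posted in the thread: the function names `gid_to_ipv4` and `preflight` are made up, and the conversion assumes the GID printed by ib_write_bw is an IPv4-mapped RoCE GID (bytes 11 and 12 equal to 255), as in the output shown earlier.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative and not part of perftest.

# Convert a GID as printed by ib_write_bw (16 decimal bytes separated by ':')
# into a dotted IPv4 address. Assumes an IPv4-mapped GID (bytes 11-12 = 255).
# The body runs in a subshell so the IFS change does not leak to the caller.
gid_to_ipv4() (
    IFS=':'
    set -- $1                      # split the GID into 16 positional fields
    [ "$#" -eq 16 ] || return 1
    [ "${11}" = "255" ] && [ "${12}" = "255" ] || return 1
    # Fields are zero-padded to two digits; drop one leading zero if present.
    o1=${13#0}; o2=${14#0}; o3=${15#0}; o4=${16#0}
    echo "${o1:-0}.${o2:-0}.${o3:-0}.${o4:-0}"
)

# Checks 1) and 2): the peer answers ping, and perftest is installed.
# Check 3), the command parameters, still has to be reviewed by hand.
preflight() {
    ping -c 3 -W 2 "$1" >/dev/null || { echo "cannot ping $1"; return 1; }
    rpm -q perftest || { echo "perftest is not installed"; return 1; }
}
```

A possible use is `preflight "$(gid_to_ipv4 00:00:00:00:00:00:00:00:00:00:255:255:172:31:45:93)"` on the server, followed by the same `ib_write_bw -c RC -d mlx5_2` and `ib_write_bw <server-ip> -c RC -d mlx5_2` invocations shown in the output above.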