Bug 1015913

Summary: Missing device driver for Mellanox Technologies MT27500 Family [ConnectX-3] on HP SL45xx system
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Veda Shankar <veshanka>
Component: distribution
Assignee: Sayan Saha <ssaha>
Status: CLOSED EOL
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: high
Priority: unspecified
Version: 2.1
CC: amirv, bcompton, bruno.cornec, dledford, jcm, peterm, rhs-bugs, vagarwal
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-12-03 17:18:56 UTC

Description Veda Shankar 2013-10-06 18:10:08 UTC

Comment 2 Veda Shankar 2013-10-06 18:34:42 UTC
The Mellanox 10-gigabit NIC does not work out of the box with the RHS 2.1 ISO. The installed RPM list does not show any *mlx* packages. Output from lspci follows:

# lspci -vvx -s 04:00.0
04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
        Subsystem: Hewlett-Packard Company Device 17c9
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 26
        Region 0: Memory at f7d00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: Memory at f4000000 (64-bit, prefetchable) [size=32M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
                Unknown large resource type 7f, will not decode more.
        Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
                Vector table: BAR=0 offset=0007c000
                PBA: BAR=0 offset=0007d000
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [148 v1] Device Serial Number 44-1e-a1-03-00-93-5d-70
        Capabilities: [108 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 16, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 1004
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 2: Memory at 0000000000000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [154 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [18c v1] #19
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_en, mlx4_core
00: b3 15 03 10 46 04 10 00 00 00 80 02 10 00 00 00
10: 04 00 d0 f7 00 00 00 00 0c 00 00 f4 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 3c 10 c9 17
30: 00 00 00 00 40 00 00 00 00 00 00 00 05 01 00 00
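
The last two lines of the decoded listing above report which kernel driver is bound to the device and which modules are available for it. As a rough sketch (not part of the original report), the following Python function pulls those two fields out of `lspci -v` style text; the function name and the trimmed sample string are illustrative assumptions, and the parsing assumes the field labels appear exactly as in the listing above.

```python
def parse_lspci_driver_info(lspci_text):
    """Return (driver_in_use, modules) from `lspci -v` style text.

    driver_in_use is None and modules is an empty list when the
    corresponding lines are absent from the input.
    """
    driver = None
    modules = []
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Kernel driver in use:"):
            # Everything after the first colon is the driver name.
            driver = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("Kernel modules:"):
            # Module names are comma-separated after the first colon.
            names = stripped.split(":", 1)[1].split(",")
            modules = [name.strip() for name in names if name.strip()]
    return driver, modules


# Trimmed copy of the relevant lines from the listing above.
SAMPLE = """\
04:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_en, mlx4_core
"""

if __name__ == "__main__":
    print(parse_lspci_driver_info(SAMPLE))
```

For the listing in this comment, the kernel side reports mlx4_core as the bound driver, so a check like this only covers the kernel modules and says nothing about which user-space packages are installed.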

Comment 3 Bruno Cornec 2014-07-24 19:19:45 UTC
I can confirm this doesn't work with RHSS 3.0 EA either. The driver is still missing.

Comment 4 Amir Vadai 2014-08-24 07:58:03 UTC
Bruno, if you install libmlx4, does that fix the issue?

Doug,

I know we have talked about this in the past.
Isn't there a way to make the out-of-box experience better? Can't we add libmlx4 to the default set of RPMs?

Amir

Comment 5 Doug Ledford 2014-11-11 18:27:56 UTC
If the libmlx4 package was left out of the package set for this distro, then that is the source of the problem here. Fixing the issue will require adding libmlx4 (and libcxgb3 and libcxgb4) to the package set and installing them by default, so that the various multi-driver RDMA interfaces come up properly.
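
A quick way to see whether an installed system has the three packages named above is to compare them against the list of installed package names. The sketch below is only an illustration: the function and constant names are hypothetical, and it assumes the caller supplies bare package names (for example, collected with `rpm -qa --qf '%{NAME}\n'`) rather than full name-version-release strings.

```python
# Packages named in this comment as needed in the default package set.
REQUIRED_PACKAGES = ("libmlx4", "libcxgb3", "libcxgb4")


def missing_rdma_packages(installed_names, required=REQUIRED_PACKAGES):
    """Return the required package names absent from installed_names.

    installed_names is any iterable of bare package names. The result
    keeps the order of `required`.
    """
    installed = set(installed_names)
    return [name for name in required if name not in installed]


if __name__ == "__main__":
    # Hypothetical installed set, for illustration only.
    example_installed = ["glusterfs", "kernel", "libcxgb3"]
    print(missing_rdma_packages(example_installed))
```

An empty result means all three packages are present; it does not by itself confirm that the RDMA interfaces come up.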

Comment 6 Vivek Agarwal 2015-12-03 17:18:56 UTC
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release that you asked us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.