
Bug 1330719

Summary: dpdk_nic_bind --bind=vfio-pci failed to bind mlx4
Product: Red Hat Enterprise Linux 7
Component: openvswitch-dpdk
Version: 7.3
Hardware: x86_64
OS: Linux
Status: CLOSED CANTFIX
Severity: low
Priority: low
Target Milestone: rc
Target Release: ---
Reporter: Jean-Tsung Hsiao <jhsiao>
Assignee: Thadeu Lima de Souza Cascardo <cascardo>
QA Contact: Jean-Tsung Hsiao <jhsiao>
CC: aconole, atragler, cascardo, fleitner, jhsiao, kzhang, osabart, rcain, rkhan, weiyongjun
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-09-29 18:18:06 UTC
Attachments:
  lspci and dpdk_nic_bind -s (flags: none)
  allows any network class device to be consided by dpdk_nic_bind (flags: none)

Description Jean-Tsung Hsiao 2016-04-26 19:21:59 UTC
Description of problem: dpdk_nic_bind --bind=vfio-pci failed to bind mlx4

[root@netqe5 dpdk-multique-scripts]# ethtool -i p6p1
driver: mlx4_en
version: 2.2-1 (Feb 2014)
firmware-version: 2.32.5100
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
[root@netqe5 dpdk-multique-scripts]# dpdk_nic_bind --bind=vfio-pci 0000:03:00.0
Unknown device: 0000:03:00.0. Please specify device in "bus:slot.func" format
[root@netqe5 dpdk-multique-scripts]# 


Version-Release number of selected component (if applicable):
[root@netqe5 dpdk-multique-scripts]# rpm -qa | grep dpdk
dpdk-tools-2.2.0-3.el7.x86_64
kernel-kernel-networking-dpdk-only-1.0-4.noarch
dpdk-2.2.0-3.el7.x86_64
openvswitch-dpdk-2.5.0-3.el7.x86_64
[root@netqe5 dpdk-multique-scripts]# uname -a
Linux netqe5.knqe.lab.eng.bos.redhat.com 3.10.0-382.el7.x86_64 #1 SMP Tue Apr 19 13:22:06 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@netqe5 dpdk-multique-scripts]#

How reproducible: reproducible


Steps to Reproduce:
1. Install dpdk-2.2.0-3
2. modprobe vfio-pci
3. dpdk_nic_bind --bind=vfio-pci <mlx4 PCI bus address>

Actual results:
The bind fails with "Unknown device"; see the description above.

Expected results:
should succeed

Additional info:

Comment 1 Thadeu Lima de Souza Cascardo 2016-04-26 19:35:28 UTC
Hi, Jean-Tsung.

What is the output of dpdk_nic_bind -s?

Thanks.
Cascardo.

Comment 2 Aaron Conole 2016-04-26 20:00:16 UTC
Please include the output of the following commands:

lspci
lspci -vt
dmesg
dpdk_nic_bind -s

Thanks.

Comment 4 Jean-Tsung Hsiao 2016-04-26 21:00:16 UTC
Created attachment 1151087 [details]
lspci and dpdk_nic_bind -s

See the attached log for the lspci and dpdk_nic_bind -s output.

Comment 5 Aaron Conole 2016-04-26 21:05:29 UTC
Agh, I shouldn't have even needed that information. Sorry.

We don't ship Mellanox with DPDK 2.2, because at that point in time it required non-upstream library changes. I don't know if that is still the case; I will ask and get back to you.

Comment 6 Panu Matilainen 2016-04-27 06:52:01 UTC
Seems to be the case still: neither mlx4 nor mlx5 comes anywhere near compiling with libibverbs 1.2.0, which is supposed to be the latest version.

Comment 7 Panu Matilainen 2016-04-27 09:20:58 UTC
...but actually, whether the PMD is shipped or not doesn't even come into play at this stage: dpdk_nic_bind knows nothing about the actual DPDK-side driver.

The actual catch here is that dpdk_nic_bind thinks the Mellanox device doesn't even exist, or at least is not a NIC at all. From our POV it doesn't matter because it wouldn't work anyway but it does suggest there is a bug, perhaps in dpdk_nic_bind.

OTOH, if you use driverctl instead of dpdk_nic_bind, such issues won't come into play because it doesn't try to be overly clever.
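
For reference, a minimal sketch of the driverctl route, written in Python only because dpdk_nic_bind is a Python script. The helper names (driverctl_override_cmd, set_override) are made up for illustration; the only real interface assumed is the "driverctl set-override <device> <driver>" command.

```python
import re
import subprocess

# Full PCI address in domain:bus:slot.func form, e.g. 0000:03:00.0
PCI_ADDR_RE = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def driverctl_override_cmd(pci_addr, driver="vfio-pci"):
    """Build the driverctl command line (hypothetical helper, not part of DPDK)."""
    if not PCI_ADDR_RE.fullmatch(pci_addr):
        raise ValueError("expected domain:bus:slot.func, got %r" % pci_addr)
    return ["driverctl", "set-override", pci_addr, driver]


def set_override(pci_addr, driver="vfio-pci"):
    """Run driverctl; needs root and the driverctl package installed."""
    subprocess.run(driverctl_override_cmd(pci_addr, driver), check=True)


if __name__ == "__main__":
    # Only print the command here; call set_override() to actually apply it.
    print(" ".join(driverctl_override_cmd("0000:03:00.0")))
```

Note that driverctl only checks that the address names a PCI device; it does not filter by device class, which is why the "Unknown device" error cannot occur there.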

Comment 8 Thadeu Lima de Souza Cascardo 2016-04-27 19:42:30 UTC
Created attachment 1151603 [details]
allows any network class device to be consided by dpdk_nic_bind

This is just to show that we can make dpdk_nic_bind accept other devices as well. In this case, any network devices would be included. On a laptop, this would include a Wifi PCI board, for example. The Mellanox card is a single PCI function that also supports other functions like RoCE, so its configuration is not of an Ethernet class. This patch should work for it too.
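
To illustrate the idea (this is a sketch, not the attached patch itself): the PCI class code is four hex digits, where the first two are the base class (02 = network controller) and the last two the subclass (00 = Ethernet). Matching on the full code 0200 skips any network device with another subclass; matching only on the base class accepts it. The device table below is invented for the example, and the 0280 class for the Mellanox card is an assumption, not taken from the attached lspci log.

```python
ETHERNET_CLASS = "0200"      # base class 02 (network), subclass 00 (Ethernet)
NETWORK_BASE_CLASS = "02"    # any network controller, whatever the subclass


def is_ethernet(class_code):
    """Strict check: only Ethernet controllers pass."""
    return class_code[:4] == ETHERNET_CLASS


def is_network(class_code):
    """Relaxed check: any device in the network base class passes."""
    return class_code[:2] == NETWORK_BASE_CLASS


def select(devices, predicate):
    """Return the sorted PCI addresses whose class code satisfies predicate."""
    return sorted(addr for addr, cls in devices.items() if predicate(cls))


# Hypothetical device table: address -> class code as printed by `lspci -n`.
DEVICES = {
    "0000:01:00.0": "0200",  # plain Ethernet NIC
    "0000:03:00.0": "0280",  # assumed class for the multi-protocol adapter
    "0000:00:02.0": "0300",  # display controller, never a candidate
}

if __name__ == "__main__":
    print("strict :", select(DEVICES, is_ethernet))
    print("relaxed:", select(DEVICES, is_network))
```

With the strict check the 0000:03:00.0 entry is never listed, which would explain an "Unknown device" error for a valid address; with the relaxed check it is listed alongside the Ethernet NIC.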

As Panu has argued, there is not much point in preventing devices from being bound to vfio-pci. dpdk_nic_bind is just a nice wrapper to verify and change which driver a device is bound to. Binding it to vfio-pci could be done manually as well.
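
A rough sketch of what "manually" could look like, assuming a kernel that exposes the per-device driver_override file in sysfs (an assumption about the running kernel). The function names are illustrative only. bind_steps just computes the sysfs writes so they can be reviewed first; apply_steps performs them and needs root.

```python
import os

SYSFS_PCI = "/sys/bus/pci"


def current_driver(pci_addr, sysfs_pci=SYSFS_PCI):
    """Return the name of the driver bound to the device, or None if unbound."""
    link = os.path.join(sysfs_pci, "devices", pci_addr, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


def bind_steps(pci_addr, driver="vfio-pci", sysfs_pci=SYSFS_PCI):
    """List the (path, value) sysfs writes that rebind the device to driver."""
    dev = os.path.join(sysfs_pci, "devices", pci_addr)
    # 1. Tell the kernel which driver may claim this device.
    steps = [(os.path.join(dev, "driver_override"), driver)]
    # 2. Detach the current driver, if there is one.
    if current_driver(pci_addr, sysfs_pci) is not None:
        steps.append((os.path.join(dev, "driver", "unbind"), pci_addr))
    # 3. Ask the PCI bus to probe the device again.
    steps.append((os.path.join(sysfs_pci, "drivers_probe"), pci_addr))
    return steps


def apply_steps(steps):
    """Perform the writes; must run as root on the real sysfs tree."""
    for path, value in steps:
        with open(path, "w") as f:
            f.write(value)


if __name__ == "__main__":
    for path, value in bind_steps("0000:03:00.0"):
        print("echo %s > %s" % (value, path))
```

The target driver module (vfio-pci here) still has to be loaded beforehand, as in step 2 of the reproduction steps.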

But, please, try this patch and see if it fixes the problem. Maybe it's something DPDK upstream would accept on the basis that mlx4 device requires it.

Cascardo.

Comment 9 Thadeu Lima de Souza Cascardo 2016-04-27 19:43:39 UTC
Hi, Jean-Tsung.

Can you apply the attached patch to the installed version of dpdk_nic_bind and see if that works for you. It's just two lines, you can edit it by hand as well.

Thanks.
Cascardo.

Comment 10 Jean-Tsung Hsiao 2016-04-28 13:41:43 UTC
(In reply to Thadeu Lima de Souza Cascardo from comment #9)
> Hi, Jean-Tsung.
> 
> Can you apply the attached patch to the installed version of dpdk_nic_bind
> and see if that works for you. It's just two lines, you can edit it by hand
> as well.
> 
> Thanks.
> Cascardo.

Hi Cascardo,

Yes, the patch works. But, as with bnx2x, the resulting DPDK port is rejected by the ovs-dpdk bridge.

Network devices using DPDK-compatible driver
============================================
0000:03:00.0 'MT27520 Family [ConnectX-3 Pro]' drv=vfio-pci unused=

ovs-vsctl: Error detected while setting up 'dpdk0'.  See ovs-vswitchd log for details.
029d107c-e529-48cf-bca5-36b70b8e3eb8
    Bridge "ovsbr0"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                error: "could not open network device dpdk0 (No such device)"
        Port "int0"
            Interface "int0"
                type: internal
    ovs_version: "2.5.0"
OFPST_PORT reply (xid=0x2): 2 ports
  port LOCAL: rx pkts=0, bytes=0, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=2, bytes=150, drop=2, errs=0, coll=0
  port  1: rx pkts=2, bytes=132, drop=0, errs=0, frame=0, over=0, crc=0
           tx pkts=0, bytes=0, drop=0, errs=0, coll=0

Comment 11 Thadeu Lima de Souza Cascardo 2016-04-28 17:32:45 UTC
As Aaron and Panu have pointed out, we don't ship the mlx4 driver because it requires unreleased software.

Panu, do you think this patch could fly upstream? Or do you suggest we just ignore it and just recommend driverctl?

Cascardo.

Comment 12 Panu Matilainen 2016-04-29 07:49:16 UTC
Regardless of what we recommend, upstream ought to be interested in the patch because it's preventing binding to an otherwise supported (I guess) adapter. Even if others don't care, Mellanox should!

Whether it's acceptable as is, or as an additional option to display all network class adapters instead of just Ethernet ones, I dunno; both seem quite reasonable to me.

Comment 13 Thadeu Lima de Souza Cascardo 2016-05-06 18:29:31 UTC
Submitted upstream.

http://dpdk.org/ml/archives/dev/2016-May/038562.html

Comment 14 Flavio Leitner 2016-09-29 18:18:06 UTC
Hi,

My understanding is that driverctl can handle this correctly and Thadeu's patch fixing dpdk_nic_bind is merged upstream, so it will land in RHEL at some point.
However, we can't enable the mlx driver at this point, so I am going to close this bug, as I don't see anything else left for us to help with.

If any of you disagree please re-open it.
Thanks,
fbl