Bug 1847651
| Summary: | Failing to configure nic partitioning over mellanox network cards | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Miguel Angel Nieto <mnietoji> |
| Component: | os-net-config | Assignee: | RHOS Maint <rhos-maint> |
| Status: | CLOSED NOTABUG | QA Contact: | nlevinki <nlevinki> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 13.0 (Queens) | CC: | bfournie, cfields, hakhande, hbrock, jslagle, mburns, supadhya |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-06-17 15:52:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Sorry, addresses exist, but ovs fails to attach to them

04:00.5 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.5 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]

[root@computeovsdpdksriov-1 etc]# ovs-vsctl -t 10 -- --if-exists del-port br-link0 dpdkbond0 -- add-bond br-link0 dpdkbond0 dpdk0 dpdk1 -- set interface dpdk0 type=dpdk -- set interface dpdk1 type=dpdk -- set Interface dpdk0 options:dpdk-devargs=0000:04:02.5 -- set Interface dpdk1 options:dpdk-devargs=0000:04:00.5 -- set Interface dpdk0 mtu_request=9000 -- set Interface dpdk1 mtu_request=9000 -- set Interface dpdk0 options:n_rxq=2 -- set Interface dpdk1 options:n_rxq=2
ovs-vsctl: Error detected while setting up 'dpdk0': Error attaching device '0000:04:02.5' to DPDK. See ovs-vswitchd log for details.
ovs-vsctl: Error detected while setting up 'dpdk1': Error attaching device '0000:04:00.5' to DPDK. See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".

020-06-16T18:46:13.798Z|00563|dpdk|INFO|EAL: PCI device 0000:04:02.5 on NUMA socket 0
2020-06-16T18:46:13.798Z|00564|dpdk|INFO|EAL: probe driver: 15b3:1018 net_mlx5
2020-06-16T18:46:13.800Z|00565|dpdk|WARN|net_mlx5: no Verbs device matches PCI device 0000:04:02.5, are kernel drivers loaded?
2020-06-16T18:46:13.800Z|00566|dpdk|ERR|EAL: Driver cannot attach the device (0000:04:02.5)
2020-06-16T18:46:13.800Z|00567|dpdk|ERR|EAL: Failed to attach device on primary process
2020-06-16T18:46:13.800Z|00568|netdev_dpdk|WARN|Error attaching device '0000:04:02.5' to DPDK
2020-06-16T18:46:13.800Z|00569|netdev|WARN|dpdk0: could not set configuration (Invalid argument)
2020-06-16T18:46:13.800Z|00570|dpdk|ERR|Invalid port_id=128
2020-06-16T18:46:43.660Z|00571|dpdk|INFO|EAL: PCI device 0000:04:00.5 on NUMA socket 0
2020-06-16T18:46:43.660Z|00572|dpdk|INFO|EAL: probe driver: 15b3:1018 net_mlx5
2020-06-16T18:46:43.662Z|00573|dpdk|WARN|net_mlx5: no Verbs device matches PCI device 0000:04:00.5, are kernel drivers loaded?
2020-06-16T18:46:43.662Z|00574|dpdk|ERR|EAL: Driver cannot attach the device (0000:04:00.5)
2020-06-16T18:46:43.662Z|00575|dpdk|ERR|EAL: Failed to attach device on primary process
2020-06-16T18:46:43.662Z|00576|netdev_dpdk|WARN|Error attaching device '0000:04:00.5' to DPDK
2020-06-16T18:46:43.662Z|00577|netdev|WARN|dpdk1: could not set configuration (Invalid argument)
2020-06-16T18:46:43.662Z|00578|dpdk|ERR|Invalid port_id=128
2020-06-16T18:46:43.707Z|00579|dpdk|INFO|EAL: PCI device 0000:04:02.5 on NUMA socket 0
2020-06-16T18:46:43.707Z|00580|dpdk|INFO|EAL: probe driver: 15b3:1018 net_mlx5
2020-06-16T18:46:43.709Z|00581|dpdk|WARN|net_mlx5: no Verbs device matches PCI device 0000:04:02.5, are kernel drivers loaded?
2020-06-16T18:46:43.709Z|00582|dpdk|ERR|EAL: Driver cannot attach the device (0000:04:02.5)
2020-06-16T18:46:43.709Z|00583|dpdk|ERR|EAL: Failed to attach device on primary process
2020-06-16T18:46:43.709Z|00584|netdev_dpdk|WARN|Error attaching device '0000:04:02.5' to DPDK
2020-06-16T18:46:43.709Z|00585|netdev|WARN|dpdk0: could not set configuration (Invalid argument)
2020-06-16T18:46:43.709Z|00586|dpdk|ERR|Invalid port_id=128

Closing as configuration was wrong, missing driver. It should be something like this:
- type: ovs_user_bridge
  name: br-link0
  use_dhcp: false
  ovs_extra:
    - str_replace:
        template: set port br-link0 tag=_VLAN_TAG_
        params:
          _VLAN_TAG_:
            get_param: TenantNetworkVlanID
  addresses:
    - ip_netmask:
        get_param: TenantIpSubnet
  members:
    - type: ovs_dpdk_bond
      name: dpdkbond0
      mtu: 9000
      rx_queue: 2
      members:
        - type: ovs_dpdk_port
          driver: mlx5_core
          name: dpdk0
          members:
            - type: sriov_vf
              device: nic12
              vfid: 3
        - type: ovs_dpdk_port
          driver: mlx5_core
          name: dpdk1
          members:
            - type: sriov_vf
              device: nic11
              vfid: 3
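
The difference between the failing template and the corrected one is the `driver: mlx5_core` key on each `ovs_dpdk_port`. A minimal sketch of a pre-deployment check for that omission follows. It assumes the network config has already been loaded into plain Python lists and dicts; `dpdk_ports_missing_driver` is a hypothetical helper written for this report, not part of os-net-config.

```python
def dpdk_ports_missing_driver(entries):
    """Return names of ovs_dpdk_port entries that wrap an sriov_vf
    member but do not set an explicit `driver` key.

    `entries` is a list of dicts shaped like the network config above.
    Hypothetical helper; os-net-config does not ship this function.
    """
    missing = []
    for entry in entries:
        members = entry.get("members", [])
        is_dpdk_port = entry.get("type") == "ovs_dpdk_port"
        wraps_vf = any(m.get("type") == "sriov_vf" for m in members)
        if is_dpdk_port and wraps_vf and "driver" not in entry:
            missing.append(entry.get("name"))
        # Recurse so ports nested under bridges and bonds are visited too.
        missing.extend(dpdk_ports_missing_driver(members))
    return missing


# Reduced copy of the bridge from this report: dpdk0 lacks the driver key,
# dpdk1 carries the fix.
config = [{
    "type": "ovs_user_bridge",
    "name": "br-link0",
    "members": [{
        "type": "ovs_dpdk_bond",
        "name": "dpdkbond0",
        "members": [
            {"type": "ovs_dpdk_port", "name": "dpdk0",
             "members": [{"type": "sriov_vf", "device": "nic12", "vfid": 3}]},
            {"type": "ovs_dpdk_port", "name": "dpdk1", "driver": "mlx5_core",
             "members": [{"type": "sriov_vf", "device": "nic11", "vfid": 3}]},
        ],
    }],
}]

print(dpdk_ports_missing_driver(config))
```

The check only reports a missing key. Whether `mlx5_core` is the right value depends on the card, so treat a hit as a prompt to review the template rather than as a verdict.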
Well documented, Miguel. I wrote this KCS based on your findings to get these symptoms and the solution into customers' hands: https://access.redhat.com/solutions/5165331
CFields
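
For anyone matching these symptoms against their own ovs-vswitchd log, the telltale line is the `net_mlx5: no Verbs device matches PCI device ...` warning. The sketch below pulls the affected PCI addresses out of a log excerpt. It assumes the `timestamp|sequence|module|LEVEL|message` layout seen in this report; `devices_missing_verbs` is a hypothetical name, not an OVS or DPDK tool.

```python
import re

# Matches the warning text exactly as it appears in the log excerpts above.
VERBS_RE = re.compile(
    r"net_mlx5: no Verbs device matches PCI device ([0-9a-fA-F:.]+),"
)


def devices_missing_verbs(log_text):
    """Return PCI addresses (in first-seen order, without duplicates) for
    which the dpdk module logged the 'no Verbs device matches' warning."""
    found = []
    for line in log_text.splitlines():
        # Assumed layout: timestamp|sequence|module|LEVEL|message
        parts = line.split("|", 4)
        if len(parts) != 5:
            continue
        _timestamp, _sequence, module, level, message = parts
        if module != "dpdk" or level != "WARN":
            continue
        match = VERBS_RE.search(message)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


sample = (
    "2020-06-16T18:46:13.800Z|00565|dpdk|WARN|net_mlx5: no Verbs device "
    "matches PCI device 0000:04:02.5, are kernel drivers loaded?\n"
    "2020-06-16T18:46:13.800Z|00566|dpdk|ERR|EAL: Driver cannot attach "
    "the device (0000:04:02.5)\n"
    "2020-06-16T18:46:43.662Z|00573|dpdk|WARN|net_mlx5: no Verbs device "
    "matches PCI device 0000:04:00.5, are kernel drivers loaded?\n"
)
print(devices_missing_verbs(sample))
```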
Description of problem:
os-net-config fails to configure Open vSwitch with VF ports belonging to a Mellanox network card.

Template used:

- type: sriov_pf
  name: nic11
  mtu: 9000
  numvfs: 14
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
  promisc: false
- type: sriov_pf
  name: nic12
  mtu: 9000
  numvfs: 14
  use_dhcp: false
  defroute: false
  nm_controlled: true
  hotplug: true
  promisc: false
- type: linux_bond
  name: storage_bond
  bonding_options: mode=active-backup
  use_dhcp: false
  members:
    - type: sriov_vf
      device: nic11
      vfid: 2
    - type: sriov_vf
      device: nic12
      vfid: 2
- type: ovs_user_bridge
  name: br-link0
  use_dhcp: false
  ovs_extra:
    - str_replace:
        template: set port br-link0 tag=_VLAN_TAG_
        params:
          _VLAN_TAG_:
            get_param: TenantNetworkVlanID
  addresses:
    - ip_netmask:
        get_param: TenantIpSubnet
  members:
    - type: ovs_dpdk_bond
      name: dpdkbond0
      mtu: 9000
      rx_queue: 2
      members:
        - type: ovs_dpdk_port
          name: dpdk0
          members:
            - type: sriov_vf
              device: nic12
              vfid: 3
        - type: ovs_dpdk_port
          name: dpdk1
          members:
            - type: sriov_vf
              device: nic11
              vfid: 3

nic11 and nic12 are:
nic11: p6p1
nic12: p6p2

On the compute node I can see the VFs:

[root@computeovsdpdksriov-1 log]# lspci | grep Mellanox
04:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
04:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
04:00.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:00.3 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:00.4 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:00.5 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:00.6 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:00.7 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.3 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.4 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.5 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.6 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:01.7 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.3 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.4 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.5 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.6 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:02.7 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.2 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.3 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.4 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.5 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.6 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
04:03.7 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]

[root@computeovsdpdksriov-1 log]# find /sys | grep p6p1 | awk -F '/' '{print $6,$8}' | sort | uniq
0000:04:00.0 p6p1
0000:04:00.2 p6p1_0
0000:04:00.3 p6p1_1
0000:04:00.4 p6p1_2
0000:04:00.6 p6p1_4
0000:04:00.7 p6p1_5
0000:04:01.0 p6p1_6
0000:04:01.1 p6p1_7
0000:04:01.2 p6p1_8
0000:04:01.3 p6p1_9
0000:04:01.4 p6p1_10
0000:04:01.5 p6p1_11
0000:04:01.6 p6p1_12
0000:04:01.7 p6p1_13
internal_bond storage_bond

[root@computeovsdpdksriov-1 log]# find /sys | grep p6p2 | awk -F '/' '{print $6,$8}' | sort | uniq
0000:04:00.1 p6p2
0000:04:02.2 p6p2_0
0000:04:02.3 p6p2_1
0000:04:02.4 p6p2_2
0000:04:02.6 p6p2_4
0000:04:02.7 p6p2_5
0000:04:03.0 p6p2_6
0000:04:03.1 p6p2_7
0000:04:03.2 p6p2_8
0000:04:03.3 p6p2_9
0000:04:03.4 p6p2_10
0000:04:03.5 p6p2_11
0000:04:03.6 p6p2_12
0000:04:03.7 p6p2_13
internal_bond storage_bond

Getting this warning with os-net-config:

[2020/06/16 02:30:19 PM] [INFO] Active nics are ['em1', 'em2', 'p4p1', 'p4p2', 'p6p1', 'p6p1_3', 'p6p2', 'p6p2_3', 'p7p3', 'p7p4']
[2020/06/16 02:30:19 PM] [WARNING] no mapping for interface p6p1_3 because nic6 is mapped to p4p4
[2020/06/16 02:30:19 PM] [WARNING] no mapping for interface p6p2_3 because nic8 is mapped to p7p2

OVS:

[root@computeovsdpdksriov-1 log]# ovs-vsctl show
22f91518-3484-4708-9844-0d72d34c28a9
    Bridge "br-link0"
        fail_mode: standalone
        Port "dpdkbond0"
            Interface "dpdk1"
                type: dpdk
                options: {dpdk-devargs="c", n_rxq="2"}
                error: "Error attaching device '0000:04:00.5' to DPDK"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:04:02.5", n_rxq="2"}
                error: "Error attaching device '0000:04:02.5' to DPDK"
        Port "br-link0"
            tag: 121
            Interface "br-link0"
                type: internal
    ovs_version: "2.11.0"

OVS is being configured with addresses 0000:04:00.5 and 0000:04:02.5 that do not exist, so it fails to attach to DPDK.

Version-Release number of selected component (if applicable):
2020-06-09.2(undercloud)

How reproducible:
Configure a bond in OVS using Mellanox VFs.

Actual results:
Deployment fails

Expected results:
Deployment should work

Additional info:
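
One way to confirm that a VF address exists and to see which kernel driver it is bound to is to read the `driver` symlink under `/sys/bus/pci/devices/<address>/`. The following is a minimal sketch of that lookup. `bound_driver` is a hypothetical helper name, and its `sysfs_root` parameter exists only so the function can be exercised against a fake tree.

```python
import os


def bound_driver(pci_address, sysfs_root="/sys"):
    """Return the kernel driver a PCI device is bound to, or None if the
    device is absent or has no driver bound.

    Reads the `driver` symlink in sysfs, whose target ends in the driver
    name (for example .../bus/pci/drivers/mlx5_core).
    """
    link = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_address, "driver"
    )
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # The two VF addresses that OVS failed to attach in this report.
    for address in ("0000:04:00.5", "0000:04:02.5"):
        print(address, bound_driver(address))
```

A result of `None` for an address that `lspci` lists suggests the VF is present on the bus but has no kernel driver bound to it.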