Bug 2036948
| Summary: | SR-IOV Network Device Plugin should handle offloaded VF instead of supporting only PF | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Emilien Macchi <emacchi> |
| Component: | Networking | Assignee: | Emilien Macchi <emacchi> |
| Networking sub component: | SR-IOV | QA Contact: | Ziv Greenberg <zgreenbe> |
| Status: | CLOSED ERRATA | Docs Contact: | Tomas 'Sheldon' Radej <tradej> |
| Severity: | high | | |
| Priority: | medium | CC: | aos-bugs, cgoncalves, gcheresh, juriarte, lmurthy, rlobillo, sscheink, zshi, zzhao |
| Version: | 4.10 | Keywords: | Reopened |
| Target Milestone: | --- | | |
| Target Release: | 4.11.0 | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 2077506 (view as bug list) | Environment: | |
| Last Closed: | 2022-08-10 10:41:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2077506 | | |
Description
Emilien Macchi
2022-01-04 13:52:53 UTC
The error:
```
I1229 07:25:21.103010 1 manager.go:200] validating resource name "openshift.io/hwoffload10"
I1229 07:25:21.103019 1 manager.go:200] validating resource name "openshift.io/hwoffload9"
I1229 07:25:21.103022 1 main.go:60] Discovering host devices
I1229 07:25:21.130290 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:00:03.0 02 Red Hat, Inc. Virtio network device
I1229 07:25:21.130322 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:00:05.0 02 Mellanox Technolo... MT27800 Family [ConnectX-5 Virtual Fu...
I1229 07:25:21.130478 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:00:06.0 02 Mellanox Technolo... MT27800 Family [ConnectX-5 Virtual Fu...
I1229 07:25:21.130566 1 main.go:66] Initializing resource servers
I1229 07:25:21.130576 1 manager.go:105] number of config: 2
I1229 07:25:21.130579 1 manager.go:109]
I1229 07:25:21.130581 1 manager.go:110] Creating new ResourcePool: hwoffload10
I1229 07:25:21.130583 1 manager.go:111] DeviceType: netDevice
I1229 07:25:21.130704 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:03.0. error getting devlink device attributes for net device 0000:00:03.0 no such device
W1229 07:25:21.130720 1 pciNetDevice.go:81] unable to get PF name "open /sys/bus/pci/devices/0000:00:03.0/net: no such file or directory"
I1229 07:25:21.130957 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:05.0. <nil>
I1229 07:25:21.131225 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:06.0. <nil>
I1229 07:25:21.131279 1 manager.go:125] no devices in device pool, skipping creating resource server for hwoffload10
I1229 07:25:21.131282 1 manager.go:109]
I1229 07:25:21.131284 1 manager.go:110] Creating new ResourcePool: hwoffload9
I1229 07:25:21.131286 1 manager.go:111] DeviceType: netDevice
I1229 07:25:21.131376 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:03.0. error getting devlink device attributes for net device 0000:00:03.0 no such device
W1229 07:25:21.131391 1 pciNetDevice.go:81] unable to get PF name "open /sys/bus/pci/devices/0000:00:03.0/net: no such file or directory"
I1229 07:25:21.131586 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:05.0. <nil>
I1229 07:25:21.131885 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:06.0. <nil>
I1229 07:25:21.131946 1 manager.go:125] no devices in device pool, skipping creating resource server for hwoffload9
I1229 07:25:21.131951 1 main.go:72] Starting all servers...
I1229 07:25:21.131953 1 main.go:77] All servers started.
I1229 07:25:21.131956 1 main.go:78] Listening for term signals
```
After a live debug session on the cluster, I was able to find the issue.
The sriov-config-daemon is not able to show the netFilter field in the SriovNetworkNodeState status section.
sriov-config-daemon logs:
I0105 12:47:28.201930 2842239 utils_virtual.go:165] DiscoverSriovDevicesVirtual
I0105 12:47:28.223805 2842239 virtual_plugin.go:52] virtual-plugin OnNodeStateAdd()
I0105 12:47:28.223891 2842239 daemon.go:479] nodeStateSyncHandler(): plugin virtual_plugin: reqDrain false, reqReboot false
I0105 12:47:28.223901 2842239 daemon.go:483] nodeStateSyncHandler(): reqDrain false, reqReboot false disableDrain false
I0105 12:47:28.223911 2842239 virtual_plugin.go:84] virtual-plugin Apply(): desiredState={5006414 []}
The problem is that when the NICs are passed with hardware offload, the interfaces are not present in the OpenStack metadata devices section:
```
{
"uuid": "6a75c23b-5efc-4a75-97b6-1484b18f99db",
"admin_pass": "dc7DcAzoMz8A",
"hostname": "ostest-kdjnt-worker-1",
"name": "ostest-kdjnt-worker-1",
"launch_index": 0,
"availability_zone": "nova",
"random_seed": "RmN4d7iCA+00JjSYz89yElzwaSF3Z4ERerlbVu0daj+kSTszAf1UvVszgIRKHlgs/BwVPls0Gz3pFod/VetpeEAsownYuZPlX9RLtpsea6ZGMEGqYGlCeanxOQh4UTQWSN7hkzrpB5AzYppI5p7yO2hPQ5R6scp+4J73SkCHSbHs2bgvBXLyuikBLoTsuE23S9ma3rmYsaOqe6glsIhRjoLD06vDwpXkPfoolgeIsFfM1VMwzxmtE5GmC+ZAEhAHteynN1+uxxZ4GQtFTnF8QpVMsHhYr/5F/KdC59mBTlzyFFgXPAbHRpjDj0Ydng1VxI3bDIPiLBx/N3a6R2qAifzeDwphQBKgvhMlbmlBCZnfizViEDtQ45k6S4LrFhC9m+WLt/P43iWNOOqD4UVi5gjYOHwJNWSNuni4mkwM5dh8PP5xdEYOq2XUL1oYo/VV/Stxgc9L8dweZdqzvBWCtcp7MFQzuV0gYb8Me6lWmLqrhORDMfhg4AjzAmk0Ui7ryjTGY+6EC6HwjbtLJm77GfuoQTw0UtVVH2SnciwdKiN9gb06Vb0wkyimyxO77Eey6DQL9io8pnceSA6vVsZKD6IJ1KNE3NDvGqBByIsfl5WkaHBwW8eAO7MrRj7YH84jrE6aIzrvSjeAg0t0/fbfUJmynoFGemRL4XBo8+HrNgc=",
"project_id": "e26bdb8747a94593a9b2ba5919439e57",
"devices": [
]
}
```
So the SR-IOV daemon is not able to connect the NIC PCI address, which should be under the meta_data.json devices section, with the network_id from the network_meta.json:
```
{
"links": [
{
"id": "tap8626696e-4c",
"vif_id": "8626696e-4cf4-47a7-9490-fe2f046923a6",
"type": "ovs",
"mtu": 1442,
"ethernet_mac_address": "fa:16:3e:d6:98:de"
},
{
"id": "tap3fbf098c-30",
"vif_id": "3fbf098c-302d-40c2-a273-3c266398557b",
"type": "ovs",
"mtu": 1500,
"ethernet_mac_address": "fa:16:3e:65:0f:62"
},
{
"id": "tap81855815-39",
"vif_id": "81855815-396c-4e0c-95bd-e48baf73cc69",
"type": "ovs",
"mtu": 1500,
"ethernet_mac_address": "fa:16:3e:f4:79:a2"
}
],
"networks": [
{
"id": "network0",
"type": "ipv4_dhcp",
"link": "tap8626696e-4c",
"network_id": "c0f4c670-9bf5-4867-8f31-1fed1ac137b6"
},
{
"id": "network1",
"type": "ipv4",
"link": "tap3fbf098c-30",
"ip_address": "10.10.9.147",
"netmask": "255.255.255.0",
"routes": [
{
"network": "0.0.0.0",
"netmask": "0.0.0.0",
"gateway": "10.10.9.254"
}
],
"network_id": "34591cf7-55e3-4229-8b00-17d3e0723e72",
"services": [
{
"type": "dns",
"address": "10.46.0.31"
},
{
"type": "dns",
"address": "8.8.8.8"
}
]
},
{
"id": "network2",
"type": "ipv4",
"link": "tap81855815-39",
"ip_address": "10.10.10.154",
"netmask": "255.255.255.0",
"routes": [
{
"network": "0.0.0.0",
"netmask": "0.0.0.0",
"gateway": "10.10.10.254"
}
],
"network_id": "9e19273b-f0b7-4401-92ed-e0c775472a7c",
"services": [
{
"type": "dns",
"address": "10.46.0.31"
},
{
"type": "dns",
"address": "8.8.8.8"
}
]
}
],
"services": [
{
"type": "dns",
"address": "10.46.0.31"
},
{
"type": "dns",
"address": "8.8.8.8"
}
]
}
```
We also checked a regular passthrough with Intel NICs, and the devices are present there.
One possible solution is to use only the network_meta.json file and connect (as sketched after this list):
1. the MAC addresses inside the host (VM) with the links section in the network_meta.json
2. the id in the links section with the link field in the networks section, to find the network_id
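To make the proposal above concrete, here is a minimal Go sketch of the matching, assuming the network metadata JSON (as shown earlier) has already been fetched from the config drive or the metadata service. The struct and function names are hypothetical; this is not the daemon's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// Subset of the OpenStack network metadata shown above; only the fields
// needed for the match are declared (struct names are illustrative).
type networkMetadata struct {
	Links []struct {
		ID  string `json:"id"`
		MAC string `json:"ethernet_mac_address"`
	} `json:"links"`
	Networks []struct {
		Link      string `json:"link"`
		NetworkID string `json:"network_id"`
	} `json:"networks"`
}

// networkIDByMAC implements the two-step match proposed above:
//  1. MAC address -> link id (links section)
//  2. link id -> network_id (networks section)
func networkIDByMAC(raw []byte) (map[string]string, error) {
	var meta networkMetadata
	if err := json.Unmarshal(raw, &meta); err != nil {
		return nil, err
	}
	netIDByLink := make(map[string]string, len(meta.Networks))
	for _, n := range meta.Networks {
		netIDByLink[n.Link] = n.NetworkID
	}
	byMAC := make(map[string]string, len(meta.Links))
	for _, l := range meta.Links {
		if netID, ok := netIDByLink[l.ID]; ok {
			byMAC[l.MAC] = netID
		}
	}
	return byMAC, nil
}

func main() {
	// raw would normally be read from the config drive or the metadata service;
	// the values below are taken from the network_meta.json shown above.
	raw := []byte(`{"links":[{"id":"tap81855815-39","ethernet_mac_address":"fa:16:3e:f4:79:a2"}],
	               "networks":[{"link":"tap81855815-39","network_id":"9e19273b-f0b7-4401-92ed-e0c775472a7c"}]}`)
	byMAC, err := networkIDByMAC(raw)
	if err != nil {
		panic(err)
	}
	// Each interface MAC inside the VM can now be resolved to a Neutron network_id.
	ifaces, _ := net.Interfaces()
	for _, iface := range ifaces {
		if netID, ok := byMAC[iface.HardwareAddr.String()]; ok {
			fmt.Printf("%s (%s) -> network_id %s\n", iface.Name, iface.HardwareAddr, netID)
		}
	}
	fmt.Println("full MAC -> network_id map:", byMAC)
}
```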
Update: I managed to make this deployment work.
Here is the configuration needed.
1. Disable the operator validation webhook. This is needed because the PF deviceID is actually a VF deviceID, so the webhook would not allow the policy creation:
```yaml
apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovOperatorConfig
  metadata:
    creationTimestamp: "2021-12-28T09:51:16Z"
    generation: 2
    name: default
    namespace: openshift-sriov-network-operator
    resourceVersion: "351755"
    uid: 69a6edc0-87d0-401c-8a71-3598a69c7c9d
  spec:
    enableInjector: true
    enableOperatorWebhook: false
    logLevel: 2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```
2. Configure the Mellanox VFs for DPDK applications using a SriovNetworkNodePolicy:
```yaml
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: "hwoffload10"
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  isRdma: true
  nicSelector:
    pfNames:
    - ens5
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: 'true'
  numVfs: 1
  priority: 99
  resourceName: "hwoffload10"
```
3. Configure a NetworkAttachmentDefinition using the host-device CNI. This is needed because we must pass the kernel NIC into the pod to support DPDK flow bifurcation (https://doc.dpdk.org/guides/howto/flow_bifurcation.html), and we can't use the sriov-cni because there is no PF on the host (see the sketch after this manifest):
```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/hwoffload10
  name: hwoffload10
  namespace: default
spec:
  config: '{ "cniVersion":"0.3.1", "name":"hwoffload10","type":"host-device","device":"ens5" }'
```
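As an illustration of why host-device is used here, below is a small Go sketch of the relevant sysfs checks: resolving an interface name through /sys/bus/pci/devices/<addr>/net (the path the device plugin logs above) and checking for the physfn link that would point at a parent PF. The assumption that physfn is absent for a VF passed through into a VM is ours, not stated in the bug, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// netdevNames returns the kernel interface name(s) exposed for a PCI device,
// read from /sys/bus/pci/devices/<addr>/net (the same path the device plugin
// fails on above for 0000:00:03.0).
func netdevNames(pciAddr string) ([]string, error) {
	entries, err := os.ReadDir(filepath.Join("/sys/bus/pci/devices", pciAddr, "net"))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

// hasVisiblePF reports whether sysfs exposes a parent PF for the device.
// Assumption: for a VF created on the local host the kernel adds a "physfn"
// symlink; for a VF passed through into a VM that link is absent, which is
// why sriov-cni (which needs the PF) cannot be used and host-device is used.
func hasVisiblePF(pciAddr string) bool {
	_, err := os.Lstat(filepath.Join("/sys/bus/pci/devices", pciAddr, "physfn"))
	return err == nil
}

func main() {
	// PCI addresses taken from the logs above.
	for _, addr := range []string{"0000:00:05.0", "0000:00:06.0"} {
		names, err := netdevNames(addr)
		fmt.Println(addr, "netdevs:", names, "err:", err, "PF visible:", hasVisiblePF(addr))
	}
}
```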
4. Start a pod that requests the network in its annotation and adds the required capabilities:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dpdk-testpmd
  namespace: default
  annotations:
    irq-load-balancing.crio.io: disable
    cpu-quota.crio.io: disable
    k8s.v1.cni.cncf.io/networks: '[
      {
        "name": "hwoffload10",
        "namespace": "default"
      }
    ]'
spec:
  restartPolicy: Never
  containers:
  - name: dpdk-testpmd
    image: <image>
    imagePullPolicy: Always
    securityContext:
      capabilities:
        add:
        - IPC_LOCK
        - SYS_RESOURCE
        - NET_RAW
    resources:
      limits:
        cpu: 3
        memory: "1000Mi"
        hugepages-1Gi: "7Gi"
      requests:
        cpu: 3
        memory: "1000Mi"
        hugepages-1Gi: "7Gi"
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
  nodeSelector:
    node-role.kubernetes.io/dpdk: ""
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
```
Then the DPDK application is able to start:
sh-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0@if4283: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1392 qdisc noqueue state UP group default
link/ether 0a:58:0a:80:03:1c brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.128.3.28/23 brd 10.128.3.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::68d6:aeff:fe91:37a6/64 scope link
valid_lft forever preferred_lft forever
4: net1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether fa:16:3e:f4:79:a2 brd ff:ff:ff:ff:ff:ff
inet6 fe80::f816:3eff:fef4:79a2/64 scope link
valid_lft forever preferred_lft forever
5: net2: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether fa:16:3e:65:0f:62 brd ff:ff:ff:ff:ff:ff
inet6 fe80::f816:3eff:fe65:f62/64 scope link
valid_lft forever preferred_lft forever
sh-4.4#
sh-4.4# cat /sys/fs/cgroup/cpuset/cpuset.cpu
cat: /sys/fs/cgroup/cpuset/cpuset.cpu: No such file or directory
sh-4.4# cat /sys/fs/cgroup/cpuset/cpuset.cpus
4-6
sh-4.4# testpmd -l 4,5,6 -n 4 --socket-mem 1024 --log-level="*:debug" -- -i --nb-cores=2 --auto-start --rxd=1024 --txd=1024 --rxq=1 --txq=1
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 0 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 0 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 0 on socket 0
EAL: Detected lcore 8 as core 0 on socket 0
EAL: Detected lcore 9 as core 0 on socket 0
EAL: Detected lcore 10 as core 0 on socket 0
EAL: Detected lcore 11 as core 0 on socket 0
EAL: Detected lcore 12 as core 0 on socket 0
EAL: Detected lcore 13 as core 0 on socket 0
EAL: Detected lcore 14 as core 0 on socket 0
EAL: Detected lcore 15 as core 0 on socket 0
EAL: Detected lcore 16 as core 0 on socket 0
EAL: Detected lcore 17 as core 0 on socket 0
EAL: Detected lcore 18 as core 0 on socket 0
EAL: Detected lcore 19 as core 0 on socket 0
EAL: Detected lcore 20 as core 0 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 21 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_bnxt.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_e1000.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_enic.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_failsafe.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_i40e.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_ixgbe.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_mlx4.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_mlx5.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_netvsc.so.20.0
EAL: Registered [vmbus] bus.
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_nfp.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_qede.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_ring.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_tap.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_vdev_netvsc.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_vhost.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_virtio.so.20.0
EAL: Ask a virtual area of 0x5000 bytes
EAL: Virtual area found at 0x100000000 (size = 0x5000)
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or directory)
EAL: VFIO PCI modules not loaded
EAL: Bus pci wants IOVA as 'DC'
EAL: Buses did not request a specific IOVA mode.
EAL: IOMMU is not available, selecting IOVA as PA mode.
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
EAL: VFIO modules not loaded, skipping VFIO support...
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0x100005000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Setting maximum number of open files to 1048576
EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
EAL: Creating 4 segment lists: n_segs:32 socket_id:0 hugepage_sz:1073741824
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x100033000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x140000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x940000000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x980000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x1180000000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x11c0000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x19c0000000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x1a00000000 (size = 0x800000000)
EAL: Allocating 1 pages of size 1024M on socket 0
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: Added 1024M to heap on socket 0
EAL: TSC frequency is ~2990000 KHz
EAL: Master lcore 4 is ready (tid=7f1b4b950900;cpuset=[4])
EAL: lcore 5 is ready (tid=7f1b42f31700;cpuset=[5])
EAL: lcore 6 is ready (tid=7f1b42730700;cpuset=[6])
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 1af4:1000 net_virtio
EAL: Not managed by a supported kernel driver, skipped
EAL: Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:00:05.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15b3:1018 net_mlx5
EAL: Mem event callback 'MLX5_MEM_EVENT_CB:(nil)' registered
net_mlx5: DV flow is not supported
EAL: PCI device 0000:00:06.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 15b3:1018 net_mlx5
net_mlx5: DV flow is not supported
EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
Interactive-mode selected
Auto-start selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: FA:16:3E:65:0F:62
Configuring Port 1 (socket 0)
Port 1: FA:16:3E:F4:79:A2
Checking link statuses...
Done
Start automatic packet forwarding
io packet forwarding - ports=2 - cores=2 - streams=2 - NUMA support enabled, MP allocation mode: native
Logical Core 5 (socket 0) forwards packets on 1 streams:
RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
Logical Core 6 (socket 0) forwards packets on 1 streams:
RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=2 - nb forwarding ports=2
port 0: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=1024 - RX free threshold=0
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=1024 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0
port 1: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=1024 - RX free threshold=0
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=1024 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0
testpmd>
testpmd> quit
Telling cores to stop...
Waiting for lcores to finish...
---------------------- Forward statistics for port 0 ----------------------
RX-packets: 0 RX-dropped: 0 RX-total: 0
TX-packets: 0 TX-dropped: 0 TX-total: 0
----------------------------------------------------------------------------
---------------------- Forward statistics for port 1 ----------------------
RX-packets: 0 RX-dropped: 0 RX-total: 0
TX-packets: 0 TX-dropped: 0 TX-total: 0
----------------------------------------------------------------------------
+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
RX-packets: 0 RX-dropped: 0 RX-total: 0
TX-packets: 0 TX-dropped: 0 TX-total: 0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Done.
Stopping port 0...
Stopping ports...
Done
Stopping port 1...
Stopping ports...
Done
Shutting down port 0...
Closing ports...
Done
Shutting down port 1...
Closing ports...
Done
Bye...
I'm re-opening and re-assigning to the documentation team. We need to add a section for OVS HW offload into https://docs.openshift.com/container-platform/4.9/installing/installing_openstack/installing-openstack-installer-sr-iov.html and explain what Sebastian described in the last comment.
Hi Ziv, assigning QA to you for checking the OpenStack part, thanks.
*** Bug 2061980 has been marked as a duplicate of this bug. ***
Hello,
I have verified that the "Net Filter" is working now in the 4.11 release.
(shiftstack) [cloud-user@installer-host ~]$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-05-18-053037 True False 14h Cluster version is 4.11.0-0.nightly-2022-05-18-053037
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc get csv -n openshift-sriov-network-operator
NAME DISPLAY VERSION REPLACES PHASE
sriov-network-operator.4.11.0-202205182037 SR-IOV Network Operator 4.11.0-202205182037 Succeeded
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc get sriovnetworknodepolicy -n openshift-sriov-network-operator -o yaml hwoffload9
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
creationTimestamp: "2022-05-19T12:41:31Z"
generation: 1
name: hwoffload9
namespace: openshift-sriov-network-operator
resourceVersion: "309316"
uid: 79d2057c-99b3-4370-8146-2d6e94d61f2c
spec:
deviceType: netdevice
isRdma: true
nicSelector:
netFilter: openstack/NetworkID:0373d387-43ba-4896-9f2f-7574c6284953
nodeSelector:
feature.node.kubernetes.io/network-sriov.capable: "true"
numVfs: 1
priority: 99
resourceName: hwoffload9
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc get net-attach-def -o yaml hwoffload9
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"k8s.cni.cncf.io/v1","kind":"NetworkAttachmentDefinition","metadata":{"annotations":{},"k8s.v1.cni.cncf.io/resourceName":"openshift.io/hwoffload9","name":"hwoffload9","namespace":"default"},"spec":{"config":"{\"cniVersion\":\"0.3.1\", \"name\":\"hwoffload9\",\"type\":\"host-device\",\"pciBusId\":\"0000:00:05.0\",\"ipam\":{}}"}}
creationTimestamp: "2022-05-19T12:55:26Z"
generation: 1
name: hwoffload9
namespace: default
resourceVersion: "314300"
uid: 6201e277-7c0f-464f-965e-9c1c56389afb
spec:
config: '{"cniVersion":"0.3.1", "name":"hwoffload9","type":"host-device","pciBusId":"0000:00:05.0","ipam":{}}'
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc describe SriovNetworkNodeState -n openshift-sriov-network-operator ostest-7pqj8-worker-1
Name: ostest-7pqj8-worker-1
Namespace: openshift-sriov-network-operator
Labels: <none>
Annotations: <none>
API Version: sriovnetwork.openshift.io/v1
Kind: SriovNetworkNodeState
Metadata:
Creation Timestamp: 2022-05-19T09:57:17Z
Generation: 3
Managed Fields:
API Version: sriovnetwork.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:ownerReferences:
.:
k:{"uid":"3026d570-279d-4fbb-8de3-419666cf9dbc"}:
f:spec:
.:
f:dpConfigVersion:
f:interfaces:
Manager: sriov-network-operator
Operation: Update
Time: 2022-05-19T12:41:32Z
API Version: sriovnetwork.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
f:interfaces:
f:syncStatus:
Manager: sriov-network-config-daemon
Operation: Update
Subresource: status
Time: 2022-05-19T12:59:18Z
Owner References:
API Version: sriovnetwork.openshift.io/v1
Block Owner Deletion: true
Controller: true
Kind: SriovNetworkNodePolicy
Name: default
UID: 3026d570-279d-4fbb-8de3-419666cf9dbc
Resource Version: 315641
UID: 405f2c71-669d-468a-a8e5-1b6a721dfe79
Spec:
Dp Config Version: 309366
Interfaces:
Name: ens6
Num Vfs: 1
Pci Address: 0000:00:06.0
Vf Groups:
Device Type: netdevice
Is Rdma: true
Policy Name: hwoffload10
Resource Name: hwoffload10
Vf Range: 0-0
Name: ens5
Num Vfs: 1
Pci Address: 0000:00:05.0
Vf Groups:
Device Type: netdevice
Is Rdma: true
Policy Name: hwoffload9
Resource Name: hwoffload9
Vf Range: 0-0
Status:
Interfaces:
Vfs:
Device ID: 1000
Driver: virtio-pci
Mac: fa:16:3e:f7:86:ae
Pci Address: 0000:00:03.0
Vendor: 1af4
Vf ID: 0
Device ID: 1000
Driver: virtio-pci
Link Speed: -1 Mb/s
Link Type: ETH
Mac: fa:16:3e:f7:86:ae
Name: ens3
Net Filter: openstack/NetworkID:6609c73b-0aad-426b-ac48-3e886c162693
Num Vfs: 1
Pci Address: 0000:00:03.0
Totalvfs: 1
Vendor: 1af4
Vfs:
Device ID: 1018
Driver: mlx5_core
Pci Address: 0000:00:05.0
Vendor: 15b3
Vf ID: 0
Device ID: 1018
Driver: mlx5_core
Net Filter: openstack/NetworkID:0373d387-43ba-4896-9f2f-7574c6284953
Num Vfs: 1
Pci Address: 0000:00:05.0
Totalvfs: 1
Vendor: 15b3
Vfs:
Device ID: 1018
Driver: mlx5_core
Pci Address: 0000:00:06.0
Vendor: 15b3
Vf ID: 0
Device ID: 1018
Driver: mlx5_core
Net Filter: openstack/NetworkID:6b295bdc-7e75-4250-8288-71ac2a7aa71d
Num Vfs: 1
Pci Address: 0000:00:06.0
Totalvfs: 1
Vendor: 15b3
Sync Status: Succeeded
Events: <none>
(shiftstack) [cloud-user@installer-host ~]$ oc logs pods/hwoffload-testpmd
EAL: Detected 21 lcore(s)
EAL: Detected 1 NUMA nodes
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: No free hugepages reported in hugepages-2048kB
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: Invalid NUMA socket, default to 0
EAL: Probe PCI driver: mlx5_pci (15b3:1018) device: 0000:00:05.0 (socket 0)
mlx5_pci: No available register for Sampler.
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
EAL: Invalid NUMA socket, default to 0
EAL: Probe PCI driver: mlx5_pci (15b3:1018) device: 0000:00:06.0 (socket 0)
mlx5_pci: No available register for Sampler.
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
EAL: No legacy callbacks, legacy socket not created
Auto-start selected
testpmd: create a new mbuf pool <mb_pool_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: FA:16:3E:34:70:AA
Configuring Port 1 (socket 0)
Port 1: FA:16:3E:5B:B7:B6
Checking link statuses...
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
Logical Core 5 (socket 0) forwards packets on 2 streams:
RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=1 - nb forwarding ports=2
port 0: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=256 - RX free threshold=64
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=256 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0
port 1: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=256 - RX free threshold=64
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=256 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0
Port statistics ====================================
######################## NIC statistics for port 0 ########################
RX-packets: 0 RX-missed: 0 RX-bytes: 0
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 0 Tx-bps: 0
############################################################################
######################## NIC statistics for port 1 ########################
RX-packets: 0 RX-missed: 0 RX-bytes: 0
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 0 Tx-bps: 0
############################################################################
(shiftstack) [cloud-user@installer-host ~]$
Thanks,
Ziv
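For reference, the netFilter selector verified above has the form openstack/NetworkID:<uuid>. Below is a minimal, hypothetical Go sketch of how such a value can be parsed and compared against a network ID discovered from the instance metadata; it is not the operator's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesNetFilter reports whether a policy netFilter of the form
// "openstack/NetworkID:<uuid>" selects a device whose Neutron network ID
// was discovered from the instance metadata. Hypothetical helper, shown
// only to make the selector format explicit.
func matchesNetFilter(netFilter, discoveredNetworkID string) bool {
	const prefix = "openstack/NetworkID:"
	if !strings.HasPrefix(netFilter, prefix) {
		return false
	}
	return strings.TrimPrefix(netFilter, prefix) == discoveredNetworkID
}

func main() {
	// Values taken from the verification output above (policy hwoffload9).
	fmt.Println(matchesNetFilter(
		"openstack/NetworkID:0373d387-43ba-4896-9f2f-7574c6284953",
		"0373d387-43ba-4896-9f2f-7574c6284953")) // true
}
```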
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:5069