Bug 2036948 - SR-IOV Network Device Plugin should handle offloaded VF instead of supporting only PF
Summary: SR-IOV Network Device Plugin should handle offloaded VF instead of supporting only PF
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: All
OS: All
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Emilien Macchi
QA Contact: Ziv Greenberg
Docs Contact: Tomas 'Sheldon' Radej
URL:
Whiteboard:
Duplicates: 2061980
Depends On:
Blocks: 2077506
 
Reported: 2022-01-04 13:52 UTC by Emilien Macchi
Modified: 2022-08-10 10:41 UTC
CC List: 9 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2077506 (view as bug list)
Environment:
Last Closed: 2022-08-10 10:41:05 UTC
Target Upstream Version:
Embargoed:




Links
Github openshift sriov-network-operator pull 655 (open): Bug 2036948: improve the virtual plugin support (last updated 2022-04-21 13:10:30 UTC)
Red Hat Product Errata RHSA-2022:5069 (last updated 2022-08-10 10:41:27 UTC)

Description Emilien Macchi 2022-01-04 13:52:53 UTC
Description of problem:

When doing OVS hardware offload in OpenStack, we attach a VF to the OpenShift workers (not a PF), but the SR-IOV Network Device Plugin can only fetch eswitch attributes when the device is a PF:
https://github.com/openshift/sriov-network-device-plugin/blob/c5576c32c8e5319bdc760419fb09bda8aa552b7f/pkg/utils/netlink_provider.go#L51

So the netlink query fails to return the information, and we can't use OVS HW offload with OpenStack right now.
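For illustration, here is a minimal Go sketch (not the plugin's actual code) of how a device handler could detect that a PCI address belongs to a VF and skip the PF-only devlink eswitch query; the sysfs check and the helper name are assumptions made for the example:

```
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isVirtualFunction reports whether the PCI device at pciAddr is an SR-IOV VF.
// A VF exposes a "physfn" symlink pointing at its parent PF in sysfs; a PF (or
// a plain virtio device) does not. Illustrative helper, not plugin code.
func isVirtualFunction(pciAddr string) bool {
	_, err := os.Lstat(filepath.Join("/sys/bus/pci/devices", pciAddr, "physfn"))
	return err == nil
}

func main() {
	for _, addr := range []string{"0000:00:03.0", "0000:00:05.0", "0000:00:06.0"} {
		if isVirtualFunction(addr) {
			// Offloaded VF passed into the VM: don't require the PF-only
			// eswitch mode query, treat the device as usable directly.
			fmt.Printf("%s: VF, skipping eswitch mode query\n", addr)
			continue
		}
		fmt.Printf("%s: PF or non-SR-IOV device, querying devlink eswitch mode\n", addr)
	}
}
```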

Comment 1 Emilien Macchi 2022-01-04 13:59:25 UTC
The error:

I1229 07:25:21.103010       1 manager.go:200] validating resource name "openshift.io/hwoffload10"
I1229 07:25:21.103019       1 manager.go:200] validating resource name "openshift.io/hwoffload9"
I1229 07:25:21.103022       1 main.go:60] Discovering host devices
I1229 07:25:21.130290       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:00:03.0	02          	Red Hat, Inc.       	Virtio network device
I1229 07:25:21.130322       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:00:05.0	02          	Mellanox Technolo...	MT27800 Family [ConnectX-5 Virtual Fu...
I1229 07:25:21.130478       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:00:06.0	02          	Mellanox Technolo...	MT27800 Family [ConnectX-5 Virtual Fu...
I1229 07:25:21.130566       1 main.go:66] Initializing resource servers
I1229 07:25:21.130576       1 manager.go:105] number of config: 2
I1229 07:25:21.130579       1 manager.go:109]
I1229 07:25:21.130581       1 manager.go:110] Creating new ResourcePool: hwoffload10
I1229 07:25:21.130583       1 manager.go:111] DeviceType: netDevice
I1229 07:25:21.130704       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:03.0. error getting devlink device attributes for net device 0000:00:03.0 no such device
W1229 07:25:21.130720       1 pciNetDevice.go:81] unable to get PF name "open /sys/bus/pci/devices/0000:00:03.0/net: no such file or directory"
I1229 07:25:21.130957       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:05.0. <nil>
I1229 07:25:21.131225       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:06.0. <nil>
I1229 07:25:21.131279       1 manager.go:125] no devices in device pool, skipping creating resource server for hwoffload10
I1229 07:25:21.131282       1 manager.go:109]
I1229 07:25:21.131284       1 manager.go:110] Creating new ResourcePool: hwoffload9
I1229 07:25:21.131286       1 manager.go:111] DeviceType: netDevice
I1229 07:25:21.131376       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:03.0. error getting devlink device attributes for net device 0000:00:03.0 no such device
W1229 07:25:21.131391       1 pciNetDevice.go:81] unable to get PF name "open /sys/bus/pci/devices/0000:00:03.0/net: no such file or directory"
I1229 07:25:21.131586       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:05.0. <nil>
I1229 07:25:21.131885       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:00:06.0. <nil>
I1229 07:25:21.131946       1 manager.go:125] no devices in device pool, skipping creating resource server for hwoffload9
I1229 07:25:21.131951       1 main.go:72] Starting all servers...
I1229 07:25:21.131953       1 main.go:77] All servers started.
I1229 07:25:21.131956       1 main.go:78] Listening for term signals

Comment 2 Sebastian Scheinkman 2022-01-05 13:31:38 UTC
After a live debug session on the cluster, I was able to find the issue.

The sriov-config-daemon is not able to populate the netFilter field in the SriovNetworkNodeState status section.

sriov-config-daemon logs:
I0105 12:47:28.201930 2842239 utils_virtual.go:165] DiscoverSriovDevicesVirtual
I0105 12:47:28.223805 2842239 virtual_plugin.go:52] virtual-plugin OnNodeStateAdd()
I0105 12:47:28.223891 2842239 daemon.go:479] nodeStateSyncHandler(): plugin virtual_plugin: reqDrain false, reqReboot false
I0105 12:47:28.223901 2842239 daemon.go:483] nodeStateSyncHandler(): reqDrain false, reqReboot false disableDrain false
I0105 12:47:28.223911 2842239 virtual_plugin.go:84] virtual-plugin Apply(): desiredState={5006414 []}


The problem is that when the NICs are passed through with hardware offload, the interfaces are not listed in the devices section of the OpenStack metadata:

```
{
  "uuid": "6a75c23b-5efc-4a75-97b6-1484b18f99db",
  "admin_pass": "dc7DcAzoMz8A",
  "hostname": "ostest-kdjnt-worker-1",
  "name": "ostest-kdjnt-worker-1",
  "launch_index": 0,
  "availability_zone": "nova",
  "random_seed": "RmN4d7iCA+00JjSYz89yElzwaSF3Z4ERerlbVu0daj+kSTszAf1UvVszgIRKHlgs/BwVPls0Gz3pFod/VetpeEAsownYuZPlX9RLtpsea6ZGMEGqYGlCeanxOQh4UTQWSN7hkzrpB5AzYppI5p7yO2hPQ5R6scp+4J73SkCHSbHs2bgvBXLyuikBLoTsuE23S9ma3rmYsaOqe6glsIhRjoLD06vDwpXkPfoolgeIsFfM1VMwzxmtE5GmC+ZAEhAHteynN1+uxxZ4GQtFTnF8QpVMsHhYr/5F/KdC59mBTlzyFFgXPAbHRpjDj0Ydng1VxI3bDIPiLBx/N3a6R2qAifzeDwphQBKgvhMlbmlBCZnfizViEDtQ45k6S4LrFhC9m+WLt/P43iWNOOqD4UVi5gjYOHwJNWSNuni4mkwM5dh8PP5xdEYOq2XUL1oYo/VV/Stxgc9L8dweZdqzvBWCtcp7MFQzuV0gYb8Me6lWmLqrhORDMfhg4AjzAmk0Ui7ryjTGY+6EC6HwjbtLJm77GfuoQTw0UtVVH2SnciwdKiN9gb06Vb0wkyimyxO77Eey6DQL9io8pnceSA6vVsZKD6IJ1KNE3NDvGqBByIsfl5WkaHBwW8eAO7MrRj7YH84jrE6aIzrvSjeAg0t0/fbfUJmynoFGemRL4XBo8+HrNgc=",
  "project_id": "e26bdb8747a94593a9b2ba5919439e57",
  "devices": [
    
  ]
}
```

So the SR-IOV daemon is not able to match the NIC PCI address, which should appear under the devices section of meta_data.json, with the network_id from network_meta.json:

```
{
  "links": [
    {
      "id": "tap8626696e-4c",
      "vif_id": "8626696e-4cf4-47a7-9490-fe2f046923a6",
      "type": "ovs",
      "mtu": 1442,
      "ethernet_mac_address": "fa:16:3e:d6:98:de"
    },
    {
      "id": "tap3fbf098c-30",
      "vif_id": "3fbf098c-302d-40c2-a273-3c266398557b",
      "type": "ovs",
      "mtu": 1500,
      "ethernet_mac_address": "fa:16:3e:65:0f:62"
    },
    {
      "id": "tap81855815-39",
      "vif_id": "81855815-396c-4e0c-95bd-e48baf73cc69",
      "type": "ovs",
      "mtu": 1500,
      "ethernet_mac_address": "fa:16:3e:f4:79:a2"
    }
  ],
  "networks": [
    {
      "id": "network0",
      "type": "ipv4_dhcp",
      "link": "tap8626696e-4c",
      "network_id": "c0f4c670-9bf5-4867-8f31-1fed1ac137b6"
    },
    {
      "id": "network1",
      "type": "ipv4",
      "link": "tap3fbf098c-30",
      "ip_address": "10.10.9.147",
      "netmask": "255.255.255.0",
      "routes": [
        {
          "network": "0.0.0.0",
          "netmask": "0.0.0.0",
          "gateway": "10.10.9.254"
        }
      ],
      "network_id": "34591cf7-55e3-4229-8b00-17d3e0723e72",
      "services": [
        {
          "type": "dns",
          "address": "10.46.0.31"
        },
        {
          "type": "dns",
          "address": "8.8.8.8"
        }
      ]
    },
    {
      "id": "network2",
      "type": "ipv4",
      "link": "tap81855815-39",
      "ip_address": "10.10.10.154",
      "netmask": "255.255.255.0",
      "routes": [
        {
          "network": "0.0.0.0",
          "netmask": "0.0.0.0",
          "gateway": "10.10.10.254"
        }
      ],
      "network_id": "9e19273b-f0b7-4401-92ed-e0c775472a7c",
      "services": [
        {
          "type": "dns",
          "address": "10.46.0.31"
        },
        {
          "type": "dns",
          "address": "8.8.8.8"
        }
      ]
    }
  ],
  "services": [
    {
      "type": "dns",
      "address": "10.46.0.31"
    },
    {
      "type": "dns",
      "address": "8.8.8.8"
    }
  ]
}
```

We also checked a regular passthrough with Intel NICs, and the devices are present there.


One possible solution is to use only the network_meta.json file and correlate:
1. the MAC addresses inside the host (VM) with the links section of network_meta.json
2. the id in the links section with the link field in the networks section, to find the network_id
A sketch of this correlation follows below.
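A minimal Go sketch of that correlation, assuming the network_meta.json layout shown above; the struct shapes, the function name, and the file path are illustrative and are not the operator's actual code:

```
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// Minimal structures matching the metadata shown above.
type networkMeta struct {
	Links []struct {
		ID  string `json:"id"`
		MAC string `json:"ethernet_mac_address"`
	} `json:"links"`
	Networks []struct {
		Link      string `json:"link"`
		NetworkID string `json:"network_id"`
	} `json:"networks"`
}

// networkIDForMAC maps an interface MAC address found on the VM to the
// Neutron network_id, going MAC -> links[].id -> networks[].network_id.
func networkIDForMAC(meta networkMeta, mac string) (string, bool) {
	var linkID string
	for _, l := range meta.Links {
		if strings.EqualFold(l.MAC, mac) {
			linkID = l.ID
			break
		}
	}
	if linkID == "" {
		return "", false
	}
	for _, n := range meta.Networks {
		if n.Link == linkID {
			return n.NetworkID, true
		}
	}
	return "", false
}

func main() {
	data, err := os.ReadFile("network_meta.json") // path is an assumption
	if err != nil {
		panic(err)
	}
	var meta networkMeta
	if err := json.Unmarshal(data, &meta); err != nil {
		panic(err)
	}
	if id, ok := networkIDForMAC(meta, "fa:16:3e:f4:79:a2"); ok {
		fmt.Println("network_id:", id)
	}
}
```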

Comment 4 Sebastian Scheinkman 2022-01-06 14:37:30 UTC
Update: I managed to make this deployment work.

Here is the configuration needed.

1. Disable the operator validation webhook - this is needed because the PF deviceID is actually a VF deviceID, so the webhook would otherwise reject the policy creation:

apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovOperatorConfig
  metadata:
    creationTimestamp: "2021-12-28T09:51:16Z"
    generation: 2
    name: default
    namespace: openshift-sriov-network-operator
    resourceVersion: "351755"
    uid: 69a6edc0-87d0-401c-8a71-3598a69c7c9d
  spec:
    enableInjector: true
    enableOperatorWebhook: false
    logLevel: 2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

2. Configure the Mellanox VFs for DPDK applications using a SriovNetworkNodePolicy:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: "hwoffload10"
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  isRdma: true
  nicSelector:
    pfNames:
    - ens5
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: 'true'
  numVfs: 1
  priority: 99
  resourceName: "hwoffload10"

3. Configure a NetworkAttachmentDefinition using the host-device CNI - this is needed because we must pass the kernel NIC through to the pod to support DPDK flow bifurcation (https://doc.dpdk.org/guides/howto/flow_bifurcation.html), and we can't use the sriov-cni because there is no PF on the host:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/hwoffload10
  name: hwoffload10
  namespace: default
spec:
    config: '{ "cniVersion":"0.3.1", "name":"hwoffload10","type":"host-device","device":"ens5"
    }'


4. Start a pod that requests the network in its annotations and adds the required capabilities:

apiVersion: v1
kind: Pod
metadata:
  name: dpdk-testpmd
  namespace: default
  annotations:
    irq-load-balancing.crio.io: disable
    cpu-quota.crio.io: disable
    k8s.v1.cni.cncf.io/networks: '[
      {
       "name": "hwoffload10",
       "namespace": "default"
      }
    ]'
spec:
  restartPolicy: Never
  containers:
  - name: dpdk-testpmd
    image: <image>
    imagePullPolicy: Always
    securityContext:
      capabilities:
        add:
        - IPC_LOCK
        - SYS_RESOURCE
        - NET_RAW    
    resources:
      limits:
        cpu: 3
        memory: "1000Mi"
        hugepages-1Gi: "7Gi"
      requests:
        cpu: 3
        memory: "1000Mi"
        hugepages-1Gi: "7Gi"
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
  nodeSelector:
    node-role.kubernetes.io/dpdk: ""
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages


The DPDK application is then able to start:

sh-4.4# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if4283: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1392 qdisc noqueue state UP group default 
    link/ether 0a:58:0a:80:03:1c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.128.3.28/23 brd 10.128.3.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::68d6:aeff:fe91:37a6/64 scope link 
       valid_lft forever preferred_lft forever
4: net1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether fa:16:3e:f4:79:a2 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f816:3eff:fef4:79a2/64 scope link 
       valid_lft forever preferred_lft forever
5: net2: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether fa:16:3e:65:0f:62 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f816:3eff:fe65:f62/64 scope link 
       valid_lft forever preferred_lft forever
sh-4.4# 
sh-4.4# cat /sys/fs/cgroup/cpuset/cpuset.cpu
cat: /sys/fs/cgroup/cpuset/cpuset.cpu: No such file or directory
sh-4.4# cat /sys/fs/cgroup/cpuset/cpuset.cpus 
4-6
sh-4.4# testpmd -l 4,5,6 -n 4 --socket-mem 1024 --log-level="*:debug" -- -i --nb-cores=2 --auto-start --rxd=1024 --txd=1024 --rxq=1 --txq=1
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 0 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 0 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 0 on socket 0
EAL: Detected lcore 8 as core 0 on socket 0
EAL: Detected lcore 9 as core 0 on socket 0
EAL: Detected lcore 10 as core 0 on socket 0
EAL: Detected lcore 11 as core 0 on socket 0
EAL: Detected lcore 12 as core 0 on socket 0
EAL: Detected lcore 13 as core 0 on socket 0
EAL: Detected lcore 14 as core 0 on socket 0
EAL: Detected lcore 15 as core 0 on socket 0
EAL: Detected lcore 16 as core 0 on socket 0
EAL: Detected lcore 17 as core 0 on socket 0
EAL: Detected lcore 18 as core 0 on socket 0
EAL: Detected lcore 19 as core 0 on socket 0
EAL: Detected lcore 20 as core 0 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 21 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_bnxt.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_e1000.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_enic.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_failsafe.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_i40e.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_ixgbe.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_mlx4.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_mlx5.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_netvsc.so.20.0
EAL: Registered [vmbus] bus.
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_nfp.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_qede.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_ring.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_tap.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_vdev_netvsc.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_vhost.so.20.0
EAL: open shared lib /usr/lib64/dpdk-pmds/librte_pmd_virtio.so.20.0
EAL: Ask a virtual area of 0x5000 bytes
EAL: Virtual area found at 0x100000000 (size = 0x5000)
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or directory)
EAL: VFIO PCI modules not loaded
EAL: Bus pci wants IOVA as 'DC'
EAL: Buses did not request a specific IOVA mode.
EAL: IOMMU is not available, selecting IOVA as PA mode.
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
EAL: VFIO modules not loaded, skipping VFIO support...
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0x100005000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Setting maximum number of open files to 1048576
EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
EAL: Creating 4 segment lists: n_segs:32 socket_id:0 hugepage_sz:1073741824
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x100033000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x140000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x940000000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x980000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x1180000000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x11c0000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0x19c0000000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0x1a00000000 (size = 0x800000000)
EAL: Allocating 1 pages of size 1024M on socket 0
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: Restoring previous memory policy: 0
EAL: Added 1024M to heap on socket 0
EAL: TSC frequency is ~2990000 KHz
EAL: Master lcore 4 is ready (tid=7f1b4b950900;cpuset=[4])
EAL: lcore 5 is ready (tid=7f1b42f31700;cpuset=[5])
EAL: lcore 6 is ready (tid=7f1b42730700;cpuset=[6])
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 1af4:1000 net_virtio
EAL:   Not managed by a supported kernel driver, skipped
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device 0000:00:05.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15b3:1018 net_mlx5
EAL: Mem event callback 'MLX5_MEM_EVENT_CB:(nil)' registered
net_mlx5: DV flow is not supported
EAL: PCI device 0000:00:06.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15b3:1018 net_mlx5
net_mlx5: DV flow is not supported
EAL: Module /sys/module/vfio not found! error 2 (No such file or directory)
Interactive-mode selected
Auto-start selected
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: FA:16:3E:65:0F:62
Configuring Port 1 (socket 0)
Port 1: FA:16:3E:F4:79:A2
Checking link statuses...
Done
Start automatic packet forwarding
io packet forwarding - ports=2 - cores=2 - streams=2 - NUMA support enabled, MP allocation mode: native
Logical Core 5 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
Logical Core 6 (socket 0) forwards packets on 1 streams:
  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

  io packet forwarding packets/burst=32
  nb forwarding cores=2 - nb forwarding ports=2
  port 0: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=1024 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=1024 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
  port 1: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=1024 - RX free threshold=0
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=1024 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
testpmd> 
testpmd> quit
Telling cores to stop...
Waiting for lcores to finish...

  ---------------------- Forward statistics for port 0  ----------------------
  RX-packets: 0              RX-dropped: 0             RX-total: 0
  TX-packets: 0              TX-dropped: 0             TX-total: 0
  ----------------------------------------------------------------------------

  ---------------------- Forward statistics for port 1  ----------------------
  RX-packets: 0              RX-dropped: 0             RX-total: 0
  TX-packets: 0              TX-dropped: 0             TX-total: 0
  ----------------------------------------------------------------------------

  +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
  RX-packets: 0              RX-dropped: 0             RX-total: 0
  TX-packets: 0              TX-dropped: 0             TX-total: 0
  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Done.

Stopping port 0...
Stopping ports...
Done

Stopping port 1...
Stopping ports...
Done

Shutting down port 0...
Closing ports...
Done

Shutting down port 1...
Closing ports...
Done

Bye...

Comment 5 Emilien Macchi 2022-01-06 16:01:15 UTC
I'm re-opening and re-assigning to the documentation team. We need to add a section for OVS HW offload into https://docs.openshift.com/container-platform/4.9/installing/installing_openstack/installing-openstack-installer-sr-iov.html and explain what Sebastian described in the last comment.

Comment 6 zhaozhanqi 2022-01-07 08:25:44 UTC
Hi Ziv, assigning QA to you for checking the OpenStack part, thanks.

Comment 11 zhaozhanqi 2022-04-24 01:30:50 UTC
Assigning QA to Ziv for checking the OpenStack part, thanks.

Comment 12 zenghui.shi 2022-05-09 14:22:44 UTC
*** Bug 2061980 has been marked as a duplicate of this bug. ***

Comment 13 Ziv Greenberg 2022-05-19 13:59:08 UTC
Hello,

I have verified that the "Net Filter" field is now working in the 4.11 release.


(shiftstack) [cloud-user@installer-host ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-05-18-053037   True        False         14h     Cluster version is 4.11.0-0.nightly-2022-05-18-053037
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc get csv -n openshift-sriov-network-operator
NAME                                         DISPLAY                   VERSION               REPLACES   PHASE
sriov-network-operator.4.11.0-202205182037   SR-IOV Network Operator   4.11.0-202205182037              Succeeded
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc get sriovnetworknodepolicy -n openshift-sriov-network-operator -o yaml hwoffload9
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  creationTimestamp: "2022-05-19T12:41:31Z"
  generation: 1
  name: hwoffload9
  namespace: openshift-sriov-network-operator
  resourceVersion: "309316"
  uid: 79d2057c-99b3-4370-8146-2d6e94d61f2c
spec:
  deviceType: netdevice
  isRdma: true
  nicSelector:
    netFilter: openstack/NetworkID:0373d387-43ba-4896-9f2f-7574c6284953
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 1
  priority: 99
  resourceName: hwoffload9
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc get net-attach-def -o yaml hwoffload9
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"k8s.cni.cncf.io/v1","kind":"NetworkAttachmentDefinition","metadata":{"annotations":{},"k8s.v1.cni.cncf.io/resourceName":"openshift.io/hwoffload9","name":"hwoffload9","namespace":"default"},"spec":{"config":"{\"cniVersion\":\"0.3.1\", \"name\":\"hwoffload9\",\"type\":\"host-device\",\"pciBusId\":\"0000:00:05.0\",\"ipam\":{}}"}}
  creationTimestamp: "2022-05-19T12:55:26Z"
  generation: 1
  name: hwoffload9
  namespace: default
  resourceVersion: "314300"
  uid: 6201e277-7c0f-464f-965e-9c1c56389afb
spec:
  config: '{"cniVersion":"0.3.1", "name":"hwoffload9","type":"host-device","pciBusId":"0000:00:05.0","ipam":{}}'
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$
(shiftstack) [cloud-user@installer-host ~]$ oc describe SriovNetworkNodeState -n openshift-sriov-network-operator ostest-7pqj8-worker-1
Name:         ostest-7pqj8-worker-1
Namespace:    openshift-sriov-network-operator
Labels:       <none>
Annotations:  <none>
API Version:  sriovnetwork.openshift.io/v1
Kind:         SriovNetworkNodeState
Metadata:
  Creation Timestamp:  2022-05-19T09:57:17Z
  Generation:          3
  Managed Fields:
    API Version:  sriovnetwork.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .:
          k:{"uid":"3026d570-279d-4fbb-8de3-419666cf9dbc"}:
      f:spec:
        .:
        f:dpConfigVersion:
        f:interfaces:
    Manager:      sriov-network-operator
    Operation:    Update
    Time:         2022-05-19T12:41:32Z
    API Version:  sriovnetwork.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:interfaces:
        f:syncStatus:
    Manager:      sriov-network-config-daemon
    Operation:    Update
    Subresource:  status
    Time:         2022-05-19T12:59:18Z
  Owner References:
    API Version:           sriovnetwork.openshift.io/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  SriovNetworkNodePolicy
    Name:                  default
    UID:                   3026d570-279d-4fbb-8de3-419666cf9dbc
  Resource Version:        315641
  UID:                     405f2c71-669d-468a-a8e5-1b6a721dfe79
Spec:
  Dp Config Version:  309366
  Interfaces:
    Name:         ens6
    Num Vfs:      1
    Pci Address:  0000:00:06.0
    Vf Groups:
      Device Type:    netdevice
      Is Rdma:        true
      Policy Name:    hwoffload10
      Resource Name:  hwoffload10
      Vf Range:       0-0
    Name:             ens5
    Num Vfs:          1
    Pci Address:      0000:00:05.0
    Vf Groups:
      Device Type:    netdevice
      Is Rdma:        true
      Policy Name:    hwoffload9
      Resource Name:  hwoffload9
      Vf Range:       0-0
Status:
  Interfaces:
    Vfs:
      Device ID:    1000
      Driver:       virtio-pci
      Mac:          fa:16:3e:f7:86:ae
      Pci Address:  0000:00:03.0
      Vendor:       1af4
      Vf ID:        0
    Device ID:      1000
    Driver:         virtio-pci
    Link Speed:     -1 Mb/s
    Link Type:      ETH
    Mac:            fa:16:3e:f7:86:ae
    Name:           ens3
    Net Filter:     openstack/NetworkID:6609c73b-0aad-426b-ac48-3e886c162693
    Num Vfs:        1
    Pci Address:    0000:00:03.0
    Totalvfs:       1
    Vendor:         1af4
    Vfs:
      Device ID:    1018
      Driver:       mlx5_core
      Pci Address:  0000:00:05.0
      Vendor:       15b3
      Vf ID:        0
    Device ID:      1018
    Driver:         mlx5_core
    Net Filter:     openstack/NetworkID:0373d387-43ba-4896-9f2f-7574c6284953
    Num Vfs:        1
    Pci Address:    0000:00:05.0
    Totalvfs:       1
    Vendor:         15b3
    Vfs:
      Device ID:    1018
      Driver:       mlx5_core
      Pci Address:  0000:00:06.0
      Vendor:       15b3
      Vf ID:        0
    Device ID:      1018
    Driver:         mlx5_core
    Net Filter:     openstack/NetworkID:6b295bdc-7e75-4250-8288-71ac2a7aa71d
    Num Vfs:        1
    Pci Address:    0000:00:06.0
    Totalvfs:       1
    Vendor:         15b3
  Sync Status:      Succeeded
Events:             <none>


(shiftstack) [cloud-user@installer-host ~]$ oc logs pods/hwoffload-testpmd
EAL: Detected 21 lcore(s)
EAL: Detected 1 NUMA nodes
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
Option -w, --pci-whitelist is deprecated, use -a, --allow option instead
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: No free hugepages reported in hugepages-2048kB
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL:   Invalid NUMA socket, default to 0
EAL: Probe PCI driver: mlx5_pci (15b3:1018) device: 0000:00:05.0 (socket 0)
mlx5_pci: No available register for Sampler.
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
EAL:   Invalid NUMA socket, default to 0
EAL: Probe PCI driver: mlx5_pci (15b3:1018) device: 0000:00:06.0 (socket 0)
mlx5_pci: No available register for Sampler.
mlx5_pci: Size 0xFFFF is not power of 2, will be aligned to 0x10000.
EAL: No legacy callbacks, legacy socket not created
Auto-start selected
testpmd: create a new mbuf pool <mb_pool_0>: n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: FA:16:3E:34:70:AA
Configuring Port 1 (socket 0)
Port 1: FA:16:3E:5B:B7:B6
Checking link statuses...
Done
No commandline core given, start packet forwarding
io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native
Logical Core 5 (socket 0) forwards packets on 2 streams:
  RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
  RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

  io packet forwarding packets/burst=32
  nb forwarding cores=1 - nb forwarding ports=2
  port 0: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=256 - RX free threshold=64
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=256 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0
  port 1: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=256 - RX free threshold=64
      RX threshold registers: pthresh=0 hthresh=0  wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=256 - TX free threshold=0
      TX threshold registers: pthresh=0 hthresh=0  wthresh=0
      TX offloads=0x0 - TX RS bit threshold=0

Port statistics ====================================
  ######################## NIC statistics for port 0  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################

  ######################## NIC statistics for port 1  ########################
  RX-packets: 0          RX-missed: 0          RX-bytes:  0
  RX-errors: 0
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  Throughput (since last show)
  Rx-pps:            0          Rx-bps:            0
  Tx-pps:            0          Tx-bps:            0
  ############################################################################
(shiftstack) [cloud-user@installer-host ~]$



Thanks,
Ziv

Comment 15 errata-xmlrpc 2022-08-10 10:41:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

