Description of problem: In bug 1983964 [https://bugzilla.redhat.com/show_bug.cgi?id=1983964] we fixed the issue of the vhost-net device not being mounted inside the container when requested by the user. The remaining problem is that to use vhost-net in a DPDK application, a tap device must be created inside the pod. To make that possible:
1. the sriov-device-plugin should mount the /dev/net/tun device inside the container together with the vhost-net device.
2. the sriov-config-daemon should load the tun kernel module when a user applies a policy that requests the vhost-net device.
The current workaround is to grant pods the MKNOD capability so they can create the tun device node inside the container themselves, but this exposes multiple security issues.
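As a quick sanity check of both pieces, something like the following could be used. This is a sketch: the node name (worker-0) and pod name (testpod1) are placeholders, not values from this bug.

```shell
# Check that the config daemon loaded the tun kernel module on the node
# (worker-0 is a placeholder for an SR-IOV capable worker node).
oc debug node/worker-0 -- chroot /host lsmod | grep -w tun

# Check that the device plugin mounted both device nodes into the container
# (testpod1 is a placeholder for a pod requesting the vhost-net resource).
oc rsh testpod1 ls -l /dev/vhost-net /dev/net/tun
```

If the module is loaded and both device nodes are present, the pod no longer needs MKNOD to create the tap device.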
Verified this bug on 4.10.0-202111292203 with the following steps:

1. Create VFs with device type vfio-pci and needVhostNet set to true:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-dpdk
  namespace: openshift-sriov-network-operator
spec:
  deviceType: vfio-pci
  needVhostNet: true
  mtu: 1700
  nicSelector:
    deviceID: "158b"
    pfNames:
    - ens1f1
    rootDevices:
    - '0000:3b:00.1'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 2
  priority: 99
  resourceName: inteldpdk

2. Create a SriovNetwork to generate the NAD:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: dpdk-network
  namespace: openshift-sriov-network-operator
spec:
  networkNamespace: z1
  ipam: "{}"
  vlan: 0
  resourceName: inteldpdk

3. Create the test pod:

apiVersion: v1
kind: Pod
metadata:
  generateName: testpod1
  labels:
    env: test
  annotations:
    k8s.v1.cni.cncf.io/networks: dpdk-network
spec:
  containers:
  - name: dpdk
    image: registry.redhat.io/openshift4/dpdk-base-rhel8:v4.8.0-8.1628601733
    imagePullPolicy: IfNotPresent
    securityContext:
      runAsUser: 0
      capabilities:
        add: ["IPC_LOCK"]
    resources:
      requests:
        hugepages-1Gi: 4Gi
        cpu: "4"
        memory: "1Gi"
      limits:
        hugepages-1Gi: 4Gi
        cpu: "4"
        memory: "1Gi"
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
    command: ["sleep", "infinity"]
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages

4. Rsh into the container and check:

sh-4.4# ls /dev/net/tun
/dev/net/tun
sh-4.4# dpdk-testpmd -l 2,4,6,8 -a 0000:3b:0a.0 --iova-mode=va -- -i --portmask=0x1 --nb-cores=2 --forward-mode=mac --port-topology=loop --no-mlockall
EAL: Detected 56 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-2048kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_i40e_vf (8086:154c) device: 0000:3b:0a.0 (socket 0)
EAL: No legacy callbacks, legacy socket not created
Interactive-mode selected
Set mac packet forwarding mode
testpmd: create a new mbuf pool <mb_pool_0>: n=171456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: link state change event
Port 0: link state change event
Port 0: link state change event
Port 0: 76:3E:4F:FC:EA:11
Checking link statuses...
Done
testpmd> start
mac packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP allocation mode: native
Logical Core 4 (socket 0) forwards packets on 1 streams:
  RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

  mac packet forwarding packets/burst=32
  nb forwarding cores=2 - nb forwarding ports=1
  port 0: RX queue number: 1 Tx queue number: 1
    Rx offloads=0x0 Tx offloads=0x0
    RX queue: 0
      RX desc=512 - RX free threshold=32
      RX threshold registers: pthresh=0 hthresh=0 wthresh=0
      RX Offloads=0x0
    TX queue: 0
      TX desc=512 - TX free threshold=32
      TX threshold registers: pthresh=32 hthresh=0 wthresh=0
      TX offloads=0x0 - TX RS bit threshold=32
testpmd>
testpmd> stop
Telling cores to stop...
Waiting for lcores to finish...

---------------------- Forward statistics for port 0 ----------------------
  RX-packets: 10    RX-dropped: 0    RX-total: 10
  TX-packets: 10    TX-dropped: 0    TX-total: 10
----------------------------------------------------------------------------

+++++++++++++++ Accumulated forward statistics for all ports +++++++++++++++
  RX-packets: 10    RX-dropped: 0    RX-total: 10
  TX-packets: 10    TX-dropped: 0    TX-total: 10
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Done.
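For context on why both /dev/vhost-net and /dev/net/tun are needed: the tap device is created when a DPDK application attaches a virtio-user port backed by vhost-net. A hedged sketch of such an invocation inside the same pod (the core list and queue count are illustrative, not taken from the verification above):

```shell
# Illustrative only: attach a virtio-user port backed by /dev/vhost-net.
# DPDK opens /dev/net/tun and lets the vhost-net backend create the tap
# device, which is why both device nodes must be mounted in the container
# instead of granting the pod MKNOD to create them itself.
dpdk-testpmd -l 2,4 --vdev=virtio_user0,path=/dev/vhost-net,queues=1 -- -i
```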
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056