Description of problem:
When running the latency checkup job for testing DPDK, the traffic generator fails to start because it cannot locate the unique environment variable it is looking for.

Version-Release number of selected component (if applicable):
CNV 4.13.0
Latency checkup: registry.redhat.io/container-native-virtualization/vm-network-latency-checkup-rhel9

How reproducible:
Always

Steps to Reproduce:
1. On a cluster with SR-IOV support, create the following namespace:
$ oc create ns dpdk-checkup-ns
namespace/dpdk-checkup-ns created

2. Change the cluster context to the new namespace:
$ oc project dpdk-checkup-ns
Now using project "dpdk-checkup-ns" on server "https://api.bm02-cnvqe2-rdu2.cnvqe2.lab.eng.rdu2.redhat.com:6443".

3. Apply the following resources in order to run the latency checkup job that tests DPDK (the resources are attached):
$ oc apply -f dpdk-latency-checkup-infra.yaml
serviceaccount/dpdk-checkup-sa created
role.rbac.authorization.k8s.io/kiagnose-configmap-access created
rolebinding.rbac.authorization.k8s.io/kiagnose-configmap-access created
role.rbac.authorization.k8s.io/kubevirt-dpdk-checker created
rolebinding.rbac.authorization.k8s.io/kubevirt-dpdk-checker created
$ oc apply -f dpdk-latency-checkup-cm.yaml
configmap/dpdk-checkup-config created

4. Start the latency checkup job using the attached resource:
$ oc apply -f dpdk-latency-checkup-job.yaml
job.batch/dpdk-checkup created

5. While the job runs, find the traffic generator pod:
$ oc get pods -n dpdk-checkup-ns
NAME                                      READY   STATUS             RESTARTS      AGE
dpdk-checkup-xzcvt                        1/1     Running            0             25s
kubevirt-dpdk-checkup-traffic-gen-tzb2h   0/1     CrashLoopBackOff   1 (12s ago)   22s
virt-launcher-dpdk-vmi-v6l69-jd52z        0/2     PodInitializing    0             22s

6. Check the log of the traffic generator pod:
$ oc logs kubevirt-dpdk-checkup-traffic-gen-tzb2h --follow
setting params to trex_cfg.yaml
+ set_pci_addresses
++ get_pci_device_env_var
+++ grep PCIDEVICE_
+++ env
++ local 'pci_device_env_with_value=PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK=0000:19:0a.1,0000:19:0a.0
PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK_INFO={"0000:19:0a.0":{"generic":{"deviceID":"0000:19:0a.0"},"vfio":{"dev-mount":"/dev/vfio/186","mount":"/dev/vfio/vfio"}},"0000:19:0a.1":{"generic":{"deviceID":"0000:19:0a.1"},"vfio":{"dev-mount":"/dev/vfio/187","mount":"/dev/vfio/vfio"}}}'
+++ wc -l
+++ echo 'PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK=0000:19:0a.1,0000:19:0a.0
PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK_INFO={"0000:19:0a.0":{"generic":{"deviceID":"0000:19:0a.0"},"vfio":{"dev-mount":"/dev/vfio/186","mount":"/dev/vfio/vfio"}},"0000:19:0a.1":{"generic":{"deviceID":"0000:19:0a.1"},"vfio":{"dev-mount":"/dev/vfio/187","mount":"/dev/vfio/vfio"}}}'
++ '[' 2 '!=' 1 ']'
++ echo 'error: could not find pci device env var'
++ exit 1
+ local 'pci_device_env_name=error: could not find pci device env var'
+ IFS=,
+ read -r -a nics_array
/opt/scripts/set_traffic_gen_cfg_file.sh: line 73: error: could not find pci device env var: invalid variable name

The log shows that the flow looks for a single environment variable with a `PCIDEVICE_` prefix but finds two, and because it cannot determine which one is relevant, it fails.

Actual results:
Traffic generator fails.

Expected results:
The generator should complete its role and generate traffic.

Additional info:
As the traffic generator pod log (pasted above) shows, the source of this issue is that the flow expects exactly one environment variable with a `PCIDEVICE_` prefix, but the pod has two: `PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK`, which holds the PCI addresses, and `PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK_INFO`, which holds JSON metadata. The `wc -l` check therefore returns 2 instead of 1, and the error message is then passed on as if it were a variable name.
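One possible way to make the lookup tolerate the companion `_INFO` variable is sketched below. This is only an illustration under stated assumptions, not the change made in the linked fix: the function name `find_pci_device_env_name` is hypothetical, the JSON value is shortened, and the sketch assumes the address list is always the only `PCIDEVICE_*` variable whose name does not end in `_INFO`.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the checkup's scripts): print the name
# of the single PCIDEVICE_* variable that holds the PCI address list,
# ignoring the *_INFO companion variable that carries JSON metadata.
find_pci_device_env_name() {
  # Keep variable names only; match the prefix and skip names ending in _INFO.
  _names=$(env | awk -F= '/^PCIDEVICE_/ && $1 !~ /_INFO$/ { print $1 }')
  # Count non-empty lines; "|| true" keeps a zero count from being an error.
  _count=$(printf '%s' "$_names" | grep -c . || true)
  if [ "$_count" -ne 1 ]; then
    # Report on stderr so callers never capture the message as a name.
    echo "error: expected 1 PCI device env var, found $_count" >&2
    return 1
  fi
  printf '%s\n' "$_names"
}

# Example with the two variables seen in the log (JSON value shortened here).
export PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK='0000:19:0a.1,0000:19:0a.0'
export PCIDEVICE_OPENSHIFT_IO_INTEL_NICS_DPDK_INFO='{"0000:19:0a.0":{}}'

# Print one PCI address per line when the lookup succeeds.
if name=$(find_pci_device_env_name); then
  printenv "$name" | tr ',' '\n'
fi
```

Writing the error to stderr and returning a non-zero status means a caller that checks the result never ends up treating the error text as a variable name, which is what line 73 of the script does in the log.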
https://github.com/kiagnose/kubevirt-dpdk-checkup/pull/95
Verified with the latest DPDK checkup related images:
brew.registry.redhat.io/rh-osbs/container-native-virtualization-kubevirt-dpdk-checkup-rhel9:v4.13.0
quay.io/kiagnose/kubevirt-dpdk-checkup-traffic-gen:v0.1.1
quay.io/kiagnose/kubevirt-dpdk-checkup-vm:v0.1.1
@ysegev could you please state the full build tag of the checkup's image? (v4.13.0-XX)
Re-verified, this time with this DPDK checkup image: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-kubevirt-dpdk-checkup-rhel9:v4.13.0-38
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:3205