Description of problem:

After a successful deployment of OCP on OSP (Shift-on-Stack), as part of our Telco testing, the sriov-network-operator must be installed and configured. Immediately after the initial operator installation, however, I found the daemon pods in the "CrashLoopBackOff" status. Rebooting them didn't fix the main problem:

[cloud-user@installer-host ~]$ oc get -n openshift-sriov-network-operator all
NAME                                        READY   STATUS             RESTARTS   AGE
pod/network-resources-injector-7zvb6        1/1     Running            0          3m22s
pod/network-resources-injector-llx8q        1/1     Running            0          3m22s
pod/network-resources-injector-swzxk        1/1     Running            0          3m22s
pod/sriov-network-config-daemon-5jd4m       0/1     CrashLoopBackOff   4          3m22s
pod/sriov-network-config-daemon-mwzmz       0/1     CrashLoopBackOff   4          3m22s
pod/sriov-network-operator-6947d96c-lmcxn   1/1     Running            0          3m37s

NAME                                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/network-resources-injector-service   ClusterIP   172.30.179.150   <none>        443/TCP   3m22s

NAME                                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                 AGE
daemonset.apps/network-resources-injector    3         3         3       3            3           beta.kubernetes.io/os=linux,node-role.kubernetes.io/master=   3m22s
daemonset.apps/sriov-network-config-daemon   2         2         0       2            0           beta.kubernetes.io/os=linux,node-role.kubernetes.io/worker=   3m22s

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sriov-network-operator   1/1     1            1           3m37s

Looking at the logs of the failing pods, I could see the following errors:

[cloud-user@installer-host ~]$ oc logs pod/sriov-network-config-daemon-5jd4m -n openshift-sriov-network-operator
I1018 18:15:11.524256 190307 start.go:107] overriding kubernetes api to https://api-int.ostest.shiftstack.com:6443
I1018 18:15:11.525581 190307 start.go:138] starting node writer
I1018 18:15:11.534127 190307 start.go:158] Running on platform: Virtual/Openstack
I1018 18:15:11.534142 190307 writer.go:44] Run(): start writer
I1018 18:15:11.534146 190307 writer.go:47] Run(): once
I1018 18:15:11.560971 190307 utils.go:598] getLinkType(): Device 0000:00:03.0
I1018 18:15:11.561041 190307 utils.go:598] getLinkType(): Device 0000:00:04.0
I1018 18:15:11.561098 190307 utils.go:598] getLinkType(): Device 0000:00:05.0
I1018 18:15:11.566328 190307 writer.go:132] setNodeStateStatus(): syncStatus: , lastSyncError:
I1018 18:15:11.571454 190307 writer.go:170] writeCheckpointFile(): try to decode the checkpoint file
I1018 18:15:11.571553 190307 start.go:164] Starting SriovNetworkConfigDaemon
I1018 18:15:11.571572 190307 writer.go:44] Run(): start writer
I1018 18:15:11.571579 190307 daemon.go:257] Run(): start daemon
E1018 18:15:11.581359 190307 daemon.go:951] tryEnableRdma(): fail to enable rdma exit status 1:
I1018 18:15:11.587662 190307 daemon.go:442] Set log verbose level to: 2
I1018 18:15:16.686993 190307 daemon.go:319] Starting workers
I1018 18:15:16.687012 190307 daemon.go:322] Started workers
I1018 18:15:16.687027 190307 daemon.go:362] worker queue size: 1
I1018 18:15:16.687032 190307 daemon.go:364] get item: 1
I1018 18:15:16.687037 190307 daemon.go:454] nodeStateSyncHandler(): new generation is 1
I1018 18:15:16.689510 190307 daemon.go:689] loadVendorPlugins(): try to load plugin virtual_plugin
I1018 18:15:16.689523 190307 plugin.go:39] loadPlugin(): load plugin from /plugins/virtual_plugin.so
I1018 18:15:16.689576 190307 writer.go:61] Run(): refresh trigger
I1018 18:15:16.689584 190307 writer.go:80] pollNicStatus()
I1018 18:15:16.689588 190307 utils_virtual.go:158] DiscoverSriovDevicesVirtual
I1018 18:15:16.708806 190307 virtual_plugin.go:52] virtual-plugin OnNodeStateAdd()
I1018 18:15:16.708855 190307 daemon.go:509] nodeStateSyncHandler(): plugin virtual_plugin: reqDrain false, reqReboot false
I1018 18:15:16.708868 190307 daemon.go:513] nodeStateSyncHandler(): reqDrain false, reqReboot false disableDrain false
I1018 18:15:16.708875 190307 virtual_plugin.go:84] virtual-plugin Apply(): desiredState={186996 []}
I1018 18:15:16.718493 190307 utils.go:409] getNetdevMTU(): get MTU for device 0000:00:03.0
I1018 18:15:16.718606 190307 utils.go:598] getLinkType(): Device 0000:00:03.0
I1018 18:15:16.718720 190307 utils.go:409] getNetdevMTU(): get MTU for device 0000:00:04.0
I1018 18:15:16.718797 190307 utils.go:598] getLinkType(): Device 0000:00:04.0
I1018 18:15:16.718916 190307 utils.go:409] getNetdevMTU(): get MTU for device 0000:00:05.0
I1018 18:15:16.718991 190307 utils.go:598] getLinkType(): Device 0000:00:05.0
I1018 18:15:16.724615 190307 writer.go:132] setNodeStateStatus(): syncStatus: InProgress, lastSyncError:
E1018 18:15:16.730394 190307 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 102 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1eb0e00, 0x2f17450)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x1eb0e00, 0x2f17450)
    /usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).nodeStateSyncHandler(0xc001440270, 0x1, 0xc0005ea0d0, 0xc001484630)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:548 +0x101b
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem.func1(0xc001440270, 0x1e3c2a0, 0x2f5e708, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:385 +0xdf
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem(0xc001440270, 0x203000)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:401 +0x169
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).runWorker(...)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:346
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00073e080)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00073e080, 0x22bc000, 0xc0001a69f0, 0x1, 0xc00010e360)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00073e080, 0x3b9aca00, 0x0, 0xc0004c8d01, 0xc00010e360)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00073e080, 0x3b9aca00, 0xc00010e360)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).Run
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:321 +0xac5
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1d375bb]

goroutine 102 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x109
panic(0x1eb0e00, 0x2f17450)
    /usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).nodeStateSyncHandler(0xc001440270, 0x1, 0xc0005ea0d0, 0xc001484630)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:548 +0x101b
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem.func1(0xc001440270, 0x1e3c2a0, 0x2f5e708, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:385 +0xdf
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem(0xc001440270, 0x203000)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:401 +0x169
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).runWorker(...)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:346
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc00073e080)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00073e080, 0x22bc000, 0xc0001a69f0, 0x1, 0xc00010e360)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00073e080, 0x3b9aca00, 0x0, 0xc0004c8d01, 0xc00010e360)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc00073e080, 0x3b9aca00, 0xc00010e360)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).Run
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:321 +0xac5

[cloud-user@installer-host ~]$ oc logs pod/sriov-network-config-daemon-mwzmz -n openshift-sriov-network-operator
I1018 18:16:50.082468 424839 start.go:107] overriding kubernetes api to https://api-int.ostest.shiftstack.com:6443
I1018 18:16:50.084404 424839 start.go:138] starting node writer
I1018 18:16:50.092922 424839 start.go:158] Running on platform: Virtual/Openstack
I1018 18:16:50.093055 424839 writer.go:44] Run(): start writer
I1018 18:16:50.093125 424839 writer.go:47] Run(): once
I1018 18:16:50.122076 424839 utils.go:598] getLinkType(): Device 0000:00:03.0
I1018 18:16:50.122282 424839 utils.go:598] getLinkType(): Device 0000:00:05.0
I1018 18:16:50.122887 424839 utils.go:598] getLinkType(): Device 0000:00:06.0
I1018 18:16:50.125775 424839 writer.go:132] setNodeStateStatus(): syncStatus: , lastSyncError:
I1018 18:16:50.131089 424839 writer.go:170] writeCheckpointFile(): try to decode the checkpoint file
I1018 18:16:50.131332 424839 start.go:164] Starting SriovNetworkConfigDaemon
I1018 18:16:50.131349 424839 writer.go:44] Run(): start writer
I1018 18:16:50.131496 424839 daemon.go:257] Run(): start daemon
E1018 18:16:50.142113 424839 daemon.go:951] tryEnableRdma(): fail to enable rdma exit status 1:
I1018 18:16:50.147463 424839 daemon.go:442] Set log verbose level to: 2
I1018 18:16:55.247070 424839 daemon.go:319] Starting workers
I1018 18:16:55.247230 424839 daemon.go:322] Started workers
I1018 18:16:55.247254 424839 daemon.go:362] worker queue size: 1
I1018 18:16:55.247382 424839 daemon.go:364] get item: 1
I1018 18:16:55.247449 424839 daemon.go:454] nodeStateSyncHandler(): new generation is 1
I1018 18:16:55.250544 424839 daemon.go:689] loadVendorPlugins(): try to load plugin virtual_plugin
I1018 18:16:55.250556 424839 plugin.go:39] loadPlugin(): load plugin from /plugins/virtual_plugin.so
I1018 18:16:55.250558 424839 writer.go:61] Run(): refresh trigger
I1018 18:16:55.250566 424839 writer.go:80] pollNicStatus()
I1018 18:16:55.250579 424839 utils_virtual.go:158] DiscoverSriovDevicesVirtual
I1018 18:16:55.270524 424839 virtual_plugin.go:52] virtual-plugin OnNodeStateAdd()
I1018 18:16:55.270568 424839 daemon.go:509] nodeStateSyncHandler(): plugin virtual_plugin: reqDrain false, reqReboot false
I1018 18:16:55.270577 424839 daemon.go:513] nodeStateSyncHandler(): reqDrain false, reqReboot false disableDrain false
I1018 18:16:55.270583 424839 virtual_plugin.go:84] virtual-plugin Apply(): desiredState={186996 []}
I1018 18:16:55.279729 424839 utils.go:409] getNetdevMTU(): get MTU for device 0000:00:03.0
I1018 18:16:55.279757 424839 utils.go:598] getLinkType(): Device 0000:00:03.0
I1018 18:16:55.279811 424839 utils.go:409] getNetdevMTU(): get MTU for device 0000:00:05.0
I1018 18:16:55.279853 424839 utils.go:404] tryGetInterfaceName(): name is ens5
I1018 18:16:55.279899 424839 utils.go:404] tryGetInterfaceName(): name is ens5
I1018 18:16:55.279902 424839 utils.go:430] getNetDevMac(): get Mac for device ens5
I1018 18:16:55.279923 424839 utils.go:442] getNetDevLinkSpeed(): get LinkSpeed for device ens5
I1018 18:16:55.279939 424839 utils.go:598] getLinkType(): Device 0000:00:05.0
I1018 18:16:55.280070 424839 utils.go:409] getNetdevMTU(): get MTU for device 0000:00:06.0
I1018 18:16:55.280112 424839 utils.go:404] tryGetInterfaceName(): name is ens6
I1018 18:16:55.280154 424839 utils.go:404] tryGetInterfaceName(): name is ens6
I1018 18:16:55.280157 424839 utils.go:430] getNetDevMac(): get Mac for device ens6
I1018 18:16:55.280179 424839 utils.go:442] getNetDevLinkSpeed(): get LinkSpeed for device ens6
I1018 18:16:55.280198 424839 utils.go:598] getLinkType(): Device 0000:00:06.0
I1018 18:16:55.282415 424839 writer.go:132] setNodeStateStatus(): syncStatus: InProgress, lastSyncError:
E1018 18:16:55.291837 424839 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 123 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1eb0e00, 0x2f17450)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa6
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x1eb0e00, 0x2f17450)
    /usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).nodeStateSyncHandler(0xc0014984e0, 0x1, 0xc000b84000, 0xc000314000)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:548 +0x101b
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem.func1(0xc0014984e0, 0x1e3c2a0, 0x2f5e708, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:385 +0xdf
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem(0xc0014984e0, 0x203000)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:401 +0x169
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).runWorker(...)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:346
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc001000590)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001000590, 0x22bc000, 0xc000c17cb0, 0x1, 0xc00010e540)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc001000590, 0x3b9aca00, 0x0, 0x217b801, 0xc00010e540)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc001000590, 0x3b9aca00, 0xc00010e540)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).Run
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:321 +0xac5
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1d375bb]

goroutine 123 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x109
panic(0x1eb0e00, 0x2f17450)
    /usr/lib/golang/src/runtime/panic.go:965 +0x1b9
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).nodeStateSyncHandler(0xc0014984e0, 0x1, 0xc000b84000, 0xc000314000)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:548 +0x101b
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem.func1(0xc0014984e0, 0x1e3c2a0, 0x2f5e708, 0x0, 0x0)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:385 +0xdf
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).processNextWorkItem(0xc0014984e0, 0x203000)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:401 +0x169
github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).runWorker(...)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:346
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc001000590)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001000590, 0x22bc000, 0xc000c17cb0, 0x1, 0xc00010e540)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc001000590, 0x3b9aca00, 0x0, 0x217b801, 0xc00010e540)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/apimachinery/pkg/util/wait.Until(0xc001000590, 0x3b9aca00, 0xc00010e540)
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon.(*Daemon).Run
    /go/src/github.com/k8snetworkplumbingwg/sriov-network-operator/pkg/daemon/daemon.go:321 +0xac5

Version-Release number of selected component (if applicable):
Cluster version is 4.8.0-0.nightly-2021-10-16-024756

Additional info:
The actual bug and the proposed solution can be tracked here: https://github.com/openshift/sriov-network-operator/commit/1d954a5304283f62808abbe13c55c6dd7b2b4083#diff-a53b7b593d3d778e62eaeeafa40088656f9212bfa2c2b7991df15fa78e60b0f0
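The triage done by hand in the description (spot the pods in CrashLoopBackOff, then find where in the operator code the panic originates) can be roughly scripted. The following is a minimal sketch, not part of the operator or of `oc`: it assumes the `oc get ... all` and `oc logs` output has been saved as plain text, and the helper names `crashlooping_pods` and `first_operator_frame` are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches a file:line frame under the operator's own pkg/ tree. Vendored
# frames are skipped because their path continues with "vendor/", not "pkg/".
_FRAME_RE = re.compile(r"/sriov-network-operator/(pkg/\S+?\.go):(\d+)")


def crashlooping_pods(oc_get_output: str) -> List[str]:
    """Return names of pods whose STATUS column reads CrashLoopBackOff."""
    pods = []
    for line in oc_get_output.splitlines():
        fields = line.split()
        # Pod rows look like: pod/<name> READY STATUS RESTARTS AGE
        if (
            len(fields) >= 3
            and fields[0].startswith("pod/")
            and fields[2] == "CrashLoopBackOff"
        ):
            pods.append(fields[0][len("pod/"):])
    return pods


def first_operator_frame(log_text: str) -> Optional[Tuple[str, int]]:
    """Return (file, line) of the first non-vendored operator frame, if any."""
    match = _FRAME_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

Applied to the logs in this report, the first operator frame is the `daemon.go:548` line inside `nodeStateSyncHandler`, which is the frame both pods share.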
The issue affects both the 4.8 and 4.9 releases. I have verified an upstream patch by @pliu (https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/191/files) that fixes the issue on the 4.9 release branch.
Hi Ziv, could you help check whether the fix works on 4.10?
Yes, of course. This is exactly what I've been trying to do for a couple of days now. The main problem is that 4.10 is currently not stable from the deployment point of view. I'm trying to find a stable puddle to work with. I'll update as soon as I have any progress.
Hi, I was able to verify it. Please see the details below:

[cloud-user@installer-host ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-11-04-001635   True        False         71m     Cluster version is 4.10.0-0.nightly-2021-11-04-001635

[cloud-user@installer-host ~]$ oc get csv -n openshift-sriov-network-operator
NAME                                        DISPLAY                      VERSION              REPLACES   PHASE
performance-addon-operator.v4.9.0           Performance Addon Operator   4.9.0                           Succeeded
sriov-network-operator.4.9.0-202110182323   SR-IOV Network Operator      4.9.0-202110182323              Succeeded

[cloud-user@installer-host ~]$ oc get all -n openshift-sriov-network-operator
NAME                                         READY   STATUS    RESTARTS      AGE
pod/network-resources-injector-jp2l8         1/1     Running   0             44m
pod/network-resources-injector-p7tbw         1/1     Running   0             44m
pod/network-resources-injector-v8x6r         1/1     Running   0             44m
pod/sriov-device-plugin-knl7c                1/1     Running   0             31m
pod/sriov-network-config-daemon-67nhv        3/3     Running   7 (37m ago)   44m
pod/sriov-network-config-daemon-p5k2s        3/3     Running   7 (37m ago)   44m
pod/sriov-network-operator-976c7d6fc-4gjp8   1/1     Running   2 (32m ago)   44m

NAME                                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/network-resources-injector-service   ClusterIP   172.30.223.210   <none>        443/TCP   44m

NAME                                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                 AGE
daemonset.apps/network-resources-injector    3         3         3       3            3           beta.kubernetes.io/os=linux                                   44m
daemonset.apps/sriov-device-plugin           1         1         1       1            1           beta.kubernetes.io/os=linux,node-role.kubernetes.io/worker=   43m
daemonset.apps/sriov-network-config-daemon   2         2         2       2            2           beta.kubernetes.io/os=linux,node-role.kubernetes.io/worker=   44m

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sriov-network-operator   1/1     1            1           44m

NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/sriov-network-operator-976c7d6fc   1         1         1       44m

One question: shouldn't we have the 4.10 version of the SR-IOV Network Operator here instead of 4.9?
You should use a 4.10 image to verify. The fix has not yet been merged into the 4.9 branch. Please try the image sriov-network-operator.4.10.0-202111031923 or a newer one.
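The mismatch raised here (a 4.10 cluster running a 4.9 operator CSV) is easy to overlook in pasted output. A minimal sketch of a mechanical check follows, assuming the two version strings have been copied by hand from `oc get clusterversion` and `oc get csv`; the helper names `major_minor` and `operator_matches_cluster` are hypothetical and not part of any OpenShift tooling.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from the first x.y.z triple in a version string.

    Works on cluster versions such as '4.10.0-0.nightly-2021-11-04-001635'
    and on CSV names such as 'sriov-network-operator.4.9.0-202110182323'.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("no x.y.z version found in %r" % version)
    return int(match.group(1)), int(match.group(2))


def operator_matches_cluster(cluster_version: str, csv_name: str) -> bool:
    """True when the operator CSV targets the same major.minor as the cluster."""
    return major_minor(cluster_version) == major_minor(csv_name)
```

With the values from this thread, the first verification attempt (4.10 cluster, `sriov-network-operator.4.9.0-202110182323`) would be flagged as a mismatch, while the requested `4.10.0-...` operator image would not.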
Hello Peng, sorry, I have no experience with this, as I've always used the latest marketplace version. Could you please elaborate on how I should install and use this specific image? Additionally, if the fix is not yet merged into the 4.9 branch, how come it is working in my current environment? Thanks.
@zzhao Could you help Ziv set up the QE operator repo in his environment?
*** Bug 2028246 has been marked as a duplicate of this bug. ***
Hello, I was able to verify it, and I also created a dedicated DUT pod with attached SR-IOV VFs:

(shiftstack) [cloud-user@installer-host ~]$ oc get clusterversions.config.openshift.io
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-12-06-162419   True        False         76m     Cluster version is 4.10.0-0.nightly-2021-12-06-162419

(shiftstack) [cloud-user@installer-host ~]$ oc get csv -n openshift-sriov-network-operator
NAME                                         DISPLAY                      VERSION               REPLACES   PHASE
performance-addon-operator.v4.9.2            Performance Addon Operator   4.9.2                            Succeeded
sriov-network-operator.4.10.0-202112070531   SR-IOV Network Operator      4.10.0-202112070531              Succeeded

(shiftstack) [cloud-user@installer-host ~]$ oc get all -n openshift-sriov-network-operator
NAME                                         READY   STATUS    RESTARTS   AGE
pod/network-resources-injector-bbct4         1/1     Running   0          32m
pod/network-resources-injector-m9n8b         1/1     Running   0          32m
pod/network-resources-injector-z2nzp         1/1     Running   0          32m
pod/sriov-device-plugin-tz7sr                1/1     Running   0          2m54s
pod/sriov-network-config-daemon-lllf4        3/3     Running   3          32m
pod/sriov-network-config-daemon-ngdrq        3/3     Running   3          32m
pod/sriov-network-operator-dfdf7b466-dgw6t   1/1     Running   0          32m

NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/network-resources-injector-service   ClusterIP   172.30.171.60   <none>        443/TCP   32m

NAME                                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                                                 AGE
daemonset.apps/network-resources-injector    3         3         3       3            3           beta.kubernetes.io/os=linux                                   32m
daemonset.apps/sriov-device-plugin           1         1         1       1            1           beta.kubernetes.io/os=linux,node-role.kubernetes.io/worker=   3m29s
daemonset.apps/sriov-network-config-daemon   2         2         2       2            2           beta.kubernetes.io/os=linux,node-role.kubernetes.io/worker=   32m

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sriov-network-operator   1/1     1            1           32m

NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/sriov-network-operator-dfdf7b466   1         1         1       32m

(shiftstack) [cloud-user@installer-host ~]$ oc get pods
NAME           READY   STATUS    RESTARTS   AGE
dpdk-testpmd   1/1     Running   0          2m19s

Thanks,
Ziv
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056