Description of problem:

Even after successful installation of the SR-IOV operator, creation of an SriovNetworkNodePolicy and SriovNetwork, and waiting for several hours, some of the worker nodes in the cluster do not show any allocatable VFs. In my deployment with 15 worker nodes, this happens on only 3 worker nodes. The affected worker nodes also change intermittently when I delete and recreate the SriovNetworkNodePolicy. Manually restarting the node's sriov-device-plugin pod results in allocatable VFs being shown for the node.

Version-Release number of selected component (if applicable):
4.5.8

How reproducible:
Happens very often on at least one node when creating an SriovNetworkNodePolicy for the 15 worker nodes.

Steps to Reproduce:
1. Install the SR-IOV operator
2. Create an SriovNetworkNodePolicy
3. Verify that *some* worker nodes don't show any allocatable VFs

Actual results:
Some worker nodes don't have allocatable resources unless the sriov-device-plugin pod is restarted manually.

Expected results:
All worker nodes that match the SriovNetworkNodePolicy should have allocatable VFs.

Additional info:
[kni@e22-h20-b01-fc640 sriov-operator]$ oc get pods -o wide
NAME                                     READY   STATUS    RESTARTS   AGE     IP               NODE        NOMINATED NODE   READINESS GATES
network-resources-injector-5lqlr         1/1     Running   0          4d11h   10.128.0.33      master-1    <none>           <none>
network-resources-injector-g66lq         1/1     Running   0          4d11h   10.129.0.14      master-2    <none>           <none>
network-resources-injector-rxwtw         1/1     Running   0          4d11h   10.130.0.40      master-0    <none>           <none>
operator-webhook-2tn48                   1/1     Running   0          4d11h   10.129.0.15      master-2    <none>           <none>
operator-webhook-9rhwv                   1/1     Running   0          4d11h   10.128.0.32      master-1    <none>           <none>
operator-webhook-h65s7                   1/1     Running   0          4d11h   10.130.0.41      master-0    <none>           <none>
sriov-cni-2jwkm                          1/1     Running   0          2d5h    10.128.6.4       worker008   <none>           <none>
sriov-cni-2krg4                          1/1     Running   0          2d5h    10.129.2.4       worker007   <none>           <none>
sriov-cni-7fmp9                          1/1     Running   0          2d5h    10.131.6.5       worker000   <none>           <none>
sriov-cni-8spfz                          1/1     Running   0          2d5h    10.130.6.4       worker010   <none>           <none>
sriov-cni-8tkrv                          1/1     Running   0          2d5h    10.130.2.7       worker015   <none>           <none>
sriov-cni-9wwvc                          1/1     Running   0          2d5h    10.129.4.4       worker004   <none>           <none>
sriov-cni-b9bjv                          1/1     Running   0          2d5h    10.129.8.4       worker001   <none>           <none>
sriov-cni-gjfgf                          1/1     Running   0          2d5h    10.128.4.5       worker013   <none>           <none>
sriov-cni-k8zd7                          1/1     Running   0          2d5h    10.130.4.6       worker006   <none>           <none>
sriov-cni-q7rmp                          1/1     Running   0          2d5h    10.131.0.7       worker011   <none>           <none>
sriov-cni-rrmt9                          1/1     Running   0          2d5h    10.129.6.6       worker009   <none>           <none>
sriov-cni-rvlgx                          1/1     Running   0          2d5h    10.131.4.4       worker002   <none>           <none>
sriov-cni-slx4x                          1/1     Running   0          2d5h    10.131.2.4       worker012   <none>           <none>
sriov-cni-t4c4q                          1/1     Running   0          2d5h    10.128.2.5       worker014   <none>           <none>
sriov-cni-twhnw                          1/1     Running   0          2d5h    10.128.8.4       worker003   <none>           <none>
sriov-device-plugin-2chzt                1/1     Running   0          2d4h    192.168.222.17   worker004   <none>           <none>
sriov-device-plugin-5z5x8                1/1     Running   0          22s     192.168.222.13   worker000   <none>           <none>
sriov-device-plugin-dlbh7                1/1     Running   0          2d5h    192.168.222.21   worker008   <none>           <none>
sriov-device-plugin-dzvmm                1/1     Running   0          2d5h    192.168.222.15   worker002   <none>           <none>
sriov-device-plugin-fbtkh                1/1     Running   0          2d5h    192.168.222.20   worker007   <none>           <none>
sriov-device-plugin-fnfwq                1/1     Running   0          2d5h    192.168.222.14   worker001   <none>           <none>
sriov-device-plugin-hf289                1/1     Running   0          2d4h    192.168.222.24   worker011   <none>           <none>
sriov-device-plugin-k8h5r                1/1     Running   0          2d5h    192.168.222.22   worker009   <none>           <none>
sriov-device-plugin-pppr7                1/1     Running   0          2d5h    192.168.222.23   worker010   <none>           <none>
sriov-device-plugin-pxvd4                1/1     Running   0          2d5h    192.168.222.16   worker003   <none>           <none>
sriov-device-plugin-q9qdr                1/1     Running   0          2d4h    192.168.222.19   worker006   <none>           <none>
sriov-device-plugin-sbb9q                1/1     Running   0          2d5h    192.168.222.26   worker013   <none>           <none>
sriov-device-plugin-t5mlj                1/1     Running   0          22m     192.168.222.28   worker015   <none>           <none>
sriov-device-plugin-vz8c6                1/1     Running   0          2d4h    192.168.222.25   worker012   <none>           <none>
sriov-device-plugin-x84pk                1/1     Running   0          2d5h    192.168.222.27   worker014   <none>           <none>
sriov-network-config-daemon-4hkmc        1/1     Running   0          4d11h   192.168.222.19   worker006   <none>           <none>
sriov-network-config-daemon-5gt4z        1/1     Running   0          4d11h   192.168.222.20   worker007   <none>           <none>
sriov-network-config-daemon-6rnlp        1/1     Running   1          4d11h   192.168.222.25   worker012   <none>           <none>
sriov-network-config-daemon-9ffpg        1/1     Running   0          4d11h   192.168.222.22   worker009   <none>           <none>
sriov-network-config-daemon-cm7tv        1/1     Running   1          4d11h   192.168.222.14   worker001   <none>           <none>
sriov-network-config-daemon-f8284        1/1     Running   0          4d11h   192.168.222.16   worker003   <none>           <none>
sriov-network-config-daemon-jjscr        1/1     Running   0          4d11h   192.168.222.17   worker004   <none>           <none>
sriov-network-config-daemon-jn64v        1/1     Running   1          4d11h   192.168.222.28   worker015   <none>           <none>
sriov-network-config-daemon-lsqmm        1/1     Running   0          4d11h   192.168.222.27   worker014   <none>           <none>
sriov-network-config-daemon-n9mlc        1/1     Running   1          4d11h   192.168.222.24   worker011   <none>           <none>
sriov-network-config-daemon-pc8fw        1/1     Running   2          4d11h   192.168.222.13   worker000   <none>           <none>
sriov-network-config-daemon-qfzwb        1/1     Running   1          4d11h   192.168.222.26   worker013   <none>           <none>
sriov-network-config-daemon-sk4tl        1/1     Running   0          4d11h   192.168.222.23   worker010   <none>           <none>
sriov-network-config-daemon-wcq2t        1/1     Running   0          4d11h   192.168.222.15   worker002   <none>           <none>
sriov-network-config-daemon-x2pdr        1/1     Running   1          4d11h   192.168.222.21   worker008   <none>           <none>
sriov-network-operator-74c59f66f-mkw7v   1/1     Running   0          4d11h   10.129.0.13      master-2    <none>           <none>
================================================================================
[kni@e22-h20-b01-fc640 sriov-operator]$ oc get node worker011 -o json | jq -r '.status.allocatable'
{
  "cpu": "60",
  "ephemeral-storage": "430449989328",
  "hugepages-1Gi": "8Gi",
  "hugepages-2Mi": "0",
  "memory": "385226452Ki",
  "openshift.io/intelnics": "0",
  "pods": "250"
}
================================================================================
[kni@e22-h20-b01-fc640 sriov-operator]$ oc logs sriov-device-plugin-hf289
I0911 21:03:58.693819 19 manager.go:70] Using Kubelet Plugin Registry Mode
I0911 21:03:58.693898 19 main.go:44] resource manager reading configs
I0911 21:03:58.694013 19 manager.go:98] ResourceList: [{ResourcePrefix: ResourceName:intelnics IsRdma:false Selectors:{Vendors:[8086] Devices:[154c] Drivers:[iavf mlx5_core i40evf ixgbevf] PfNames:[ens2f0] LinkTypes:[] DDPProfiles:[]}}]
I0911 21:03:58.694054 19 manager.go:174] validating resource name "openshift.io/intelnics"
I0911 21:03:58.694060 19 main.go:60] Discovering host network devices
I0911 21:03:58.694067 19 manager.go:190] discovering host network devices
I0911 21:03:58.739322 19 manager.go:220] discoverDevices(): device found: 0000:18:00.0 02 Intel Corporation Ethernet Controller X710 for 10GbE ba...
I0911 21:03:58.739686 19 manager.go:290] eno1 added to linkWatchList
I0911 21:03:58.739850 19 manager.go:220] discoverDevices(): device found: 0000:18:00.1 02 Intel Corporation Ethernet Controller X710 for 10GbE ba...
I0911 21:03:58.740010 19 manager.go:290] eno2 added to linkWatchList
I0911 21:03:58.740138 19 manager.go:220] discoverDevices(): device found: 0000:62:00.0 02 Intel Corporation Ethernet Controller XXV710 for 25GbE ...
I0911 21:03:58.740316 19 manager.go:290] ens2f0 added to linkWatchList
I0911 21:03:58.740867 19 manager.go:220] discoverDevices(): device found: 0000:62:00.1 02 Intel Corporation Ethernet Controller XXV710 for 25GbE ...
I0911 21:03:58.741133 19 manager.go:270] excluding interface ens2f1: default route found: {Ifindex: 5 Dst: <nil> Src: <nil> Gw: 192.168.222.1 Flags: [] Table: 254}
I0911 21:03:58.741205 19 main.go:66] Initializing resource servers
I0911 21:03:58.741210 19 manager.go:108] number of config: 1
I0911 21:03:58.741214 19 manager.go:111]
I0911 21:03:58.741217 19 manager.go:112] Creating new ResourcePool: intelnics
I0911 21:03:58.741522 19 manager.go:126] New resource server is created for intelnics ResourcePool
I0911 21:03:58.741540 19 main.go:72] Starting all servers...
I0911 21:03:58.741609 19 server.go:190] starting intelnics device plugin endpoint at: openshift.io_intelnics.sock
I0911 21:03:58.742119 19 server.go:216] intelnics device plugin endpoint started serving
I0911 21:03:58.742154 19 main.go:77] All servers started.
I0911 21:03:58.742160 19 main.go:78] Listening for term signals
I0911 21:04:00.075265 19 server.go:105] Plugin: openshift.io_intelnics.sock gets registered successfully at Kubelet
I0911 21:04:00.075284 19 server.go:130] ListAndWatch(intelnics) invoked
I0911 21:04:00.075310 19 server.go:138] ListAndWatch(intelnics): send devices &ListAndWatchResponse{Devices:[]*Device{},}
================================================================================
After restarting the device-plugin for worker011
================================================================================
[kni@e22-h20-b01-fc640 sriov-operator]$ oc delete pod/sriov-device-plugin-hf289
pod "sriov-device-plugin-hf289" deleted
[kni@e22-h20-b01-fc640 sriov-operator]$ oc get node worker011 -o json | jq -r '.status.allocatable'
{
  "cpu": "60",
  "ephemeral-storage": "430449989328",
  "hugepages-1Gi": "8Gi",
  "hugepages-2Mi": "0",
  "memory": "385226452Ki",
  "openshift.io/intelnics": "64",
  "pods": "250"
}
================================================================================
I0914 02:10:28.937106 19 factory.go:100] device added: [pciAddr: 0000:62:09.3, vendor: 8086, device: 154c, driver: iavf]
I0914 02:10:28.937109 19 factory.go:100] device added: [pciAddr: 0000:62:09.4, vendor: 8086, device: 154c, driver: iavf]
I0914 02:10:28.937113 19 factory.go:100] device added: [pciAddr: 0000:62:09.5, vendor: 8086, device: 154c, driver: iavf]
I0914 02:10:28.937117 19 factory.go:100] device added: [pciAddr: 0000:62:09.6, vendor: 8086, device: 154c, driver: iavf]
I0914 02:10:28.937120 19
factory.go:100] device added: [pciAddr: 0000:62:09.7, vendor: 8086, device: 154c, driver: iavf] I0914 02:10:28.937134 19 manager.go:126] New resource server is created for intelnics ResourcePool I0914 02:10:28.937140 19 main.go:72] Starting all servers... I0914 02:10:28.937210 19 server.go:190] starting intelnics device plugin endpoint at: openshift.io_intelnics.sock I0914 02:10:28.937857 19 server.go:216] intelnics device plugin endpoint started serving I0914 02:10:28.937877 19 main.go:77] All servers started. I0914 02:10:28.937882 19 main.go:78] Listening for term signals I0914 02:10:30.724880 19 server.go:105] Plugin: openshift.io_intelnics.sock gets registered successfully at Kubelet I0914 02:10:30.724948 19 server.go:130] ListAndWatch(intelnics) invoked I0914 02:10:30.724965 19 server.go:138] ListAndWatch(intelnics): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:62:05.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NU
MANode{ID:0,},},},},&Device{ID:0000:62:03.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.3
,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:06.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.7,Health:Healthy,Topology:&TopologyInfo{Node
s:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:08.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:09.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:03.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:04.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:05.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:07.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},&Device{ID:0000:62:02.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:0,},},},},},}
I did:

echo 0 > /sys/class/net/ens2f0/device/sriov_numvfs
echo 64 > /sys/class/net/ens2f0/device/sriov_numvfs

and then deleted the sriov-device-plugin pod for the node; that resulted in 64 allocatable VFs showing up.
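The manual workaround above can be sketched as follows. This is a minimal illustration (using a fake sysfs file, since writing to the real /sys/class/net/ens2f0/device/sriov_numvfs needs SR-IOV hardware and root): the kernel requires the VF count to be written as 0 before a new non-zero value is accepted, hence the "echo 0" followed by "echo 64".

```python
import os
import tempfile

def set_numvfs(path, n):
    """Reset-then-set pattern for the sriov_numvfs sysfs attribute."""
    with open(path, "w") as f:  # step 1: reset VF count to zero
        f.write("0")
    with open(path, "w") as f:  # step 2: write the desired VF count
        f.write(str(n))

# Demo against a temporary file standing in for the real sysfs attribute.
with tempfile.TemporaryDirectory() as d:
    fake = os.path.join(d, "sriov_numvfs")
    with open(fake, "w") as f:
        f.write("32")           # some pre-existing VF count
    set_numvfs(fake, 64)
    with open(fake) as f:
        result = f.read()
```

After this, deleting the node's sriov-device-plugin pod forces a re-discovery, which is why the 64 VFs then show up as allocatable.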
To be clear, this keeps happening regularly: after creating and deleting pods and leaving the environment around for a few hours/days, a worker node that had allocatable VFs no longer reports any after some time. Also, I am waiting for all pods to be terminated gracefully before launching new pods, so this cannot be attributed to me not waiting for pods to be killed. I think the issue here can easily be seen in any longevity testing with SR-IOV, where a node that has VFs stops reporting allocatable VFs after some time.
Logs from sriov-device-plugin on worker002 where this happens: http://rdu-storage01.scalelab.redhat.com/sai/bz-1878566
(In reply to Sai Sindhur Malleni from comment #5)
> Logs from sriov-device-plugin on worker002 where this happens:
> http://rdu-storage01.scalelab.redhat.com/sai/bz-1878566

Sai,

From the above log, it looks like the device plugin was terminated and re-registered with the kubelet, but it didn't discover any VFs on the host. Could you please check `ip link show` on the worker node to see whether the VFs still exist while no allocatable VFs are reported in the node status?
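The check requested above can be scripted. A minimal sketch, assuming the usual `ip link show <pf>` output format where each VF appears on its own indented "vf N MAC ..." line (the sample output below is illustrative, not taken from this cluster):

```python
SAMPLE = """\
4: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether 3c:fd:fe:aa:bb:cc brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
    vf 1 MAC 00:00:00:00:00:00, spoof checking on, link-state auto
"""

def count_vfs(ip_link_output):
    """Count VF entries in `ip link show` output for one PF."""
    return sum(1 for line in ip_link_output.splitlines()
               if line.strip().startswith("vf "))

vfs = count_vfs(SAMPLE)
```

If this count is non-zero while the node reports `"openshift.io/intelnics": "0"`, the VFs exist on the host but the device plugin failed to rediscover them, which matches the restart-fixes-it symptom.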
Unfortunately I lost access to the environment over the weekend. However, this is fairly reproducible and can be observed on long running clusters.
I haven't tried it recently and do not have cycles to try it soon either. I would check with QE and ask them if you can reproduce. Thanks!
I think I saw this behaviour on 4.4; we fixed it by resetting the number of VFs to 0 on all nodes / all PFs before deploying the sriov-operator, for the reason below. There is only one shot when creating the policy, at least on versions 4.4 / 4.5. This might need to be fixed, in case it still happens on the latest release.

The flow is as follows:
1. We set the VF count (sriov_numvfs) to the maximum allowed (according to sriov_totalvfs).
2. The sriov operator is deployed.
3. The sriov operator sets all unused PFs to zero VFs, since no policy applies to these PFs yet. A VF count of zero would eventually be reported in SriovNetworkNodeState.
4. We create a new SriovNetworkNodePolicy with numVfs equal to the sriov_totalvfs value.
5. The sriov operator checks whether it should update the number of VFs by comparing the desired numVfs (according to the policy) to the current numVfs configured according to the SriovNetworkNodeState status. If SriovNetworkNodeState hasn't finished updating the actual numVfs in its status yet, the previous value still appears there. The sriov operator then does not create VFs, because the desired state equals the (stale) current state.

The result is that the cluster stays with zero VFs until a new policy is created, or until the network config daemon is restarted. By resetting the VFs before deploying the operator we avoid this race condition. The operator assumes it starts on a clean setup, so this change is natural.
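The race in the flow above can be sketched in a few lines (hypothetical names; the real check lives in the operator's syncNodeState path): the decision to create VFs compares the policy's desired numVfs against the numVfs recorded in the SriovNetworkNodeState *status*, which may lag behind the actual device state.

```python
def needs_update(desired_numvfs, status_numvfs):
    """syncNodeState-style check: only act when desired != reported current."""
    return desired_numvfs != status_numvfs

# The device was just reset to 0 VFs (flow step 3), but the node state
# status has not been refreshed yet and still shows the old value of 64.
actual_numvfs = 0
stale_status_numvfs = 64
policy_numvfs = 64          # flow step 4: policy asks for 64 VFs

# Stale status equals the desired value, so no update is triggered, even
# though the device really has 0 VFs: the node reports no allocatable VFs.
stuck = (not needs_update(policy_numvfs, stale_status_numvfs)
         and actual_numvfs != policy_numvfs)
```

With a fresh status (0 VFs reported), `needs_update(64, 0)` is true and the VFs would be created; the bug only bites in the stale window.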
We can see here the reset (step 3 of the flow above), and then the desired state, but syncNodeState doesn't do anything, since the current state isn't updated yet to reflect the actual zero VFs that exist:

I1231 07:55:24.285589 6012 generic_plugin.go:105] generic-plugin Apply(): desiredState={1452 []}
I1231 07:55:24.285746 6012 utils.go:322] resetSriovDevice(): reset sr-iov device 0000:65:00.0
I1231 07:55:25.504970 6012 utils.go:333] resetSriovDevice(): reset mtu to 1500
I1231 07:55:34.614214 6012 generic_plugin.go:105] generic-plugin Apply(): desiredState={1718 [{0000:65:00.0 64 1500 ens2f0 [{sriov_net vfio-pci 0-63}]}]}
I1231 07:55:34.631605 6012 generic_plugin.go:115] generic-plugin Apply(): lastStat={1452 []}
I1231 07:55:34.631942 6012 utils.go:113] syncNodeState(): no need update interface 0000:65:00.0

Since 4.6 there is nodeStateSyncHandler, which makes this flow less likely to happen. But I did see this on 4.8 a few weeks ago, when undeploying and redeploying the sriov-operator in a loop.
Ref: https://github.com/kubevirt/kubevirtci/pull/513

One more note: I saw that sometimes the PF disappears from /sys/class/net and reappears, especially after moving it between namespaces. This can also cause the behaviour in this BZ, in case there isn't a reconcile of the policy (by syncNodeState), IIUC.

Thanks
> Since 4.6 there is nodeStateSyncHandler which makes the flow less likely to
> happen.
> But i did see this in 4.8 few weeks ago, if i undeploy and redeploy the
> sriov-operator in a loop.

I can see why this happened in 4.4/4.5. But in 4.6, policy updates are handled serially, not in parallel, and a nodeState update is triggered at the end of nodeStateSyncHandler (which is synced between the daemon and the writer), so I'm not sure why we still see this issue.
Or, are you able to test a customized config-daemon image for a possible fix of this issue?
In the 4.4/4.5 releases, sriov-config-daemon handles policy update/remove/add requests in parallel, so there is a chance that a new policy is not correctly applied while a previous policy is still being removed. This is because the config daemon compares the current node state status (nodeState.Status) with the new policy and only applies the policy if there is a mismatch, but nodeState.Status is not guaranteed to be the latest, as it is only updated every 30 seconds or on demand. The periodic policy reconciliation (every 5 minutes) won't fix the issue because it checks the nodeState generation and skips the policy if it matches the last applied one.

In 4.6, this is improved by processing policy update/remove/add requests serially with a workqueue. It should be rare to see the same issue in 4.6 or later releases. Since the fix in 4.6 is not a trivial patch or a clean backport, we are leaning towards not backporting it to 4.5/4.4 unless there is a customer case associated with it.

For the possibly similar issue in 4.8 mentioned in comment #13, we shall track it in a separate bug.
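The serial-workqueue fix described above can be sketched with stdlib primitives (the real implementation uses a Kubernetes client-go workqueue; the event names here are hypothetical): all policy events go through one queue and one worker, so a new policy is never applied while a previous removal is still in flight.

```python
import queue
import threading

events = queue.Queue()   # FIFO workqueue of policy add/update/remove events
applied = []             # order in which events were actually processed

def worker():
    """Single worker: drains the queue strictly one event at a time."""
    while True:
        ev = events.get()
        if ev is None:   # sentinel: shut down
            break
        # Here the real daemon would refresh the node state *before*
        # comparing desired vs. current, then apply the event.
        applied.append(ev)
        events.task_done()

t = threading.Thread(target=worker)
t.start()
for ev in ["remove policy-a", "add policy-b"]:
    events.put(ev)
events.put(None)
t.join()
```

Because processing is strictly ordered, "add policy-b" cannot race with the still-running "remove policy-a" the way the parallel 4.4/4.5 handlers could.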
Closing as CURRENTRELEASE per comment #16.