Description of problem:

While installing an SNO setup, the SR-IOV operator installation gets stuck:

]$ oc logs daemonset/sriov-network-config-daemon --namespace=openshift-sriov-network-operator --container=sriov-network-config-daemon
...
I1125 20:16:32.376476 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:32.376576 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:32.384313 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:32.384381 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:37.384840 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:37.384907 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:37.391466 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:37.391480 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:42.117093 2285905 daemon.go:312] Run(): period refresh
I1125 20:16:42.120240 2285905 daemon.go:972] tryCreateSwitchdevUdevRule()
I1125 20:16:42.120286 2285905 daemon.go:1030] tryCreateNMUdevRule()
I1125 20:16:42.392461 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
I1125 20:16:42.392560 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
E1125 20:16:42.398659 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:42.398658 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.

Version-Release number of selected component (if applicable):
CNV 4.10 on SNO

How reproducible:
Always

Steps to Reproduce:
1. Install OCP on SNO
2. Install CNV on SNO
3. Install the SR-IOV operator

Actual results:
The SR-IOV operator fails to install.

E1125 20:16:42.398659 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:42.398658 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.

Expected results:
The SR-IOV operator installs successfully.

Additional info:
a) Installing the SR-IOV operator before installing CNV succeeds.
b) Installing the SR-IOV operator after installing CNV fails.
The operator reboots the node as part of applying the configuration. Because the virt components refuse to be evicted from the node, it cannot complete that operation.
Jed, I assume this is due to the fact that we still have 2 replicas and PDBs present?
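For reference, the eviction failures above are consistent with PodDisruptionBudgets covering virt-api and virt-controller: on a single node, draining has to evict every replica at once, so any budget requiring at least one available replica blocks the drain indefinitely. A minimal illustrative PDB of that shape (the name and selector label here are assumptions, not taken from the cluster; the real objects are created by the CNV operator):

```yaml
# Illustrative only: shows why an SNO drain is blocked, not the actual CNV-managed object.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: virt-api-pdb          # assumed name
  namespace: openshift-cnv
spec:
  minAvailable: 1             # with every replica on the single SNO node, any drain violates this
  selector:
    matchLabels:
      kubevirt.io: virt-api   # assumed label
```

The budgets actually present can be listed with `oc get pdb -n openshift-cnv`; ones reporting ALLOWED DISRUPTIONS of 0 are the ones blocking the drain.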
]$ oc logs daemonset/sriov-network-config-daemon --namespace=openshift-sriov-network-operator --container=sriov-network-config-daemon | grep "error when evicting"
]$ oc get nodes
NAME                       STATUS   ROLES           AGE     VERSION
node-23.cnvqe.redhat.com   Ready    master,worker   3h18m   v1.22.1+6859754
]$

---

+ sed /home/kbidarka/git_world/cnv-qe-automation/ocp/bm/sriov/10_sriov_network_node_policy_cr.yaml -e 's/^\( \+pfNames\): .*/\1: ["eno1"]/' -e 's/^\( \+rootDevices\): .*/\1: ["0000:19:00.0"]/' -e 's/^\( \+numVfs\): .*/\1: 32/'
+ oc create --filename=-
sriovnetworknodepolicy.sriovnetwork.openshift.io/sriov-network-policy created
+ oc create --filename=/home/kbidarka/git_world/cnv-qe-automation/ocp/bm/sriov/11_sriov_network_cr.yaml
sriovnetwork.sriovnetwork.openshift.io/sriov-network created

---

]$ oc get vmi vm-rhel84-nfs3 -o yaml | grep -A 4 interfaces
    interfaces:
    - interfaceName: eth0
      mac: 02:0c:a5:00:00:00
      name: sriov-net
[kbidarka@localhost sriov]$ virtctl console vm-rhel84-nfs3
Successfully connected to vm-rhel84-nfs3 console. The escape sequence is ^]

Red Hat Enterprise Linux 8.4 (Ootpa)
Kernel 4.18.0-305.30.1.el8_4.x86_64 on an x86_64

Activate the web console with: systemctl enable --now cockpit.socket

vm-rhel84-nfs3 login: cloud-user
Password:
[cloud-user@vm-rhel84-nfs3 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 02:0c:a5:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet xx.yy.zz.aa/24 brd xx.yy.zz.255 scope global dynamic noprefixroute eth0
       valid_lft 1769sec preferred_lft 1769sec
[cloud-user@vm-rhel84-nfs3 ~]$ ping google.com -4
PING google.com (142.250.188.206) 56(84) bytes of data.
64 bytes from iad23s94-in-f14.1e100.net (142.250.188.206): icmp_seq=1 ttl=54 time=7.68 ms
64 bytes from iad23s94-in-f14.1e100.net (142.250.188.206): icmp_seq=2 ttl=54 time=7.60 ms

--- google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 7.601/7.683/7.755/0.054 ms
[cloud-user@vm-rhel84-nfs3 ~]$

---

Once the blocking bug https://bugzilla.redhat.com/show_bug.cgi?id=2026336 was fixed, this bug was resolved as well.

---

VERIFIED with the build:
"image": "registry-proxy.engineering.redhat.com/rh-osbs/iib:146913",
"hcoVersion": "v4.10.0-464"
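For completeness, the SriovNetworkNodePolicy produced by the sed substitutions in the verification steps above would look roughly like the sketch below. The numVfs, pfNames, and rootDevices values come from the sed command; every other field (resourceName, nodeSelector, priority) is an assumption about what the template file contains:

```yaml
# Illustrative reconstruction of 10_sriov_network_node_policy_cr.yaml after patching.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriov-network-policy
  namespace: openshift-sriov-network-operator
spec:
  resourceName: sriov_net          # assumed resource name
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"  # assumed selector
  numVfs: 32                       # from the sed substitution
  nicSelector:
    pfNames: ["eno1"]              # from the sed substitution
    rootDevices: ["0000:19:00.0"]  # from the sed substitution
```

Applying a policy like this is what triggers the node drain and reboot that the PDBs were blocking.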
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0947