Description of problem:

[kbidarka@localhost ocp-cnv-scripts]$ oc get infrastructure.config.openshift.io cluster -o json | jq ".status.infrastructureTopology"
"SingleReplica"

[kbidarka@localhost ocp-cnv-scripts]$ oc get nodes
NAME                 STATUS   ROLES           AGE     VERSION
node-23.redhat.com   Ready    master,worker   4d22h   v1.22.1+f773b8b

[kbidarka@localhost ocp-cnv-scripts]$ oc get pods -n openshift-cnv | grep virt | grep -v virt-template
virt-api-7754f44597-gnbkq          1/1   Running   1               4d21h
virt-api-7754f44597-m6qmd          1/1   Running   1               4d21h
virt-controller-65df6c6bdd-2nznf   1/1   Running   3 (4d3h ago)    4d21h
virt-controller-65df6c6bdd-66vcs   1/1   Running   4 (3d22h ago)   4d21h
virt-handler-9wgpg                 1/1   Running   1               4d21h
virt-operator-5ffff65b57-rsghm     1/1   Running   4 (3d22h ago)   4d21h
virt-operator-5ffff65b57-xlsbr     1/1   Running   3 (4d ago)      4d21h

Summary: On the latest 4.10 CNV setup, we still see multiple replicas of virt-api, virt-controller and virt-operator.

Version-Release number of selected component (if applicable):
CNV 4.10 on SNO

How reproducible:
Install CNV on SNO

Steps to Reproduce:
1. Install CNV on SNO

Actual results:
Multiple replicas of virt-api, virt-controller and virt-operator are running.

Expected results:
A single replica of virt-api, virt-controller and virt-operator.
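For anyone triaging, a minimal sketch of how to read the desired replica counts straight from the Deployments rather than counting pods. The namespace and deployment names are taken from the output above; adjust them if they differ on your cluster.

$ oc get deployment virt-api virt-controller virt-operator -n openshift-cnv \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas

# or, with jq, list every virt-* deployment and its desired replica count
$ oc get deployment -n openshift-cnv -o json \
    | jq -r '.items[] | select(.metadata.name | test("^virt-")) | "\(.metadata.name)\t\(.spec.replicas)"'

Per the expected results of this bug, on an SNO cluster with infrastructureTopology=SingleReplica each of these should report 1.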
The GitHub issue linked to this BZ is resolved on the Virtualization side, so we're re-assigning this BZ to the Installation component. Please let me know if you feel this action was in error.
Another observation when installing an SNO setup concerns the SR-IOV operator:
a) Installing the SR-IOV operator before CNV installation succeeds.
b) Installing the SR-IOV operator after CNV installation fails.

]$ oc logs daemonset/sriov-network-config-daemon --namespace=openshift-sriov-network-operator --container=sriov-network-config-daemon
...
I1125 20:16:32.376476 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:32.376576 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:32.384313 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:32.384381 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:37.384840 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:37.384907 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:37.391466 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:37.391480 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:42.117093 2285905 daemon.go:312] Run(): period refresh
I1125 20:16:42.120240 2285905 daemon.go:972] tryCreateSwitchdevUdevRule()
I1125 20:16:42.120286 2285905 daemon.go:1030] tryCreateNMUdevRule()
I1125 20:16:42.392461 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
I1125 20:16:42.392560 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
E1125 20:16:42.398659 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:42.398658 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
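The eviction errors above come from PodDisruptionBudgets covering virt-api and virt-controller, which the node drain cannot satisfy on a single node, so the SR-IOV config daemon keeps retrying. A quick way to inspect those budgets is sketched below; the namespace comes from the log above and the jq paths assume the usual PDB spec fields (minAvailable may be maxUnavailable on some budgets).

$ oc get pdb -n openshift-cnv

# show minAvailable and how many disruptions are currently allowed
$ oc get pdb -n openshift-cnv -o json \
    | jq -r '.items[] | "\(.metadata.name) minAvailable=\(.spec.minAvailable) disruptionsAllowed=\(.status.disruptionsAllowed)"'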
This will be addressed as an Epic in 4.11 (CNV-13912). Please let me know if you are not OK with having it addressed there.
For virt-operator, the replica count is currently set statically in the CSV at build time, and OLM, at least in the current release, will not scale it down to 1. See: https://github.com/operator-framework/operator-lifecycle-manager/issues/2453

---

We now see a single replica for both virt-api and virt-controller. VERIFIED with the build:
"image": "registry-proxy.engineering.redhat.com/rh-osbs/iib:146913",
"hcoVersion": "v4.10.0-464"
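To confirm where the virt-operator replica count comes from, here is a hedged sketch for reading it out of the installed CSV; it assumes the standard deployment install strategy, and the select() wildcard is used because the exact CSV name varies per build.

$ oc get csv -n openshift-cnv -o json \
    | jq -r '.items[].spec.install.spec.deployments[] | select(.name | test("virt-operator")) | "\(.name) replicas=\(.spec.replicas)"'

# and the replica counts virt-operator reconciles for its operands
$ oc get deployment virt-api virt-controller -n openshift-cnv \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas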
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947