Bug 2026336

Summary: [SNO] We see multiple replicas of virt-api, virt-controller and virt-operator.
Product: Container Native Virtualization (CNV)
Component: Installation
Version: 4.10.0
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Kedar Bidarkar <kbidarka>
Assignee: Simone Tiraboschi <stirabos>
QA Contact: Kedar Bidarkar <kbidarka>
CC: cnv-qe-bugs, kmajcher, stirabos
Fixed In Version: hco-bundle-registry-container-v4.10.0-453
Type: Bug
Last Closed: 2022-03-16 15:56:33 UTC
Bug Blocks: 2031919

Description Kedar Bidarkar 2021-11-24 12:03:38 UTC
Description of problem:

[kbidarka@localhost ocp-cnv-scripts]$ oc get infrastructure.config.openshift.io cluster -o json | jq ".status.infrastructureTopology"
"SingleReplica"

[kbidarka@localhost ocp-cnv-scripts]$ oc get nodes
NAME STATUS ROLES AGE VERSION
node-23.redhat.com Ready master,worker 4d22h v1.22.1+f773b8b


[kbidarka@localhost ocp-cnv-scripts]$ oc get pods -n openshift-cnv | grep virt | grep -v virt-template
virt-api-7754f44597-gnbkq 1/1 Running 1 4d21h
virt-api-7754f44597-m6qmd 1/1 Running 1 4d21h
virt-controller-65df6c6bdd-2nznf 1/1 Running 3 (4d3h ago) 4d21h
virt-controller-65df6c6bdd-66vcs 1/1 Running 4 (3d22h ago) 4d21h
virt-handler-9wgpg 1/1 Running 1 4d21h
virt-operator-5ffff65b57-rsghm 1/1 Running 4 (3d22h ago) 4d21h
virt-operator-5ffff65b57-xlsbr 1/1 Running 3 (4d ago) 4d21h

Summary: With the latest 4.10 CNV setup, we still see multiple replicas of virt-api, virt-controller, and virt-operator.

Version-Release number of selected component (if applicable):
CNV-4.10 on SNO

How reproducible:

Install CNV on SNO

Steps to Reproduce:
1. Install CNV on SNO

Actual results:
We still see multiple replicas of virt-api, virt-controller, and virt-operator.

Expected results:
There should be a single replica each of virt-api, virt-controller, and virt-operator.
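
As a quick check (a sketch; it assumes the deployment names match the pod prefixes shown above), the configured replica counts can be read directly from the deployments:

$ oc get deployment virt-api virt-controller -n openshift-cnv \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.replicas

On SNO, both should report 1 once the fix is in place.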

Additional info:

Comment 1 sgott 2021-11-24 13:25:28 UTC
The github issue linked to this BZ is resolved on the Virtualization side, thus we're re-assigning this BZ to the Installation component. Please let me know if you feel this action was in error.

Comment 2 Kedar Bidarkar 2021-11-25 20:22:45 UTC
Another observation on the SNO setup concerns SR-IOV operator installation:
a) Installing the SR-IOV operator before CNV is installed succeeds.
b) Installing the SR-IOV operator after CNV is installed fails: the config daemon cannot evict the CNV pods, as the log below shows.

]$ oc logs daemonset/sriov-network-config-daemon --namespace=openshift-sriov-network-operator --container=sriov-network-config-daemon
...
I1125 20:16:32.376476 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:32.376576 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:32.384313 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:32.384381 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:37.384840 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:37.384907 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:37.391466 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:37.391480 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:42.117093 2285905 daemon.go:312] Run(): period refresh
I1125 20:16:42.120240 2285905 daemon.go:972] tryCreateSwitchdevUdevRule()
I1125 20:16:42.120286 2285905 daemon.go:1030] tryCreateNMUdevRule()
I1125 20:16:42.392461 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
I1125 20:16:42.392560 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
E1125 20:16:42.398659 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:42.398658 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
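
The evictions above are blocked by the KubeVirt pod disruption budgets, which cannot be satisfied on a single node while it is being drained. A hedged way to inspect them (PDB names vary between versions, so the whole namespace is listed rather than exact names):

$ oc get pdb -n openshift-cnv
$ oc describe pdb -n openshift-cnv

If ALLOWED DISRUPTIONS is 0 for the virt-api/virt-controller budgets, the drain keeps retrying, which matches the loop in the log above.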

Comment 3 Krzysztof Majcher 2021-11-26 14:14:14 UTC
This will be addressed as an Epic in 4.11 CNV-13912
Please let me know if you are not ok with having it addressed there.

Comment 6 Kedar Bidarkar 2021-12-13 18:48:58 UTC
For virt-operator, the replica count is currently set statically in the CSV at build time, and OLM, at least in the current release, will not scale it down to 1.
See: https://github.com/operator-framework/operator-lifecycle-manager/issues/2453
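
For reference, the replica count baked into the CSV can be inspected along these lines (a sketch; <csv-name> is a placeholder for whatever installed CSV the first command returns):

$ oc get csv -n openshift-cnv
$ oc get csv <csv-name> -n openshift-cnv \
    -o jsonpath='{.spec.install.spec.deployments[?(@.name=="virt-operator")].spec.replicas}'

Since OLM reconciles the deployments defined in the CSV install strategy, a manual scale-down of virt-operator is expected to be reverted, which is what the linked issue is about.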

---

virt-api and virt-controller now each run with a single replica (SingleReplica).


VERIFIED with the build:
"image": "registry-proxy.engineering.redhat.com/rh-osbs/iib:146913",
"hcoVersion": "v4.10.0-464"

Comment 11 errata-xmlrpc 2022-03-16 15:56:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947