Bug 2026336 - [SNO] We see multiple replicas of virt-api, virt-controller and virt-operator.
Summary: [SNO] We see multiple replicas of virt-api, virt-controller and virt-operator.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Installation
Version: 4.10.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Simone Tiraboschi
QA Contact: Kedar Bidarkar
URL:
Whiteboard:
Depends On:
Blocks: 2031919
 
Reported: 2021-11-24 12:03 UTC by Kedar Bidarkar
Modified: 2022-03-16 15:56 UTC
CC List: 3 users

Fixed In Version: hco-bundle-registry-container-v4.10.0-453
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-16 15:56:33 UTC
Target Upstream Version:
Embargoed:




Links:
  Github kubevirt/hyperconverged-cluster-operator pull 1639 (Merged): Configure Kubevirt for SNO mode (last updated 2021-12-10 10:20:20 UTC)
  Github kubevirt/kubevirt pull 6663 (Merged): Make the number of replicas for virt- pods configurable (last updated 2021-11-24 13:23:42 UTC)
  Red Hat Product Errata RHSA-2022:0947 (last updated 2022-03-16 15:56:49 UTC)

Description Kedar Bidarkar 2021-11-24 12:03:38 UTC
Description of problem:

[kbidarka@localhost ocp-cnv-scripts]$ oc get infrastructure.config.openshift.io cluster -o json | jq ".status.infrastructureTopology"
"SingleReplica"

[kbidarka@localhost ocp-cnv-scripts]$ oc get nodes
NAME STATUS ROLES AGE VERSION
node-23.redhat.com Ready master,worker 4d22h v1.22.1+f773b8b


[kbidarka@localhost ocp-cnv-scripts]$ oc get pods -n openshift-cnv | grep virt | grep -v virt-template
virt-api-7754f44597-gnbkq 1/1 Running 1 4d21h
virt-api-7754f44597-m6qmd 1/1 Running 1 4d21h
virt-controller-65df6c6bdd-2nznf 1/1 Running 3 (4d3h ago) 4d21h
virt-controller-65df6c6bdd-66vcs 1/1 Running 4 (3d22h ago) 4d21h
virt-handler-9wgpg 1/1 Running 1 4d21h
virt-operator-5ffff65b57-rsghm 1/1 Running 4 (3d22h ago) 4d21h
virt-operator-5ffff65b57-xlsbr 1/1 Running 3 (4d ago) 4d21h

Summary: With the latest 4.10 CNV setup, we still see multiple replicas of virt-api, virt-controller and virt-operator.

Version-Release number of selected component (if applicable):
CNV-4.10 on SNO

How reproducible:

Install CNV on SNO

Steps to Reproduce:
1. Install CNV on SNO

Actual results:
We still see multiple replicas (two each) of virt-api, virt-controller and virt-operator.

Expected results:
We should have a single replica each of virt-api, virt-controller and virt-operator.

Additional info:
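A quick way to confirm the replica counts directly on the deployments (standard oc usage; the deployment names are inferred from the pod names listed above):

$ oc get deployments virt-api virt-controller virt-operator -n openshift-cnv \
    -o custom-columns=NAME:.metadata.name,DESIRED:.spec.replicas,READY:.status.readyReplicas

On an SNO cluster each of these should report a single desired replica.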

Comment 1 sgott 2021-11-24 13:25:28 UTC
The GitHub pull request linked to this BZ is resolved on the Virtualization side, so we're reassigning this BZ to the Installation component. Please let me know if you feel this action was in error.
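
For reference, the kubevirt pull request linked above makes the infra replica count configurable through the KubeVirt CR. A minimal sketch of setting it by hand, assuming the field added there is spec.infra.replicas and that the CR in openshift-cnv is named kubevirt-kubevirt-hyperconverged (in a CNV deployment HCO owns and reconciles this CR, so this is illustrative only and would normally be handled by the operator):

$ oc patch kubevirt kubevirt-kubevirt-hyperconverged -n openshift-cnv \
    --type merge -p '{"spec":{"infra":{"replicas":1}}}'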

Comment 2 Kedar Bidarkar 2021-11-25 20:22:45 UTC
Another observation when installing on an SNO setup concerns the SR-IOV operator:
a) Installing the SR-IOV operator before CNV succeeds.
b) Installing the SR-IOV operator after CNV fails.

$ oc logs daemonset/sriov-network-config-daemon --namespace=openshift-sriov-network-operator --container=sriov-network-config-daemon
...
I1125 20:16:32.376476 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:32.376576 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:32.384313 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:32.384381 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:37.384840 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
I1125 20:16:37.384907 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
E1125 20:16:37.391466 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:37.391480 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I1125 20:16:42.117093 2285905 daemon.go:312] Run(): period refresh
I1125 20:16:42.120240 2285905 daemon.go:972] tryCreateSwitchdevUdevRule()
I1125 20:16:42.120286 2285905 daemon.go:1030] tryCreateNMUdevRule()
I1125 20:16:42.392461 2285905 daemon.go:133] evicting pod openshift-cnv/virt-controller-8464dfc565-9ch8s
I1125 20:16:42.392560 2285905 daemon.go:133] evicting pod openshift-cnv/virt-api-5b9d4b6767-n8bc4
E1125 20:16:42.398659 2285905 daemon.go:133] error when evicting pods/"virt-api-5b9d4b6767-n8bc4" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E1125 20:16:42.398658 2285905 daemon.go:133] error when evicting pods/"virt-controller-8464dfc565-9ch8s" -n "openshift-cnv" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
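
The eviction failures above indicate that the node drain is blocked by the PodDisruptionBudgets covering virt-api and virt-controller. They can be inspected with standard oc output options:

$ oc get pdb -n openshift-cnv \
    -o custom-columns=NAME:.metadata.name,MIN-AVAILABLE:.spec.minAvailable,ALLOWED-DISRUPTIONS:.status.disruptionsAllowed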

Comment 3 Krzysztof Majcher 2021-11-26 14:14:14 UTC
This will be addressed as an Epic in 4.11 (CNV-13912).
Please let me know if you are not OK with having it addressed there.

Comment 6 Kedar Bidarkar 2021-12-13 18:48:58 UTC
For virt-operator, the replica count is currently set statically in the CSV at build time, and OLM, at least in the current release, will not scale it down to 1.
See: https://github.com/operator-framework/operator-lifecycle-manager/issues/2453
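
One way to confirm what the CSV pins for virt-operator is to read the replica count out of its install strategy (standard ClusterServiceVersion fields; the jsonpath lists every deployment the CSV manages together with its configured replica count):

$ oc get csv -n openshift-cnv \
    -o jsonpath='{range .items[*].spec.install.spec.deployments[*]}{.name}{"\t"}{.spec.replicas}{"\n"}{end}'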

---

We now see a single replica for both virt-api and virt-controller.


VERIFIED with the build:
"image": "registry-proxy.engineering.redhat.com/rh-osbs/iib:146913",
"hcoVersion": "v4.10.0-464"

Comment 11 errata-xmlrpc 2022-03-16 15:56:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947

