Bug 1874938

Summary: OLM generates many ReplicaSets when it is caught in a loop where it updates the Deployment
Product: OpenShift Container Platform
Component: OLM
Sub component: OLM
Version: 4.5
Target Release: 4.6.0
Reporter: Alexander Greene <agreene>
Assignee: Alexander Greene <agreene>
QA Contact: Jian Zhang <jiazha>
CC: krizza, nhale
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Last Closed: 2020-09-23 00:56:47 UTC

Description Alexander Greene 2020-09-02 15:51:12 UTC
Description of problem:
An operator that is configured to use OLM admission webhooks in its CSV (.spec.webhookdefinitions; a minimal example is sketched after this list) exhibits the following behavior:
1. During the installation phase, the operator pod is repeatedly terminated and a new pod is created alongside the old one; this happens dozens of times during the deployment.
2. Each time (1) occurs, a new ReplicaSet is created, causing the previously active ReplicaSet to scale down to 0. All inactive ReplicaSets (with desired replicas = 0) remain in the namespace.
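
For reference, a CSV webhook definition of roughly this shape is what puts the operator's Deployment under OLM's webhook handling. The field values below are illustrative only and are not taken from the HCO bundle:

# Illustrative sketch of a ClusterServiceVersion webhook definition
# (names, port, and rules are hypothetical, not from the HCO bundle).
# OLM manages the Deployment named in deploymentName and injects the
# serving certificate used by the webhook.
spec:
  webhookdefinitions:
  - type: ValidatingAdmissionWebhook
    generateName: validate.hco.example.com
    deploymentName: hco-operator
    containerPort: 4343
    admissionReviewVersions: ["v1beta1", "v1"]
    failurePolicy: Fail
    sideEffects: None
    webhookPath: /validate
    rules:
    - apiGroups: ["hco.kubevirt.io"]
      apiVersions: ["v1beta1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["hyperconvergeds"]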
Version-Release number of selected component (if applicable):
OCP 4.5.3
PackageServer 0.15.1

How reproducible:
100%

Steps to Reproduce:
Positive flow:
1. Create a catalog source using the following bundle image (a sample CatalogSource manifest is sketched after the output below):
quay.io/orenc/hco-container-registry:olm-webhooks
2. Install "KubeVirt HyperConverged Cluster Operator" from channel 1.2.0 using OperatorHub (or manually with the CLI).
3. Create the "HyperConverged" CR (default settings).
4. Watch the hco-operator pod being terminated and recreated while new ReplicaSets are created:
$ oc get rs -n kubevirt-hyperconverged
NAME                                            DESIRED   CURRENT   READY   AGE
cdi-apiserver-7dcb77db79                        1         1         1       4m28s
cdi-deployment-7f999c755                        1         1         1       4m28s
cdi-operator-54d5b958d6                         1         1         1       5m2s
cdi-uploadproxy-85f76cc48b                      1         1         1       4m27s
cluster-network-addons-operator-7658f658d4      1         1         1       5m3s
hco-operator-5476bf64f5                         0         0         0       2m11s
hco-operator-54dd9fcf59                         1         1         1       15s
hco-operator-56c4c6866f                         0         0         0       96s
hco-operator-59f65f4559                         0         0         0       3m34s
hco-operator-5bb486777c                         0         0         0       2m47s
hco-operator-64f4cfb7bb                         0         0         0       18s
hco-operator-6978d5bb9f                         0         0         0       61s
hco-operator-7995844456                         0         0         0       3m32s
hco-operator-7b69cf7c54                         0         0         0       2m49s
hco-operator-7b95cc76d9                         0         0         0       4m23s
hco-operator-cc87fccb8                          0         0         0       5m1s
hostpath-provisioner-operator-79cc779987        1         1         1       5m2s
kubemacpool-mac-controller-manager-6c8c6557c5   2         2         2       4m30s
kubevirt-ssp-operator-767c7dff98                1         1         1       5m2s
nmstate-webhook-7fcdbdb77d                      2         2         2       4m29s
virt-operator-695d9b7659                        2         2         2       5m3s
virt-template-validator-76db69664c              2         2         2       4m6s
vm-import-controller-785cb6d578                 1         1         0       4m30s
vm-import-operator-647cff486f                   1         1         1       5m2s
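
For step 1, a CatalogSource along these lines should be sufficient; metadata.name, displayName, and the target namespace are assumptions, only the image comes from this report:

# Sketch of a CatalogSource for step 1. metadata.name, displayName, and
# the namespace are assumptions; the image is taken from the report above.
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: hco-webhook-catalog
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/orenc/hco-container-registry:olm-webhooks
  displayName: HCO webhook test catalog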


Actual results:
The hco-operator pod is terminated and recreated numerous times by OLM during installation, and many ReplicaSets are created as a result.

Expected results:
If the Deployment is updated, only the previous ReplicaSet should remain.
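
One way to check the expected state is to count the scaled-down hco-operator ReplicaSets. The command below is a sketch using the namespace and Deployment name from this report; in the expected state it should print at most 1:

# Count hco-operator ReplicaSets that have been scaled down to 0 replicas.
$ oc get rs -n kubevirt-hyperconverged --no-headers | awk '/^hco-operator-/ && $2 == 0' | wc -l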

Comment 1 Alexander Greene 2020-09-22 18:31:10 UTC
I am moving this back to a 4.6.0 bug because the PR that implements this fix was merged in 4.6.