Bug 1666635

Summary:	openshift-apiserver and openshift-controller-manager pods restart when managementState is updated to an invalid value
Product:	OpenShift Container Platform	Reporter:	zhou ying <yinzhou>
Component:	Master	Assignee:	Michal Fojtik <mfojtik>
Status:	CLOSED ERRATA	QA Contact:	Xingxing Xia <xxia>
Severity:	medium	Docs Contact:
Priority:	high
Version:	4.1.0	CC:	aos-bugs, jokerman, mmccomas
Target Milestone:	---
Target Release:	4.1.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-06-04 10:41:55 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description zhou ying 2019-01-16 09:06:01 UTC

Description of problem:
Updating the managementState of openshiftcontrollermanageroperatorconfig to an arbitrary string recreates the pods in project openshift-controller-manager.
Updating the managementState of OpenShiftAPIServerOperatorConfig to an arbitrary string recreates the pods in project openshift-apiserver.


Version-Release number of selected component (if applicable):
Cluster version is 4.0.0-0.nightly-2019-01-15-010905

How reproducible:
Always

Steps to Reproduce:
1. Update the managementState of openshiftcontrollermanageroperatorconfig to an arbitrary string.
2. Update the managementState of OpenShiftAPIServerOperatorConfig to an arbitrary string.

Actual results:
1. The update succeeds and the pods in project openshift-controller-manager are recreated.
2. The update succeeds and the pods in project openshift-apiserver are recreated.


Expected results:
1~2: The value of spec.managementState should be restricted to "Managed" and "Unmanaged", and the pods in the affected projects should not be recreated.



Additional info:

Comment 1 Michal Fojtik 2019-02-20 11:26:38 UTC

1~2: The value of spec.managementState should be restricted to "Managed" and "Unmanaged", and the pods in the affected projects should not be recreated.

This would require CR validation on the API server side, which we don't have yet. We can validate in the operator and report an error condition when the field
is set to an unknown value.
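
The operator-side check described in this comment could look roughly like the sketch below. It is a minimal illustration, not the code from the linked pull requests: the type `ManagementState`, the `Warning` struct, and the function `checkManagementState` are hypothetical names, and the accepted values are limited to the two named in this bug (the real API may define more). The reason and message strings mirror the warning event reported in Comment 6.

```go
package main

import "fmt"

// ManagementState is a hypothetical stand-in for the type of spec.managementState.
type ManagementState string

const (
	Managed   ManagementState = "Managed"
	Unmanaged ManagementState = "Unmanaged"
)

// Warning is a hypothetical, simplified representation of a warning event.
type Warning struct {
	Reason  string
	Message string
}

// checkManagementState reports whether the operator should go on to sync its
// operand. For an unrecognized value it returns false plus a warning, so the
// operand pods are left untouched instead of being recreated.
func checkManagementState(state ManagementState) (bool, *Warning) {
	switch state {
	case Managed:
		return true, nil
	case Unmanaged:
		return false, nil
	default:
		return false, &Warning{
			Reason:  "ManagementStateUnknown",
			Message: fmt.Sprintf("Unrecognized operator management state %q", string(state)),
		}
	}
}

func main() {
	for _, s := range []ManagementState{"Managed", "Unmanaged", "Managed!@#"} {
		proceed, w := checkManagementState(s)
		if w != nil {
			fmt.Printf("%s: %s\n", w.Reason, w.Message)
			continue
		}
		fmt.Printf("%s: proceed=%v\n", s, proceed)
	}
}
```

Of the three sample values, only "Managed!@#" yields a warning. The point of the design is that an unknown value short-circuits the sync loop rather than being treated as a configuration change that triggers a new rollout.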

Comment 2 Michal Fojtik 2019-02-20 11:57:13 UTC

https://github.com/openshift/library-go/pull/261
https://github.com/openshift/cluster-kube-controller-manager-operator/pull/168
https://github.com/openshift/cluster-kube-apiserver-operator/pull/272
https://github.com/openshift/cluster-openshift-apiserver-operator/pull/147

Moving to QE as these will likely merge today.

Comment 3 zhou ying 2019-02-21 02:12:15 UTC

https://github.com/openshift/cluster-kube-controller-manager-operator/pull/168 is not merged yet, so changing the status to MODIFIED.

Comment 6 zhou ying 2019-02-26 03:14:36 UTC

Confirmed with OCP 4.0.0-0.nightly-2019-02-25-234632: all the master team operators now emit a warning event for an unrecognized management state, and the pods are not recreated:

[root@preserved-yinzhou-rhel-1 0221]# oc project openshift-kube-controller-manager-operator
Now using project "openshift-kube-controller-manager-operator" on server "https://api.qe-yinzhou.qe.devcluster.openshift.com:6443".

[root@preserved-yinzhou-rhel-1 0221]# oc get events
LAST SEEN   TYPE      REASON                   KIND         MESSAGE
...
5m36s       Warning   ManagementStateUnknown   Deployment   Unrecognized operator management state "Managed!@#"
46m         Normal    LeaderElection           ConfigMap    b573d757-396d-11e9-8835-0a580a820008 became leader


[root@preserved-yinzhou-rhel-1 0221]# oc get po -n openshift-kube-controller-manager
NAME                                                                      READY   STATUS      RESTARTS   AGE
...
kube-controller-manager-ip-10-0-140-63.ap-northeast-1.compute.internal    1/1     Running     3          37m
kube-controller-manager-ip-10-0-150-12.ap-northeast-1.compute.internal    1/1     Running     3          36m
kube-controller-manager-ip-10-0-160-213.ap-northeast-1.compute.internal   1/1     Running     3          34m
revision-pruner-2-ip-10-0-140-63.ap-northeast-1.compute.internal          0/1     Completed   0          45m
revision-pruner-3-ip-10-0-140-63.ap-northeast-1.compute.internal          0/1     Completed   0          37m
[root@preserved-yinzhou-rhel-1 0221]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.0.0-0.nightly-2019-02-25-234632   True        False         26m     Cluster version is 4.0.0-0.nightly-2019-02-25-234632

Comment 9 errata-xmlrpc 2019-06-04 10:41:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758