Description of problem:
The servicemonitor of dns-operator is missing after upgrading to 4.3.

Version-Release number of selected component (if applicable):
Upgrade from 4.2 to 4.3.0-0.nightly-2019-12-08-215349

How reproducible:
100%

Steps to Reproduce:
1. Upgrade a 4.2 cluster to 4.3
2. oc get servicemonitor -n openshift-dns-operator
3.

Actual results:
No resources found.

Expected results:
Should be the same as on a freshly installed 4.3 cluster, as below:
$ oc get servicemonitor -n openshift-dns-operator
NAME           AGE
dns-operator   7h22m

Additional info:
Initial investigation:

$ oc get clusterversions.config.openshift.io -w
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.9     True        True          12h     Working towards 4.3.0-0.ci-2019-12-09-151850: 13% complete
version   4.2.9     True        True          12h     Unable to apply 4.3.0-0.ci-2019-12-09-151850: the cluster operator kube-apiserver is degraded

$ kubectl get clusteroperators.config.openshift.io
NAME                                       VERSION                        AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.3.0-0.ci-2019-12-09-151850   True        False         False      12h
cloud-credential                           4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
cluster-autoscaler                         4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
console                                    4.3.0-0.ci-2019-12-09-151850   True        False         False      12h
dns                                        4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
image-registry                             4.3.0-0.ci-2019-12-09-151850   True        False         False      11h
ingress                                    4.3.0-0.ci-2019-12-09-151850   True        False         False      11h
insights                                   4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
kube-apiserver                             4.3.0-0.ci-2019-12-09-151850   True        False         True       13h
kube-controller-manager                    4.3.0-0.ci-2019-12-09-151850   True        False         True       13h
kube-scheduler                             4.3.0-0.ci-2019-12-09-151850   True        False         True       13h
machine-api                                4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
machine-config                             4.2.9                          False       True          True       11h
marketplace                                4.3.0-0.ci-2019-12-09-151850   True        False         False      11h
monitoring                                 4.3.0-0.ci-2019-12-09-151850   False       True          True       11h
network                                    4.3.0-0.ci-2019-12-09-151850   True        True          True       13h
node-tuning                                4.3.0-0.ci-2019-12-09-151850   True        False         False      12h
openshift-apiserver                        4.3.0-0.ci-2019-12-09-151850   True        False         False      11h
openshift-controller-manager               4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
openshift-samples                          4.3.0-0.ci-2019-12-09-151850   True        False         False      12h
operator-lifecycle-manager                 4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
operator-lifecycle-manager-catalog         4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
operator-lifecycle-manager-packageserver   4.3.0-0.ci-2019-12-09-151850   True        False         False      11h
service-ca                                 4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
service-catalog-apiserver                  4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
service-catalog-controller-manager         4.3.0-0.ci-2019-12-09-151850   True        False         False      13h
storage                                    4.3.0-0.ci-2019-12-09-151850   True        False         False      12h

$ kubectl describe clusteroperators.config.openshift.io kube-apiserver
Name:         kube-apiserver
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2019-12-09T18:16:17Z
  Generation:          1
  Resource Version:    49677
  Self Link:           /apis/config.openshift.io/v1/clusteroperators/kube-apiserver
  UID:                 fe52edc3-1aaf-11ea-b93c-0a256f9a9963
Spec:
Status:
  Conditions:
    Last Transition Time:  2019-12-09T19:35:11Z
    Message:               NodeControllerDegraded: The master node(s) "ip-10-0-130-156.ec2.internal" not ready
    Reason:                NodeControllerDegradedMasterNodesReady
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2019-12-09T19:17:04Z
    Message:               Progressing: 3 nodes are at revision 8
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2019-12-09T18:18:47Z
    Message:               Available: 3 nodes are active; 3 nodes are at revision 8
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2019-12-09T18:16:18Z
    Reason:                AsExpected
    Status:                True
    Type:                  Upgradeable
  Extension:               <nil>
  Related Objects:
    Group:     operator.openshift.io
    Name:      cluster
    Resource:  kubeapiservers
    Group:
    Name:      openshift-config
    Resource:  namespaces
    Group:
    Name:      openshift-config-managed
    Resource:  namespaces
    Group:
    Name:      openshift-kube-apiserver-operator
    Resource:  namespaces
    Group:
    Name:      openshift-kube-apiserver
    Resource:  namespaces
  Versions:
    Name:     raw-internal
    Version:  4.3.0-0.ci-2019-12-09-151850
    Name:     operator
    Version:  4.3.0-0.ci-2019-12-09-151850
    Name:     kube-apiserver
    Version:  1.16.2
Events:       <none>

$ oc get nodes
NAME                           STATUS                        ROLES    AGE   VERSION
ip-10-0-130-124.ec2.internal   NotReady,SchedulingDisabled   worker   13h   v1.14.6+31a56cf75
ip-10-0-130-156.ec2.internal   NotReady,SchedulingDisabled   master   13h   v1.14.6+31a56cf75
ip-10-0-149-216.ec2.internal   Ready                         master   13h   v1.14.6+31a56cf75
ip-10-0-150-125.ec2.internal   Ready                         worker   13h   v1.14.6+31a56cf75
ip-10-0-170-76.ec2.internal    Ready                         master   13h   v1.14.6+31a56cf75
ip-10-0-175-242.ec2.internal   Ready                         worker   13h   v1.14.6+31a56cf75
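
When an upgrade stalls like this, it can help to reduce the clusteroperators listing to just the operators reporting Degraded=True. The following is a minimal sketch, not part of the original investigation; the function name degraded_operators is hypothetical, and it assumes the six-column layout shown above (NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE).

```shell
# Hypothetical helper: reads `oc get clusteroperators` output on stdin and
# prints the name of every operator whose DEGRADED column (field 5) is True.
# Assumes the column order NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE.
degraded_operators() {
    # NR > 1 skips the header row.
    awk 'NR > 1 && $5 == "True" { print $1 }'
}
```

It could be used as: oc get clusteroperators.config.openshift.io | degraded_operators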
I did an upgrade from 4.2 to operator: 4.3.0-0.ci-2019-11-22-122829, which worked fine; the upgrade completed and there were no errors from:

$ oc get rolebindings prometheus-k8s -n openshift-ingress-operator
NAME             AGE
prometheus-k8s   154m

$ oc get servicemonitor -n openshift-dns-operator
NAME           AGE
dns-operator   154m
A further upgrade to 4.3.0-0.ci-2019-12-05-183852 has stalled:

$ oc get nodes
NAME                           STATUS                        ROLES    AGE     VERSION
ip-10-0-133-109.ec2.internal   Ready                         master   4h40m   v1.16.2
ip-10-0-135-197.ec2.internal   Ready                         worker   4h35m   v1.16.2
ip-10-0-144-223.ec2.internal   Ready                         worker   4h35m   v1.16.2
ip-10-0-154-107.ec2.internal   NotReady,SchedulingDisabled   master   4h40m   v1.16.2
ip-10-0-168-143.ec2.internal   Ready                         master   4h40m   v1.16.2
ip-10-0-175-228.ec2.internal   NotReady,SchedulingDisabled   worker   4h35m   v1.16.2

aim@spicy:~/clusters/openshift-4.2.9 $ oc get clusterversions.config.openshift.io
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.ci-2019-11-22-122829   True        True          113m    Working towards 4.3.0-0.ci-2019-12-05-183852: 13% complete

aim@spicy:~/clusters/openshift-4.2.9 $ oc get clusteroperators.config.openshift.io
NAME                                       VERSION                        AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.3.0-0.ci-2019-12-05-183852   True        False         False      4h26m
cloud-credential                           4.3.0-0.ci-2019-12-05-183852   True        False         False      4h39m
cluster-autoscaler                         4.3.0-0.ci-2019-12-05-183852   True        False         False      4h32m
console                                    4.3.0-0.ci-2019-12-05-183852   True        False         False      103m
dns                                        4.3.0-0.ci-2019-12-05-183852   True        False         False      4h39m
image-registry                             4.3.0-0.ci-2019-12-05-183852   True        False         False      91m
ingress                                    4.3.0-0.ci-2019-12-05-183852   True        False         False      163m
insights                                   4.3.0-0.ci-2019-12-05-183852   True        False         False      4h39m
kube-apiserver                             4.3.0-0.ci-2019-12-05-183852   True        False         True       4h38m
kube-controller-manager                    4.3.0-0.ci-2019-12-05-183852   True        False         True       4h37m
kube-scheduler                             4.3.0-0.ci-2019-12-05-183852   True        False         True       4h36m
machine-api                                4.3.0-0.ci-2019-12-05-183852   True        False         False      4h39m
machine-config                             4.3.0-0.ci-2019-11-22-122829   False       True          True       97m
marketplace                                4.3.0-0.ci-2019-12-05-183852   True        False         False      104m
monitoring                                 4.3.0-0.ci-2019-12-05-183852   False       True          True       89m
network                                    4.3.0-0.ci-2019-12-05-183852   True        True          True       4h38m
node-tuning                                4.3.0-0.ci-2019-12-05-183852   True        False         False      91m
openshift-apiserver                        4.3.0-0.ci-2019-12-05-183852   True        False         False      89m
openshift-controller-manager               4.3.0-0.ci-2019-12-05-183852   True        False         False      4h38m
openshift-samples                          4.3.0-0.ci-2019-12-05-183852   True        False         False      105m
operator-lifecycle-manager                 4.3.0-0.ci-2019-12-05-183852   True        False         False      4h38m
operator-lifecycle-manager-catalog         4.3.0-0.ci-2019-12-05-183852   True        False         False      4h38m
operator-lifecycle-manager-packageserver   4.3.0-0.ci-2019-12-05-183852   True        False         False      91m
service-ca                                 4.3.0-0.ci-2019-12-05-183852   True        False         False      4h39m
service-catalog-apiserver                  4.3.0-0.ci-2019-12-05-183852   True        False         False      4h35m
service-catalog-controller-manager         4.3.0-0.ci-2019-12-05-183852   True        False         False      4h35m
storage                                    4.3.0-0.ci-2019-12-05-183852   True        False         False      106m

The bug says the following resources are not available, but I still (currently) see them:

$ oc get servicemonitor -n openshift-dns-operator
NAME           AGE
dns-operator   157m

$ oc get rolebindings prometheus-k8s -n openshift-ingress-operator
NAME             AGE
prometheus-k8s   157m

And logging into the console or prometheus works, so routing is not affected.
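
To re-run the two checks above after each upgrade step, they can be wrapped in a small function. This is only a sketch under assumptions: the helper name resource_present is hypothetical, and it relies on `oc get KIND NAME -n NAMESPACE` exiting non-zero when the named resource does not exist.

```shell
# Hypothetical helper: reports whether a named resource exists in a namespace.
# Arguments: $1 = resource kind, $2 = resource name, $3 = namespace.
# Returns 0 when `oc get` succeeds, 1 otherwise.
resource_present() {
    if oc get "$1" "$2" -n "$3" >/dev/null 2>&1; then
        echo "present: $1/$2 in $3"
    else
        echo "MISSING: $1/$2 in $3"
        return 1
    fi
}
```

For the two resources discussed in this bug, the calls would be:
resource_present servicemonitor dns-operator openshift-dns-operator
resource_present rolebinding prometheus-k8s openshift-ingress-operator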
*** This bug has been marked as a duplicate of bug 1781062 ***