Description of problem:
The ingress operator was observed failing to integrate with metrics and reporting status sync errors; the only recovery was to restart the operator. This is the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1687640, which we thought was fixed, but the fix was not effective in all cases.

Example error output from an Azure cluster:

2019-06-05T13:16:43.0651753Z 2019-06-05T13:16:43.065Z ERROR operator.init.controller-runtime.controller controller/controller.go:217 Reconciler error {"controller": "operator-controller", "request": "openshift-ingress-operator/default", "error": "failed to ensure ingresscontroller: failed to integrate metrics with openshift-monitoring for ingresscontroller default: failed to ensure servicemonitor for default: no matches for kind \"ServiceMonitor\" in version \"monitoring.coreos.com/v1\"", "errorCauses": [{"error": "failed to ensure ingresscontroller: failed to integrate metrics with openshift-monitoring for ingresscontroller default: failed to ensure servicemonitor for default: no matches for kind \"ServiceMonitor\" in version \"monitoring.coreos.com/v1\""}]}

Version-Release number of selected component (if applicable):
4.2.0-0.ci-2019-06-04-085838

How reproducible:
Create a new cluster with the installer.

Actual results:
Sometimes the ingress operator reports these errors indefinitely, depending on the order in which it starts relative to the prometheus operator.

Expected results:
The operator should eventually fix itself.

Additional info:
Verified with 4.2.0-0.nightly-2019-06-25-003324; the issue has been fixed. The ingress operator reported the errors at first, but eventually fixed itself.

$ oc -n openshift-ingress get servicemonitor
NAME             AGE
router-default   3h42m

2019-06-25T06:22:02.743Z ERROR operator.init.controller-runtime.controller controller/controller.go:212 Reconciler error {"controller": "ingress_controller", "request": "openshift-ingress-operator/default", "error": "failed to ensure ingresscontroller: failed to integrate metrics with openshift-monitoring for ingresscontroller default: failed to ensure servicemonitor for default: no matches for kind \"ServiceMonitor\" in version \"monitoring.coreos.com/v1\"", "errorCauses": [{"error": "failed to ensure ingresscontroller: failed to integrate metrics with openshift-monitoring for ingresscontroller default: failed to ensure servicemonitor for default: no matches for kind \"ServiceMonitor\" in version \"monitoring.coreos.com/v1\""}]}
<---snip--->
2019-06-25T06:24:03.512Z ERROR operator.init.controller-runtime.controller controller/controller.go:212 Reconciler error {"controller": "ingress_controller", "request": "openshift-ingress-operator/default", "error": "failed to ensure ingresscontroller: failed to integrate metrics with openshift-monitoring for ingresscontroller default: failed to ensure servicemonitor for default: no matches for kind \"ServiceMonitor\" in version \"monitoring.coreos.com/v1\"", "errorCauses": [{"error": "failed to ensure ingresscontroller: failed to integrate metrics with openshift-monitoring for ingresscontroller default: failed to ensure servicemonitor for default: no matches for kind \"ServiceMonitor\" in version \"monitoring.coreos.com/v1\""}]}
2019-06-25T06:24:31.702Z INFO operator.controller controller/monitoring.go:30 created servicemonitor {"namespace": "openshift-ingress", "name": "router-default"}
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922