Description of problem:
The cluster-monitoring-operator pod is in CrashLoopBackOff.

Version-Release number of selected component (if applicable):
openshift v3.11.18

How reproducible:
Always

Steps to Reproduce:
1. Install OCP v3.11

Actual results:
Installation succeeded, but the cluster-monitoring-operator pod is in CrashLoopBackOff.

# oc get pod -n openshift-monitoring
NAME                                           READY     STATUS             RESTARTS   AGE
cluster-monitoring-operator-56bb5946c4-d5b5b   0/1       CrashLoopBackOff   23         2h

# oc describe pod/cluster-monitoring-operator-56bb5946c4-gb7lf
Events:
  Type     Reason     Age                From                                   Message
  ----     ------     ----               ----                                   -------
  Normal   Scheduled  10m                default-scheduler                      Successfully assigned openshift-monitoring/cluster-monitoring-operator-56bb5946c4-gb7lf to ip-172-18-8-81.ec2.internal
  Normal   Pulling    10m                kubelet, ip-172-18-8-81.ec2.internal   pulling image "registry.reg-aws.openshift.com:443/openshift3/ose-cluster-monitoring-operator:v3.11"
  Normal   Pulled     10m                kubelet, ip-172-18-8-81.ec2.internal   Successfully pulled image "registry.reg-aws.openshift.com:443/openshift3/ose-cluster-monitoring-operator:v3.11"
  Normal   Created    1m (x5 over 10m)   kubelet, ip-172-18-8-81.ec2.internal   Created container
  Normal   Pulled     1m (x4 over 5m)    kubelet, ip-172-18-8-81.ec2.internal   Container image "registry.reg-aws.openshift.com:443/openshift3/ose-cluster-monitoring-operator:v3.11" already present on machine
  Normal   Started    1m (x5 over 10m)   kubelet, ip-172-18-8-81.ec2.internal   Started container
  Warning  BackOff    3s (x8 over 4m)    kubelet, ip-172-18-8-81.ec2.internal   Back-off restarting failed container

# oc logs pod/cluster-monitoring-operator-56bb5946c4-gb7lf
I1002 04:52:59.433584       1 decoder.go:224] decoding stream as YAML
I1002 04:53:00.624866       1 tasks.go:37] running task Updating Telemeter client
I1002 04:53:00.624964       1 decoder.go:224] decoding stream as YAML
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x10d344d]

goroutine 13 [running]:
github.com/openshift/cluster-monitoring-operator/pkg/manifests.(*Factory).TelemeterClientServiceMonitor(0xc4203d4a20, 0xc420098420, 0xc4208d3cc0, 0x6ae93a)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/manifests/manifests.go:1441 +0x11d
github.com/openshift/cluster-monitoring-operator/pkg/tasks.(*TelemeterClientTask).Run(0xc4203e10a0, 0x1, 0x1)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/tasks/telemeter.go:36 +0x33
github.com/openshift/cluster-monitoring-operator/pkg/tasks.(*TaskRunner).ExecuteTask(0xc4208d3eb8, 0xc4203d4ba0, 0xf, 0xc4208d3d60)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/tasks/tasks.go:48 +0x34
github.com/openshift/cluster-monitoring-operator/pkg/tasks.(*TaskRunner).RunAll(0xc4208d3eb8, 0xc4200f31c0, 0x351c07d935ca6caa)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/tasks/tasks.go:38 +0x141
github.com/openshift/cluster-monitoring-operator/pkg/operator.(*Operator).sync(0xc420348500, 0xc4203e21e0, 0x2e, 0x11a2f40, 0xc42042a060)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/operator/operator.go:251 +0x828
github.com/openshift/cluster-monitoring-operator/pkg/operator.(*Operator).processNextWorkItem(0xc420348500, 0xc42003bf00)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/operator/operator.go:201 +0xfb
github.com/openshift/cluster-monitoring-operator/pkg/operator.(*Operator).worker(0xc420348500)
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/operator/operator.go:171 +0x15a
created by github.com/openshift/cluster-monitoring-operator/pkg/operator.(*Operator).Run
	/go/src/github.com/openshift/cluster-monitoring-operator/pkg/operator/operator.go:130 +0x1e2

Expected results:
The cluster-monitoring-operator pod does not crash.

Additional info:
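For readers unfamiliar with this class of Go failure: the trace shows only that a nil pointer was dereferenced inside TelemeterClientServiceMonitor at manifests.go:1441, and the small fault address (addr=0x38) is consistent with reading a field through a nil pointer. Which value was nil is not stated in this report. The sketch below is purely illustrative. The types Factory and TelemeterConfig, the ClusterID field, and the helper DidPanic are hypothetical stand-ins, not the operator's real code. It contrasts an unchecked field access, which panics, with a guarded one that returns an error.

```go
package main

import (
	"errors"
	"fmt"
)

// TelemeterConfig is a hypothetical stand-in for an optional config section.
type TelemeterConfig struct {
	ClusterID string
}

// Factory is a hypothetical stand-in, not the operator's real manifests.Factory.
type Factory struct {
	Telemeter *TelemeterConfig // nil when the section is absent
}

// UnsafeClusterID reads through f.Telemeter without a check.
// It panics with a nil pointer dereference when Telemeter is nil.
func (f *Factory) UnsafeClusterID() string {
	return f.Telemeter.ClusterID
}

// SafeClusterID returns an error instead of panicking when the config is missing.
func (f *Factory) SafeClusterID() (string, error) {
	if f == nil || f.Telemeter == nil {
		return "", errors.New("telemeter client config is not set")
	}
	return f.Telemeter.ClusterID, nil
}

// DidPanic runs fn and reports whether it panicked, recovering so that
// the process keeps running.
func DidPanic(fn func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	fn()
	return false
}

func main() {
	f := &Factory{} // Telemeter is left nil on purpose

	fmt.Println("unchecked access panicked:", DidPanic(func() { _ = f.UnsafeClusterID() }))

	_, err := f.SafeClusterID()
	fmt.Println("guarded access error:", err)
}
```

An unrecovered panic of this kind terminates the process, after which the kubelet restarts the container with back-off. That matches the CrashLoopBackOff status shown above.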
I have been speaking out of band (OOB) with N. Harrison Ripps. This issue is fundamentally the same as https://bugzilla.redhat.com/show_bug.cgi?id=1634227. The root cause of these issues is that the OCP 3.11 images are incorrectly being built from the master branch of the Cluster Monitoring Operator rather than from the release-3.11 branch. The commit that caused this crash should never have ended up in 3.11, because the master branch switched to 4.0 development some time ago.
*** This bug has been marked as a duplicate of bug 1634227 ***