1909874 – cluster monitoring operator pods failing to start

Summary: cluster monitoring operator pods failing to start

Keywords:
Status:	CLOSED DUPLICATE of bug 1906836
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Monitoring
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	---
Assignee:	Sergiusz Urbaniak
QA Contact:	Junqi Zhao
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2020-12-21 21:32 UTC by Ben Parees
Modified:	2020-12-22 08:53 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:	[sig-arch] Managed cluster should have no crashlooping pods in core namespaces over four minutes
Last Closed:	2020-12-22 08:53:49 UTC
Target Upstream Version:
Embargoed:

Description Ben Parees 2020-12-21 21:32:02 UTC

https://search.ci.openshift.org/?search=%5C%5Bsig-arch%5C%5D+Managed+cluster+should+have+no+crashlooping+pods+in+core+namespaces+over+four+minutes&maxAge=168h&context=1&type=junit&name=%5Erelease.*4.6&maxMatches=5&maxBytes=20971520&groupBy=job

This CI search shows a number of failures caused by monitoring pods, such as:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-ovn-4.6/1339670137233477632

fail [github.com/openshift/origin/test/extended/operators/cluster.go:151]: Expected
    <[]string | len:1, cap:1>: [
        "Pod openshift-monitoring/cluster-monitoring-operator-55554cdb4b-z46dn was pending entire time: unknown error",
    ]
to be empty


The failure appears to be related to the kube-rbac-proxy container, whose status is:
{
  "name": "kube-rbac-proxy",
  "state": {
    "waiting": {
      "reason": "CreateContainerConfigError",
      "message": "container has runAsNonRoot and image has non-numeric user (nobody), cannot verify user is non-root"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0,
  "image": "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:b5d3f179d92e0fca445f69c41bb9763b2f2d9d37356621953d9ab8d71e22b2c5",
  "imageID": "",
  "started": false
}
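
For context, the waiting message above comes from the kubelet's runAsNonRoot verification: when no numeric runAsUser is set and the image declares its user by name, the kubelet cannot resolve that name to a UID and so refuses to create the container. The sketch below is a simplified Python model of that decision, not the kubelet's actual code. The function name `check_run_as_non_root` and its parameters are hypothetical, the treatment of an image with no user as root is an assumption of this model, and only the "non-numeric user" message is taken from the status shown here.

```python
from typing import Optional


def check_run_as_non_root(
    run_as_non_root: bool,
    run_as_user: Optional[int],
    image_user: str,
) -> Optional[str]:
    """Return an error message if non-root cannot be verified, else None.

    Hypothetical, simplified model of the runAsNonRoot check.
    image_user is the user declared by the image, e.g. "nobody",
    "65534" or "1001:0" (user:group).
    """
    # Nothing to verify when the policy is not requested.
    if not run_as_non_root:
        return None

    # An explicit numeric runAsUser overrides whatever the image declares.
    if run_as_user is not None:
        if run_as_user == 0:
            return "runAsUser is 0, which breaks the non-root policy"
        return None

    # Otherwise fall back to the image's user, ignoring any ":group" suffix.
    user = image_user.split(":", 1)[0]

    # Assumption in this model: an image without a user runs as root.
    if user == "":
        return "image will run as root"

    if user.isascii() and user.isdigit():
        if int(user) == 0:
            return "image will run as root"
        return None

    # A user name cannot be mapped to a UID without inspecting the image's
    # /etc/passwd, so the check fails with the message seen in this bug.
    return (
        f"container has runAsNonRoot and image has non-numeric user "
        f"({user}), cannot verify user is non-root"
    )


if __name__ == "__main__":
    # The situation in this report: runAsNonRoot set, image user "nobody".
    print(check_run_as_non_root(True, None, "nobody"))
```

Under this model the check passes once the user is numeric, whether through a numeric user in the image or a non-zero runAsUser in the container's security context. Which of these applies to the kube-rbac-proxy image is not stated in this report.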


Though I also see this in the prometheus container log:
Dec 17 21:26:52.580: INFO: prometheus-k8s-1[openshift-monitoring].container[prometheus]=level=error ts=2020-12-17T21:07:51.480Z caller=main.go:290 msg="Error loading config (--config.file=/etc/prometheus/config_out/prometheus.env.yaml)" err="open /etc/prometheus/config_out/prometheus.env.yaml: no such file or directory"

Comment 2 Damien Grisonnet 2020-12-22 08:53:49 UTC

I think it's safe to close this bug as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1906836.

*** This bug has been marked as a duplicate of bug 1906836 ***
