
Bug 1581760

Summary:	prometheus-operator deployment fails to start
Product:	OpenShift Container Platform	Reporter:	Dan Mace <dmace>
Component:	Hawkular	Assignee:	Dan Mace <dmace>
Status:	CLOSED ERRATA	QA Contact:	Junqi Zhao <juzhao>
Severity:	urgent	Docs Contact:
Priority:	high
Version:	3.10.0	CC:	aos-bugs, dmace
Target Milestone:	---
Target Release:	3.10.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	No Doc Update
Doc Text:	undefined	Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-07-30 19:16:18 UTC	Type:	Bug

Description Dan Mace 2018-05-23 15:03:37 UTC

Description of problem:

The openshift-monitoring/prometheus-operator deployment (managed by the cluster-monitoring-operator) fails to roll out because of an SCC issue:

   message: container has runAsNonRoot and image will run as root

As a result, the monitoring stack as a whole fails to deploy.
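
For readers unfamiliar with this error, the following is a minimal sketch of the kind of check that produces it. It is a simplified model written for illustration, not the kubelet's actual code: the function name, its parameters, and every message other than the one quoted above are assumptions. The idea is that when runAsNonRoot is set and no explicit runAsUser is given, the user declared in the image decides the outcome, and an image with no declared user is treated as root.

```python
from typing import Optional


def check_run_as_non_root(
    run_as_non_root: bool,
    run_as_user: Optional[int],
    image_user: Optional[str],
) -> Optional[str]:
    """Return an error message if the container may run as root, else None.

    Hypothetical helper: models the decision only, under the assumptions
    stated in the text above.
    """
    if not run_as_non_root:
        # The policy is not requested, so nothing is checked.
        return None

    if run_as_user is not None:
        # An explicit UID in the security context takes precedence.
        if run_as_user == 0:
            return "explicit runAsUser is 0 (illustrative message)"
        return None

    # No explicit UID: fall back to the image's USER. Empty means root.
    user = (image_user or "").split(":")[0]  # drop an optional ":group" part
    if user == "" or user == "0":
        return "container has runAsNonRoot and image will run as root"
    if not user.isdigit():
        # A user name cannot be proven non-root without resolving it.
        return f"image user {user!r} is not numeric (illustrative message)"
    return None


# An image with no USER and no explicit runAsUser reproduces the error.
print(check_run_as_non_root(True, None, None))
```

Under this model, either an explicit non-zero runAsUser or a numeric non-zero USER in the image passes the check. The report does not say which approach the linked fix took.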

Version-Release number of selected component (if applicable):


How reproducible:

Launch a cluster with monitoring enabled via inventory:

   openshift_monitoring_deploy: true


Actual results:

cluster-monitoring-operator deploys successfully, but prometheus-operator fails to scale up.

Expected results:

The full monitoring stack bootstraps in the openshift-monitoring namespace.

Additional info:

Comment 1 Dan Mace 2018-05-23 15:05:29 UTC

Already fixed in https://github.com/openshift/cluster-monitoring-operator/pull/20. I'm still working on getting a new image released; once I have a new release, I'll link to an openshift-ansible PR to represent the fix.

Comment 2 Dan Mace 2018-05-24 11:55:28 UTC

https://github.com/openshift/openshift-ansible/pull/8514

Comment 3 Dan Mace 2018-05-25 14:18:55 UTC

Fixing this problem revealed a related SCC issue, which needs another patch. Pulling this back to "ASSIGNED".

Comment 4 Dan Mace 2018-05-25 15:15:49 UTC

https://github.com/openshift/openshift-ansible/pull/8531

Comment 7 Junqi Zhao 2018-06-05 09:20:05 UTC

@Dan

Which playbook should I use? I set openshift_monitoring_deploy: true in the inventory and ran playbooks/openshift-prometheus/config.yml, but there is no prometheus-operator deployment in any namespace.

Comment 8 Dan Mace 2018-06-05 12:56:46 UTC

(In reply to Junqi Zhao from comment #7)
> @Dan
> 
> Which playbook should I use? I set openshift_monitoring_deploy: true in
> the inventory and ran playbooks/openshift-prometheus/config.yml, but there
> is no prometheus-operator deployment in any namespace.

Junqi,

Here is where the new monitoring playbooks are located:

https://github.com/openshift/openshift-ansible/tree/master/playbooks/openshift-monitoring

The "openshift-prometheus" playbook is being replaced by "openshift-monitoring".

Comment 9 Dan Mace 2018-06-05 13:00:32 UTC

Junqi,

One more thing: the monitoring infrastructure will be installed in the openshift-monitoring namespace.

Comment 10 Junqi Zhao 2018-06-06 02:51:04 UTC

Tested with openshift-ansible-3.10.0-0.60.0.git.0.bf95bf8.el7.noarch: prometheus-operator now scales up, and all pods are running normally.

Steps:
1. Set openshift_monitoring_deploy=true in the inventory file.
2. Run the playbooks/openshift-monitoring/config.yml playbook.

Comment 11 Junqi Zhao 2018-06-06 02:51:55 UTC

# oc get po -n openshift-monitoring
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          53m
alertmanager-main-1                            3/3       Running   0          53m
cluster-monitoring-operator-7f6c68764b-f5qc4   1/1       Running   0          54m
kube-state-metrics-d6f855965-ztd4s             3/3       Running   0          52m
node-exporter-dx5zn                            2/2       Running   0          52m
node-exporter-g6dw5                            2/2       Running   0          52m
prometheus-k8s-0                               3/3       Running   1          54m
prometheus-k8s-1                               3/3       Running   1          54m
prometheus-operator-7878fffc55-hlls5           1/1       Running   0          7m
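
A listing like the one above can also be checked mechanically. The following is a rough sketch, assuming the whitespace-separated column layout shown (NAME, READY, STATUS, ...); the helper name unready_pods and the SAMPLE constant are hypothetical, and the sample rows are copied from the output above.

```python
from typing import List

# A few rows copied from the listing above, including the header line.
SAMPLE = """\
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          53m
cluster-monitoring-operator-7f6c68764b-f5qc4   1/1       Running   0          54m
prometheus-k8s-0                               3/3       Running   1          54m
prometheus-operator-7878fffc55-hlls5           1/1       Running   0          7m
"""


def unready_pods(oc_output: str) -> List[str]:
    """Return names of pods that are not Running with all containers ready."""
    bad = []
    # Skip the header line, then inspect one pod per row.
    for line in oc_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, total_count = ready.split("/")
        if status != "Running" or ready_count != total_count:
            bad.append(name)
    return bad


# An empty list means every listed pod is Running and fully ready.
print(unready_pods(SAMPLE))
```

In practice the text would come from running oc get po -n openshift-monitoring; here it is embedded so the sketch stays self-contained.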

Comment 13 errata-xmlrpc 2018-07-30 19:16:18 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816