Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1581760

Summary: prometheus-operator deployment fails to start
Product: OpenShift Container Platform
Reporter: Dan Mace <dmace>
Component: Hawkular
Assignee: Dan Mace <dmace>
Status: CLOSED ERRATA
QA Contact: Junqi Zhao <juzhao>
Severity: urgent
Docs Contact:
Priority: high
Version: 3.10.0
CC: aos-bugs, dmace
Target Milestone: ---
Target Release: 3.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-07-30 19:16:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dan Mace 2018-05-23 15:03:37 UTC
Description of problem:

The openshift-monitoring/prometheus-operator deployment (managed by the cluster-monitoring-operator) fails to roll out because of an SCC issue:

   message: container has runAsNonRoot and image will run as root

As a result, the monitoring stack as a whole fails to deploy.
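
For context, the message above is produced by a runAsNonRoot check made when the container is started. The Python sketch below is a simplified model of how such a check can behave, not the actual Kubernetes or OpenShift code. The function name verify_run_as_non_root and its parameters are hypothetical, and only the final error string is taken from this report; the other two messages are illustrative wording.

```python
from typing import Optional


def verify_run_as_non_root(
    run_as_non_root: Optional[bool],
    run_as_user: Optional[int],
    image_uid: Optional[int],
    image_username: str = "",
) -> Optional[str]:
    """Return an error message if the container may run as root, else None.

    Simplified, hypothetical model of a runAsNonRoot check; not the real code.
    """
    # No restriction requested: nothing to verify.
    if not run_as_non_root:
        return None

    # An explicit runAsUser in the security context overrides the image user.
    if run_as_user is not None:
        if run_as_user == 0:
            return "runAsUser 0 conflicts with runAsNonRoot"  # illustrative wording
        return None

    # A named (non-numeric) image user cannot be proven to be non-root.
    if image_uid is None and image_username:
        return "image user is non-numeric, cannot verify non-root"  # illustrative wording

    # Assumption: an image that declares no user at all runs as UID 0.
    if image_uid is None or image_uid == 0:
        return "container has runAsNonRoot and image will run as root"

    return None


# Image with no declared user and no runAsUser: yields the reported message.
print(verify_run_as_non_root(True, None, None))
# A numeric non-root runAsUser passes the check.
print(verify_run_as_non_root(True, 1000, None))
```

Under this model, either an image built with a numeric non-root user or an explicit non-zero runAsUser would satisfy the check. How the actual fix addressed the problem is not stated in this report.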

Version-Release number of selected component (if applicable):


How reproducible:

Launch a cluster with monitoring enabled via inventory:

   openshift_monitoring_deploy: true


Actual results:

cluster-monitoring-operator deploys successfully, but prometheus-operator fails to scale up.

Expected results:

The full monitoring stack bootstraps in the openshift-monitoring namespace.

Additional info:

Comment 1 Dan Mace 2018-05-23 15:05:29 UTC
Already fixed in https://github.com/openshift/cluster-monitoring-operator/pull/20. I'm still working on getting a new image released; once I have a new release, I'll link to an openshift-ansible PR to represent the fix.

Comment 3 Dan Mace 2018-05-25 14:18:55 UTC
Fixing this problem revealed a related SCC issue, which needs another patch. Pulling this back to "ASSIGNED".

Comment 7 Junqi Zhao 2018-06-05 09:20:05 UTC
@Dan

Which playbook should I use? I set openshift_monitoring_deploy: true in the inventory and ran playbooks/openshift-prometheus/config.yml, but there is no prometheus-operator deployment in any namespace.

Comment 8 Dan Mace 2018-06-05 12:56:46 UTC
(In reply to Junqi Zhao from comment #7)
> @Dan
> 
> Which playbook should I use? I set openshift_monitoring_deploy: true in the
> inventory and ran playbooks/openshift-prometheus/config.yml, but there is no
> prometheus-operator deployment in any namespace.

Junqi,

Here is where the new monitoring playbooks are located:

https://github.com/openshift/openshift-ansible/tree/master/playbooks/openshift-monitoring

The "openshift-prometheus" playbook is being replaced by "openshift-monitoring".

Comment 9 Dan Mace 2018-06-05 13:00:32 UTC
Junqi,

One more thing: the monitoring infrastructure will be installed in the openshift-monitoring namespace.

Comment 10 Junqi Zhao 2018-06-06 02:51:04 UTC
Tested with openshift-ansible-3.10.0-0.60.0.git.0.bf95bf8.el7.noarch. prometheus-operator now scales up, and all pods are running normally.

Steps:
1. Set openshift_monitoring_deploy=true in the inventory file.
2. Run the playbooks/openshift-monitoring/config.yml playbook.

Comment 11 Junqi Zhao 2018-06-06 02:51:55 UTC
# oc get po -n openshift-monitoring
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          53m
alertmanager-main-1                            3/3       Running   0          53m
cluster-monitoring-operator-7f6c68764b-f5qc4   1/1       Running   0          54m
kube-state-metrics-d6f855965-ztd4s             3/3       Running   0          52m
node-exporter-dx5zn                            2/2       Running   0          52m
node-exporter-g6dw5                            2/2       Running   0          52m
prometheus-k8s-0                               3/3       Running   1          54m
prometheus-k8s-1                               3/3       Running   1          54m
prometheus-operator-7878fffc55-hlls5           1/1       Running   0          7m
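
A listing like the one above can also be checked mechanically. The Python sketch below is one possible way to do that, assuming the default column layout of `oc get po` shown here; the helper name unready_pods is hypothetical, and the sample rows are copied from this comment.

```python
from typing import List

# A few rows copied from the listing in this comment.
SAMPLE = """\
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          53m
prometheus-k8s-0                               3/3       Running   1          54m
prometheus-operator-7878fffc55-hlls5           1/1       Running   0          7m
"""


def unready_pods(table: str) -> List[str]:
    """Return the names of pods that are not Running or not fully ready."""
    bad = []
    lines = table.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        name, ready, status = fields[0], fields[1], fields[2]
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            bad.append(name)
    return bad


print(unready_pods(SAMPLE))  # prints []
```

An empty list means every listed pod is Running with all of its containers ready, which matches the verification result reported here.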

Comment 13 errata-xmlrpc 2018-07-30 19:16:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816