Description of problem:
The openshift-monitoring/prometheus-operator deployment (managed by the cluster-monitoring-operator) fails to roll out because of an SCC issue:

  message: container has runAsNonRoot and image will run as root

As a result, the monitoring stack as a whole fails to deploy.

Version-Release number of selected component (if applicable):

How reproducible:
Launch a cluster with monitoring enabled via inventory:

  openshift_monitoring_deploy: true

Actual results:
cluster-monitoring-operator deploys successfully, but prometheus-operator fails to scale up.

Expected results:
The full monitoring stack bootstraps in the openshift-monitoring namespace.

Additional info:
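For anyone hitting this, a minimal sketch of how the failure can be observed on an affected cluster (the deployment name matches the report; the exact event text may vary by version):

```shell
# Inspect the prometheus-operator deployment managed by cluster-monitoring-operator
oc -n openshift-monitoring describe deploy prometheus-operator

# Look for the SCC rejection event on the replica set / pods:
#   "container has runAsNonRoot and image will run as root"
oc -n openshift-monitoring get events | grep -i runAsNonRoot
```

These are read-only diagnostic commands and require a running cluster, so they are illustrative rather than part of the fix.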
Already fixed in https://github.com/openshift/cluster-monitoring-operator/pull/20, still working on getting a new image released; when I have a new release, I'll link to an openshift-ansible PR to represent the fix.
https://github.com/openshift/openshift-ansible/pull/8514
Fixing this problem revealed a related SCC issue, which needs another patch. Pulling this back to "ASSIGNED".
https://github.com/openshift/openshift-ansible/pull/8531
@Dan
Which playbook should I use? I set openshift_monitoring_deploy: true in the inventory and ran playbooks/openshift-prometheus/config.yml, but there is no prometheus-operator deployment in any namespace.
(In reply to Junqi Zhao from comment #7)
> @Dan
> Which playbook should I use? I set openshift_monitoring_deploy: true in
> the inventory and ran playbooks/openshift-prometheus/config.yml, but there
> is no prometheus-operator deployment in any namespace.

Junqi,

The new monitoring playbooks are located here:
https://github.com/openshift/openshift-ansible/tree/master/playbooks/openshift-monitoring

The "openshift-prometheus" playbook is being replaced by "openshift-monitoring".
Junqi,

One more thing: the monitoring infrastructure will be installed in the openshift-monitoring namespace.
Tested with openshift-ansible-3.10.0-0.60.0.git.0.bf95bf8.el7.noarch; prometheus-operator can now be scaled up and all pods are running normally.

Steps:
1. Set openshift_monitoring_deploy=true in the inventory file.
2. Run the playbooks/openshift-monitoring/config.yml playbook.
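The verification steps above can be sketched as follows (the inventory file name is illustrative; adjust to your environment):

```shell
# Step 1: the inventory must contain:
#   openshift_monitoring_deploy=true

# Step 2: run the new monitoring playbook (inventory path is an assumption)
ansible-playbook -i hosts playbooks/openshift-monitoring/config.yml

# Confirm the previously failing deployment now rolls out
oc -n openshift-monitoring rollout status deploy/prometheus-operator
oc -n openshift-monitoring get pods
```

Both oc commands require a live cluster and are shown here only to document the check performed.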
# oc get po -n openshift-monitoring
NAME                                           READY     STATUS    RESTARTS   AGE
alertmanager-main-0                            3/3       Running   0          53m
alertmanager-main-1                            3/3       Running   0          53m
cluster-monitoring-operator-7f6c68764b-f5qc4   1/1       Running   0          54m
kube-state-metrics-d6f855965-ztd4s             3/3       Running   0          52m
node-exporter-dx5zn                            2/2       Running   0          52m
node-exporter-g6dw5                            2/2       Running   0          52m
prometheus-k8s-0                               3/3       Running   1          54m
prometheus-k8s-1                               3/3       Running   1          54m
prometheus-operator-7878fffc55-hlls5           1/1       Running   0          7m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816