Created attachment 1809474
get events pods and pod-yaml

Description of problem:
The OCS operator fails during install with status "Failed: install failed: deployment ocs-metrics-exporter not ready before timeout: deployment "ocs-metrics-exporter" exceeded its progress deadline".

In the events for the "ocs-metrics-exporter" pod I see the error "Error: container has runAsNonRoot and image will run as root (pod: "ocs-metrics-exporter-545d7dc948-ksszj_openshift-storage(a75039ba-56c4-4efb-8cda-57de7029a3cb)", container: ocs-metrics-exporter)".

The pod remains in CreateContainerConfigError.

Version-Release number of selected component (if applicable):
OCP version 4.8.2
OCS version 4.8.0-175.ci
This is done on a zKVM cluster.

How reproducible:
Of 4 OCS installs with this version, I saw this twice.

Steps to Reproduce:
1. Install OCP
2. Set up a CatalogSource to install OCS version 4.8.0-175.ci
3. Install OCS from OperatorHub with all variables left as defaults
4. Wait until the operator reports "failed due to timeout"

Actual results:
The operator installation is reported as failed due to timeout because the ocs-metrics-exporter pod is failing.

Expected results:
The operator is reported as successfully installed.

Additional info:
Interestingly, the operator appears to work fine, since the metrics exporter is the only thing that failed. I could continue on to creating a storagecluster and binding a PVC with the ceph-rbd storageclass.

The output of oc get events, oc get pods, and oc get pod ocs-metrics-exporter -o yaml is in the attachment. must-gather logs to follow.
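For anyone trying to confirm the same symptom, here is a minimal Python sketch that lists containers stuck in a given waiting reason. It assumes the JSON layout returned by `oc get pods -n openshift-storage -o json` (a pod list with `status.containerStatuses[].state.waiting`). The function names `pods_waiting_with_reason` and `fetch_pods` are hypothetical helpers introduced for illustration, not part of any OCS tooling.

```python
import json
import subprocess


def pods_waiting_with_reason(pod_list, reason="CreateContainerConfigError"):
    """Return (pod, container, message) tuples for containers waiting with `reason`.

    `pod_list` is the parsed JSON of a Kubernetes pod list.
    """
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "")
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            # A container that is running or terminated has no "waiting" entry.
            waiting = (status.get("state") or {}).get("waiting") or {}
            if waiting.get("reason") == reason:
                hits.append(
                    (pod_name, status.get("name", ""), waiting.get("message", ""))
                )
    return hits


def fetch_pods(namespace="openshift-storage"):
    """Hypothetical helper: fetch the pod list via the oc CLI (needs a logged-in session)."""
    result = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a live cluster, `pods_waiting_with_reason(fetch_pods())` would be expected to include the ocs-metrics-exporter pod while the problem is present.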
This looks like a problem in the OCS deployment, not in OCP. OCP only reports the discrepancy in the pod settings: the ocs-metrics-exporter container should either be allowed to run as root, or the image should use another user.
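To illustrate the discrepancy, here is a simplified Python sketch of the kind of check that produces this error. It is modelled on my understanding of how the kubelet evaluates `runAsNonRoot`, not copied from its source. The function name `check_run_as_non_root` and its parameters are hypothetical, and it assumes that an image with no USER set resolves to UID 0.

```python
from typing import Optional


def check_run_as_non_root(
    run_as_non_root: Optional[bool],
    run_as_user: Optional[int],
    image_uid: Optional[int],
    image_username: str = "",
) -> Optional[str]:
    """Return an error string if the container would violate runAsNonRoot, else None.

    run_as_non_root / run_as_user: effective securityContext values (None if unset).
    image_uid / image_username: user declared by the image. Assumption: callers
    pass image_uid=0 for an image that declares no USER at all.
    """
    # Nothing to enforce if the policy is unset or explicitly false.
    if not run_as_non_root:
        return None

    # An explicit runAsUser in the pod spec overrides the image's user.
    if run_as_user is not None:
        if run_as_user == 0:
            return "container's runAsUser breaks non-root policy"
        return None

    # Otherwise fall back to whatever user the image declares.
    if image_uid == 0:
        return "container has runAsNonRoot and image will run as root"
    if image_uid is None and image_username:
        return (
            "container has runAsNonRoot and image has non-numeric user "
            f"({image_username}), cannot verify user is non-root"
        )
    return None
```

Under these assumptions, the failing case in this bug corresponds to `check_run_as_non_root(True, None, 0)`. Either a non-zero numeric USER in the image or an explicit non-zero `runAsUser` would make the check pass.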
I have updated the Dockerfile to make the container run as a nonroot user. It should be fixed in the next build.
Moving to MODIFIED until we get a build.
I performed the following verification steps:
1) Deployed an OCP cluster (4.9.0-0.nightly-2021-10-22-102153)
2) Updated the catalog source
3) Installed odf-operator.v4.9.0 from OperatorHub with default variables

Result:
No errors were reported by the operator. Nothing like "Error: container has runAsNonRoot and image will run as root (pod: "ocs-metrics-exporter-545d7dc948-ksszj_openshift-storage(a75039ba-56c4-4efb-8cda-57de7029a3cb)", container: ocs-metrics-exporter)" was seen in the events.

=> Closing this bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:5086