Bug 1874754 - [downstream helm-operator] The pods of the CR is crashing(Permission denied)
Summary: [downstream helm-operator] The pods of the CR is crashing(Permission denied)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Operator SDK
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.6.0
Assignee: Alex Dellapenta
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-02 07:32 UTC by Jian Zhang
Modified: 2020-12-14 17:09 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Cause: Building and running a Helm-based operator that uses the default boilerplate nginx chart (by running operator-sdk new without the --helm-chart flag).
Consequence: The example chart fails to deploy cleanly on OpenShift (it works fine on upstream Kubernetes).
Workaround (if any): Use the --helm-chart flag to provide a Helm chart that deploys cleanly on OpenShift.
Result:
Clone Of:
Environment:
Last Closed: 2020-12-14 17:09:22 UTC
Target Upstream Version:
Embargoed:



Description Jian Zhang 2020-09-02 07:32:12 UTC
Description of problem:
The error below appears when running the CR with the `helm-operator`:

[root@preserve-olm-env data]# oc logs example-nginx-7df4447d79-jx8dd
2020/09/02 07:05:12 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2020/09/02 07:05:12 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)

[root@preserve-olm-env data]# oc get pods
NAME                                    READY   STATUS             RESTARTS   AGE
...
example-nginx-7df4447d79-jx8dd          0/1     CrashLoopBackOff   3          70s


Version-Release number of selected component (if applicable):
4.6
Downstream helm-operator

How reproducible:
always

Steps to Reproduce:
1. Initialize a Helm-type nginx-operator project with `operator-sdk`.
[root@preserve-olm-env nginx-operator]# pwd
/data/goproject/src/github.com/example-inc/nginx-operator
[root@preserve-olm-env nginx-operator]# ls
build  deploy  helm-charts   watches.yaml

2. Compile the downstream helm-operator binary.
[root@preserve-olm-env openshift]# git clone git:openshift/ocp-release-operator-sdk.git
...
[root@preserve-olm-env ocp-release-operator-sdk]# make build/helm-operator
[root@preserve-olm-env ocp-release-operator-sdk]# ls -l ./build/helm-operator 
-rwxr-xr-x. 1 root root 55631400 Sep  2 02:38 ./build/helm-operator

3. Log in to an OCP 4.6 cluster.
[root@preserve-olm-env nginx-operator]# oc project
Using project "openshift-marketplace" on server "https://api.jiazha0901.qe.devcluster.openshift.com:6443".

4. Deploy the CRD.
[root@preserve-olm-env nginx-operator]# oc create -f deploy/crds/example.com_nginxes_crd.yaml
...
[root@preserve-olm-env nginx-operator]# oc get nginx -A
No resources found

5. Set the "OPERATOR_NAME" value and run this nginx-operator with the helm-operator binary.
[root@preserve-olm-env nginx-operator]# export OPERATOR_NAME=nginx-operator
[root@preserve-olm-env nginx-operator]# ./helm-operator run
{"level":"info","ts":1599029881.6350155,"logger":"cmd","msg":"Go Version: go1.14"}
{"level":"info","ts":1599029881.6350768,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1599029881.635086,"logger":"cmd","msg":"Version of operator-sdk: v0.19.0"}
...

6. Deploy an nginx CR.
[root@preserve-olm-env nginx-operator]# oc create -f deploy/crds/example.com_v1alpha1_nginx_cr.yaml
...

Actual results:
The CR is created successfully, but its pod is crashing.

[root@preserve-olm-env data]# oc get pods
NAME                                    READY   STATUS             RESTARTS   AGE
...
example-nginx-7df4447d79-jx8dd          0/1     CrashLoopBackOff   3          70s
...

[root@preserve-olm-env data]# oc logs example-nginx-7df4447d79-jx8dd
2020/09/02 07:05:12 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2020/09/02 07:05:12 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)


Expected results:
The CR pod should run without crashing.


Additional info:

[root@preserve-olm-env nginx-operator]# ./helm-operator run
{"level":"info","ts":1599029881.6350155,"logger":"cmd","msg":"Go Version: go1.14"}
{"level":"info","ts":1599029881.6350768,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1599029881.635086,"logger":"cmd","msg":"Version of operator-sdk: v0.19.0"}
{"level":"info","ts":1599029881.6379602,"logger":"cmd","msg":"WATCH_NAMESPACE environment variable not set. Watching all namespaces.","Namespace":""}
I0902 02:58:02.765044   20527 request.go:621] Throttling request took 1.027322925s, request: GET:https://api.jiazha0901.qe.devcluster.openshift.com:6443/apis/machine.openshift.io/v1beta1?timeout=32s
{"level":"info","ts":1599029884.0876718,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1599029884.0884671,"logger":"helm.controller","msg":"Watching resource","apiVersion":"example.com/v1alpha1","kind":"Nginx","namespace":"","reconcilePeriod":"1m0s"}
{"level":"info","ts":1599029884.0885055,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1599029884.0885339,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1599029884.0885646,"logger":"cmd","msg":"Could not generate and serve custom resource metrics","Namespace":"","error":"WATCH_NAMESPACE must be set"}
{"level":"info","ts":1599029886.5114994,"logger":"metrics","msg":"Skipping metrics Service creation; not running in a cluster."}
{"level":"info","ts":1599029888.932723,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1599029888.9327874,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nginx-controller","source":"kind source: example.com/v1alpha1, Kind=Nginx"}
{"level":"info","ts":1599029889.0333405,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"nginx-controller"}
{"level":"info","ts":1599029889.0334074,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"nginx-controller","worker count":1}



I0902 02:58:35.292883   20527 request.go:621] Throttling request took 1.038226606s, request: GET:https://api.jiazha0901.qe.devcluster.openshift.com:6443/apis/operators.coreos.com/v1alpha1?timeout=32s
{"level":"info","ts":1599029917.2595756,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nginx-controller","source":"kind source: /v1, Kind=ServiceAccount"}
{"level":"info","ts":1599029917.3603249,"logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"example.com/v1alpha1","ownerKind":"Nginx","apiVersion":"v1","kind":"ServiceAccount"}
{"level":"info","ts":1599029917.3608084,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nginx-controller","source":"kind source: /v1, Kind=Service"}
{"level":"info","ts":1599029917.4613502,"logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"example.com/v1alpha1","ownerKind":"Nginx","apiVersion":"v1","kind":"Service"}
{"level":"info","ts":1599029917.4620032,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"nginx-controller","source":"kind source: apps/v1, Kind=Deployment"}
{"level":"info","ts":1599029917.5623562,"logger":"helm.controller","msg":"Watching dependent resource","ownerApiVersion":"example.com/v1alpha1","ownerKind":"Nginx","apiVersion":"apps/v1","kind":"Deployment"}
{"level":"info","ts":1599029917.562435,"logger":"helm.controller","msg":"Installed release","namespace":"openshift-marketplace","name":"example-nginx","apiVersion":"example.com/v1alpha1","kind":"Nginx","release":"example-nginx"}
+---
+# Source: nginx/templates/serviceaccount.yaml
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: example-nginx
+  labels:
+    helm.sh/chart: nginx-0.1.0
+    app.kubernetes.io/name: nginx
+    app.kubernetes.io/instance: example-nginx
+    app.kubernetes.io/version: "1.16.0"
+    app.kubernetes.io/managed-by: Helm
+---
+# Source: nginx/templates/service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: example-nginx
+  labels:
+    helm.sh/chart: nginx-0.1.0
+    app.kubernetes.io/name: nginx
+    app.kubernetes.io/instance: example-nginx
+    app.kubernetes.io/version: "1.16.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  type: ClusterIP
+  ports:
+    - port: 80
+      targetPort: http
+      protocol: TCP
+      name: http
+  selector:
+    app.kubernetes.io/name: nginx
+    app.kubernetes.io/instance: example-nginx
+---
+# Source: nginx/templates/deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: example-nginx
+  labels:
+    helm.sh/chart: nginx-0.1.0
+    app.kubernetes.io/name: nginx
+    app.kubernetes.io/instance: example-nginx
+    app.kubernetes.io/version: "1.16.0"
+    app.kubernetes.io/managed-by: Helm
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: nginx
+      app.kubernetes.io/instance: example-nginx
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: nginx
+        app.kubernetes.io/instance: example-nginx
+    spec:
+      serviceAccountName: example-nginx
+      securityContext:
+        {}
+      containers:
+        - name: nginx
+          securityContext:
+            {}
+          image: "nginx:1.16.0"
+          imagePullPolicy: IfNotPresent
+          ports:
+            - name: http
+              containerPort: 80
+              protocol: TCP
+          livenessProbe:
+            httpGet:
+              path: /
+              port: http
+          readinessProbe:
+            httpGet:
+              path: /
+              port: http
+          resources:
+            {}

{"level":"info","ts":1599029920.6204355,"logger":"helm.controller","msg":"Reconciled release","namespace":"openshift-marketplace","name":"example-nginx","apiVersion":"example.com/v1alpha1","kind":"Nginx","release":"example-nginx"}

Comment 1 Joe Lanford 2020-09-03 16:33:35 UTC
This appears to be an issue with the operand image, not the operator image.

The default chart that is scaffolded if --helm-chart is not provided comes from the upstream helm project's boilerplate chart and is meant to be modified by the helm chart / helm-operator developer.

It seems the `nginx` image is not natively compatible with OpenShift?

Changing severity to medium since this is not a bug with the helm-operator base image, and potentially not a bug at all, depending on whether it's considered a bug that an example boilerplate chart scaffolded by the SDK is not compatible with OpenShift.
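
For context: OpenShift's default restricted SCC runs containers under an arbitrary non-root UID, which is consistent with both the "user" directive warning and the mkdir() failure under /var/cache/nginx in the reported logs. A minimal triage sketch follows. The helper names `pod_scc` and `has_nginx_permission_error` and the `OC` override variable are hypothetical and not part of any tool; only `oc get pod`, `oc logs`, and the `openshift.io/scc` pod annotation are assumed to exist.

```shell
# Sketch only: helper names are hypothetical. Set OC to use a different client binary.

# Print the SCC that admitted a pod (OpenShift records it in the
# openshift.io/scc annotation on the pod).
pod_scc() {
  "${OC:-oc}" get pod "$1" -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
}

# Read container logs on stdin; succeed only if nginx reported that
# mkdir() failed with "13: Permission denied" at startup.
has_nginx_permission_error() {
  grep -Eq '\[emerg\].*mkdir\(\) ".*" failed \(13: Permission denied\)'
}

# Example, using the pod name from the report:
#   pod_scc example-nginx-7df4447d79-jx8dd
#   oc logs example-nginx-7df4447d79-jx8dd | has_nginx_permission_error \
#     && echo "operand cannot write its cache directory"
```

If the pod was admitted under a restricted SCC and the log check succeeds, the failure most likely lies in the operand image rather than in the operator.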

Comment 2 Joe Lanford 2020-09-03 18:28:14 UTC
Since this issue only occurs with the boilerplate example chart provided by Helm (and only on OpenShift) and is not related to the Helm operator base image, I think we should document this as a known issue with OpenShift.

This should not affect any production usage of the helm-operator because the sample chart is extremely simple and provides no real-world functionality.
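
The workaround recorded in this bug's Doc Text field is to pass --helm-chart to `operator-sdk new` so that the boilerplate chart is never scaffolded. A rough sketch of that invocation, assuming the legacy `operator-sdk new` command from the v0.19 SDK shown in the logs; the wrapper name `scaffold_helm_operator`, the `OPERATOR_SDK` override variable, and the chart reference are hypothetical placeholders:

```shell
# Sketch only: scaffold_helm_operator is a hypothetical wrapper.
# The chart reference (local path, <repo>/<name>, or URL) must point at a
# chart whose images can run under OpenShift's default SCC.
scaffold_helm_operator() {
  chart_ref="$1"
  "${OPERATOR_SDK:-operator-sdk}" new nginx-operator \
    --type=helm \
    --api-version=example.com/v1alpha1 \
    --kind=Nginx \
    --helm-chart="$chart_ref"
}

# Example (placeholder path):
#   scaffold_helm_operator ./path/to/openshift-compatible-chart
```

The project name, API version, and kind match the ones used in the reproduction steps; only the chart source changes.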

Comment 3 Alex Dellapenta 2020-09-14 16:37:15 UTC
https://github.com/openshift/openshift-docs/pull/25451

Comment 4 Jian Zhang 2020-09-15 09:30:26 UTC
Added the `--helm-chart` flag when running the helm-operator. LGTM, marking it as verified.

[root@preserve-olm-env nginx-operator]# ./helm-operator run -- --helm-chart
{"level":"info","ts":1600161079.4351165,"logger":"cmd","msg":"Go Version: go1.14"}
{"level":"info","ts":1600161079.4351988,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1600161079.4352088,"logger":"cmd","msg":"Version of operator-sdk: v0.19.0"}
{"level":"info","ts":1600161079.4377782,"logger":"cmd","msg":"WATCH_NAMESPACE environment variable not set. Watching all namespaces.","Namespace":""}
I0915 05:11:20.980035   31902 request.go:621] Throttling request took 1.049494513s, request: GET:https://api.min-shared0914.qe.devcluster.openshift.com:6443/apis/migration.k8s.io/v1alpha1?timeout=32s
{"level":"info","ts":1600161082.3129342,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1600161082.3139274,"logger":"helm.controller","msg":"Watching resource","apiVersion":"example.com/v1alpha1","kind":"Nginx","namespace":"","reconcilePeriod":"1m0s"}
{"level":"info","ts":1600161082.3139641,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1600161082.313998,"logger":"leader","msg":"Skipping leader election; not running in a cluster."}
{"level":"info","ts":1600161082.3140197,"logger":"cmd","msg":"Could not generate and serve custom resource metrics","Namespace":"","error":"WATCH_NAMESPACE must be set"}
...

[root@preserve-olm-env nginx-operator]# oc create -f deploy/crds/example.com_v1alpha1_nginx_cr.yaml
nginx.example.com/example-nginx created
[root@preserve-olm-env nginx-operator]# oc get nginx -A
NAMESPACE   NAME            AGE
default     example-nginx   10s

[root@preserve-olm-env nginx-operator]# oc get pods
NAME                             READY   STATUS      RESTARTS   AGE
example-nginx-7df4447d79-qhvrn   1/1     Running     0          86s
scorecard-test-rvgn              0/1     Completed   0          22h

Comment 5 Alex Dellapenta 2020-09-17 20:31:33 UTC
Staged 4.6 doc for the merged content available here:

https://docs.openshift.com/container-platform/4.6/release_notes/ocp-4-6-release-notes.html#ocp-4-6-known-issues

