Created attachment 1638057 [details]
hco.log

Description of problem:
Pods are jumping from Terminating to Pending to CrashLoopBackOff, mostly ssp, cdi-deployment and cdi-operator:

NAME                                                  READY   STATUS             RESTARTS   AGE
bridge-marker-4kfns                                   1/1     Running            0          25m
bridge-marker-6pgb5                                   1/1     Running            0          25m
bridge-marker-8c4cp                                   1/1     Running            0          25m
bridge-marker-nhqqk                                   1/1     Running            0          25m
cdi-apiserver-779b7c455b-jw8fw                        1/1     Running            0          25m
cdi-deployment-59756855fc-27wf5                       0/1     CrashLoopBackOff   6          24m
cdi-operator-87ccdd97f-v8qv2                          0/1     CrashLoopBackOff   19         111m
cdi-uploadproxy-6456f9b5cb-9szg7                      1/1     Running            0          25m
cluster-network-addons-operator-57985b8b55-kgfzx      1/1     Running            0          111m
hco-operator-5d46d855bc-t9mgx                         0/1     Running            3          111m
kube-cni-linux-bridge-plugin-4mn7f                    1/1     Running            0          25m
kube-cni-linux-bridge-plugin-dt4qh                    1/1     Running            0          25m
kube-cni-linux-bridge-plugin-m6pzg                    1/1     Running            0          25m
kube-cni-linux-bridge-plugin-psfb7                    1/1     Running            0          25m
kubemacpool-mac-controller-manager-5965948866-c9rvd   1/1     Terminating        0          20m
kubemacpool-mac-controller-manager-5965948866-chhqw   1/1     Running            0          115s
kubemacpool-mac-controller-manager-5965948866-wkhv8   1/1     Running            0          115s
kubevirt-ssp-operator-64575cf47f-ffsr9                0/1     CrashLoopBackOff   11         111m
nmstate-handler-5p5xx                                 1/1     Running            0          8m35s
nmstate-handler-dzc66                                 1/1     Running            0          8m58s
nmstate-handler-jkxgq                                 0/1     Terminating        0          25m
nmstate-handler-xr8lc                                 1/1     Running            0          8m11s
node-maintenance-operator-b775cddfb-5h7tj             1/1     Running            0          111m
ovs-cni-amd64-b89vf                                   2/2     Running            0          25m
ovs-cni-amd64-cmh84                                   2/2     Running            0          25m
ovs-cni-amd64-qzmgr                                   2/2     Running            0          25m
ovs-cni-amd64-ztthr                                   2/2     Running            0          25m
virt-api-68f7857466-fctvh                             1/1     Running            0          20m
virt-api-68f7857466-mqq86                             1/1     Running            0          20m
virt-controller-59c4c6c84b-h2gsl                      0/1     CrashLoopBackOff   1          20m
virt-controller-59c4c6c84b-nfjtr                      1/1     Running            6          20m
virt-handler-6cnmh                                    1/1     Running            0          20m
virt-handler-frhcw                                    1/1     Running            0          20m
virt-operator-8b94c69b-rvlr5                          0/1     CrashLoopBackOff   14         111m
virt-operator-8b94c69b-zfrzq                          1/1     Running            13         111m

Version-Release number of selected component (if applicable):
Container ID: cri-o://926e20d95b69b7f0998de57c4a8c3c0c5f80cba3e39795b223a0e4f9f1d02a62
Image:        registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-hyperconverged-cluster-operator:v2.2.0-5
Image ID:     registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-hyperconverged-cluster-operator@sha256:a74f005478e8e5f62b14805acf252fae8539ee358f87b7bfaf6415751a902d17

How reproducible:

Steps to Reproduce:
1. Deploy HCO from rh-verified-operators
2.
3.

Actual results:

Expected results:

Additional info:
Created attachment 1638070 [details] describe cdi-operator
After a few minutes the same env got into this status:

[cnv-qe-jenkins@cnv-executor-a43 ~]$ oc get pods -n openshift-cnv
NAME                                                  READY   STATUS             RESTARTS   AGE
bridge-marker-4kfns                                   1/1     Running            0          52m
bridge-marker-6pgb5                                   1/1     Running            0          52m
bridge-marker-8c4cp                                   1/1     Running            0          52m
bridge-marker-nhqqk                                   1/1     Running            0          52m
cdi-apiserver-779b7c455b-jw8fw                        1/1     Running            0          52m
cdi-deployment-59756855fc-27wf5                       1/1     Running            7          51m
cdi-operator-87ccdd97f-ldnp4                          1/1     Running            0          24m
cdi-operator-87ccdd97f-v8qv2                          0/1     Terminating        19         139m
cdi-uploadproxy-6456f9b5cb-2jspg                      1/1     Running            0          24m
cdi-uploadproxy-6456f9b5cb-9szg7                      1/1     Terminating        0          52m
cluster-network-addons-operator-57985b8b55-kgfzx      1/1     Running            0          139m
hco-operator-5d46d855bc-dq76j                         1/1     Running            0          24m
hco-operator-5d46d855bc-t9mgx                         0/1     Terminating        3          139m
kube-cni-linux-bridge-plugin-4mn7f                    1/1     Running            0          52m
kube-cni-linux-bridge-plugin-dt4qh                    1/1     Running            0          52m
kube-cni-linux-bridge-plugin-m6pzg                    1/1     Running            0          52m
kube-cni-linux-bridge-plugin-psfb7                    1/1     Running            0          52m
kubemacpool-mac-controller-manager-5965948866-c9rvd   1/1     Terminating        0          47m
kubemacpool-mac-controller-manager-5965948866-chhqw   1/1     Running            0          29m
kubemacpool-mac-controller-manager-5965948866-wkhv8   1/1     Running            0          29m
kubevirt-ssp-operator-64575cf47f-ffsr9                0/1     CrashLoopBackOff   16         139m
nmstate-handler-5p5xx                                 1/1     Running            0          35m
nmstate-handler-dzc66                                 1/1     Running            0          36m
nmstate-handler-jkxgq                                 0/1     Terminating        0          52m
nmstate-handler-xr8lc                                 1/1     Running            0          35m
node-maintenance-operator-b775cddfb-5h7tj             1/1     Running            0          139m
ovs-cni-amd64-b89vf                                   2/2     Running            0          52m
ovs-cni-amd64-cmh84                                   2/2     Running            0          52m
ovs-cni-amd64-qzmgr                                   2/2     Running            0          52m
ovs-cni-amd64-ztthr                                   2/2     Running            0          52m
virt-api-68f7857466-6c472                             1/1     Running            0          24m
virt-api-68f7857466-fctvh                             1/1     Running            0          47m
virt-api-68f7857466-mqq86                             1/1     Terminating        0          47m
virt-controller-59c4c6c84b-h2gsl                      0/1     Terminating        1          47m
virt-controller-59c4c6c84b-nfjtr                      1/1     Running            6          47m
virt-controller-59c4c6c84b-xvwdk                      1/1     Running            0          24m
virt-handler-6cnmh                                    1/1     Running            0          47m
virt-handler-frhcw                                    1/1     Running            0          47m
virt-operator-8b94c69b-hrk7l                          1/1     Running            0          24m
virt-operator-8b94c69b-rvlr5                          0/1     Terminating        14         139m
virt-operator-8b94c69b-zfrzq                          1/1     Running            13         139m

SSP operator fails with:

{"level":"error","ts":1574247075.376807,"logger":"cmd","msg":"Exposing metrics port failed.","Namespace":"","error":"failed to create or get service for metrics: services \"kubevirt-ssp-operator-metrics\" is forbidden: User \"system:serviceaccount:openshift-cnv:kubevirt-ssp-operator\" cannot update resource \"services\" in API group \"\" in the namespace \"openshift-cnv\"","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible.Run\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/run.go:153\ngithub.com/operator-framework/operator-sdk/cmd/operator-sdk/run.newRunAnsibleCmd.func1\n\tsrc/github.com/operator-framework/operator-sdk/cmd/operator-sdk/run/ansible.go:38\ngithub.com/spf13/cobra.(*Command).execute\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/spf13/cobra/command.go:826\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/spf13/cobra/command.go:914\ngithub.com/spf13/cobra.(*Command).Execute\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/spf13/cobra/command.go:864\nmain.main\n\tsrc/github.com/operator-framework/operator-sdk/cmd/operator-sdk/main.go:84\nruntime.main\n\t/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/proc.go:200"}

Error: failed to create or get service for metrics: services "kubevirt-ssp-operator-metrics" is forbidden: User "system:serviceaccount:openshift-cnv:kubevirt-ssp-operator" cannot update resource "services" in API group "" in the namespace "openshift-cnv"
It's trying to deploy registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-kubevirt-ssp-operator:v2.2.0-8
I see the permission in our upstream manifest: https://github.com/MarSik/kubevirt-ssp-operator/blob/master/deploy/role.yaml#L66

And also in our manifest template for csv-generator: https://github.com/MarSik/kubevirt-ssp-operator/blob/master/manifests/generated/kubevirt-ssp-operator.vVERSION.clusterserviceversion.yaml#L76

Someone should check the HCO-generated file.
We have this in the HCO-generated CSV:

- apiGroups:
  - ""
  resources:
  - serviceaccounts
  - configmaps
  - services
  verbs:
  - create
  - get
  - patch
  - list
  - watch

and also in the CSV deployed on that specific cluster.
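Note that the rule quoted above grants create/get/patch/list/watch on services but not "update", while the SSP log complains exactly that the service account "cannot update resource services". A minimal sketch of that verb check (the `rule_allows` helper is hypothetical, written here only to make the mismatch explicit):

```python
# RBAC rule mirroring the one quoted from the HCO-generated CSV.
csv_rule = {
    "apiGroups": [""],
    "resources": ["serviceaccounts", "configmaps", "services"],
    "verbs": ["create", "get", "patch", "list", "watch"],
}

def rule_allows(rule, group, resource, verb):
    """Return True if a single RBAC rule grants `verb` on `group`/`resource`.

    Simplified model of Kubernetes RBAC matching: a rule matches when the
    apiGroup, resource, and verb are each listed explicitly or via "*".
    """
    return (
        (group in rule["apiGroups"] or "*" in rule["apiGroups"])
        and (resource in rule["resources"] or "*" in rule["resources"])
        and (verb in rule["verbs"] or "*" in rule["verbs"])
    )

print(rule_allows(csv_rule, "", "services", "get"))     # True
print(rule_allows(csv_rule, "", "services", "update"))  # False -> the forbidden error
```

So even though the upstream role.yaml looks right, the rule that actually reached this cluster does not cover the "update" verb the metrics code needs.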
This seems really close to https://bugzilla.redhat.com/1773905
I am not able to reproduce this bug. I tried it locally on my OKD 4.3 cluster and HCO deployed SSP correctly. Tareq, can you please provide a testing env where this bug occurred?
Should be fixed in this PR https://github.com/kubevirt/hyperconverged-cluster-operator/pull/359
Env was provided; clearing the needinfo.
Using kubevirt-ssp-operator:v2.2.0-10 the issue still seems to be there:

oc logs -n openshift-cnv kubevirt-ssp-operator-85c4cbcc96-lc524
{"level":"info","ts":1574766559.2150786,"logger":"cmd","msg":"Go Version: go1.12.12"}
{"level":"info","ts":1574766559.2151883,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1574766559.2152066,"logger":"cmd","msg":"Version of operator-sdk: v0.12.0+git"}
{"level":"info","ts":1574766559.215244,"logger":"cmd","msg":"Watching namespace.","Namespace":""}
{"level":"info","ts":1574766561.9840798,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1574766561.985181,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","WORKER_KUBEVIRTCOMMONTEMPLATESBUNDLE_KUBEVIRT_IO":1}
{"level":"info","ts":1574766561.985221,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","ANSIBLE_VERBOSITY_KUBEVIRTCOMMONTEMPLATESBUNDLE_KUBEVIRT_IO":2}
{"level":"info","ts":1574766561.9852333,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","WORKER_KUBEVIRTTEMPLATEVALIDATOR_KUBEVIRT_IO":1}
{"level":"info","ts":1574766561.9852378,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","ANSIBLE_VERBOSITY_KUBEVIRTTEMPLATEVALIDATOR_KUBEVIRT_IO":2}
{"level":"info","ts":1574766561.985245,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","WORKER_KUBEVIRTNODELABELLERBUNDLE_KUBEVIRT_IO":1}
{"level":"info","ts":1574766561.9852488,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","ANSIBLE_VERBOSITY_KUBEVIRTNODELABELLERBUNDLE_KUBEVIRT_IO":2}
{"level":"info","ts":1574766561.9852564,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","WORKER_KUBEVIRTMETRICSAGGREGATION_KUBEVIRT_IO":1}
{"level":"info","ts":1574766561.9852605,"logger":"watches","msg":"Failed to parse %v from environment. Using default %v","ANSIBLE_VERBOSITY_KUBEVIRTMETRICSAGGREGATION_KUBEVIRT_IO":2}
{"level":"info","ts":1574766561.9853222,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"kubevirt.io","Options.Version":"v1","Options.Kind":"KubevirtCommonTemplatesBundle"}
{"level":"info","ts":1574766561.9856575,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"kubevirtcommontemplatesbundle-controller","source":"kind source: kubevirt.io/v1, Kind=KubevirtCommonTemplatesBundle"}
{"level":"info","ts":1574766561.9858563,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"kubevirt.io","Options.Version":"v1","Options.Kind":"KubevirtTemplateValidator"}
{"level":"info","ts":1574766561.9859648,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"kubevirttemplatevalidator-controller","source":"kind source: kubevirt.io/v1, Kind=KubevirtTemplateValidator"}
{"level":"info","ts":1574766561.9860878,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"kubevirt.io","Options.Version":"v1","Options.Kind":"KubevirtNodeLabellerBundle"}
{"level":"info","ts":1574766561.9861827,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"kubevirtnodelabellerbundle-controller","source":"kind source: kubevirt.io/v1, Kind=KubevirtNodeLabellerBundle"}
{"level":"info","ts":1574766561.9862955,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"kubevirt.io","Options.Version":"v1","Options.Kind":"KubevirtMetricsAggregation"}
{"level":"info","ts":1574766561.9863958,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"kubevirtmetricsaggregation-controller","source":"kind source: kubevirt.io/v1, Kind=KubevirtMetricsAggregation"}
{"level":"info","ts":1574766561.9866958,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1574766564.7678726,"logger":"leader","msg":"Found existing lock with my name. I was likely restarted."}
{"level":"info","ts":1574766564.7679055,"logger":"leader","msg":"Continuing as the leader."}
{"level":"error","ts":1574766578.6226292,"logger":"cmd","msg":"Exposing metrics port failed.","Namespace":"","error":"failed to create or get service for metrics: services \"kubevirt-ssp-operator-metrics\" is forbidden: User \"system:serviceaccount:openshift-cnv:kubevirt-ssp-operator\" cannot update resource \"services\" in API group \"\" in the namespace \"openshift-cnv\"","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/operator-framework/operator-sdk/pkg/ansible.Run\n\tsrc/github.com/operator-framework/operator-sdk/pkg/ansible/run.go:153\ngithub.com/operator-framework/operator-sdk/cmd/operator-sdk/run.newRunAnsibleCmd.func1\n\tsrc/github.com/operator-framework/operator-sdk/cmd/operator-sdk/run/ansible.go:38\ngithub.com/spf13/cobra.(*Command).execute\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/spf13/cobra/command.go:826\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/spf13/cobra/command.go:914\ngithub.com/spf13/cobra.(*Command).Execute\n\tsrc/github.com/operator-framework/operator-sdk/vendor/github.com/spf13/cobra/command.go:864\nmain.main\n\tsrc/github.com/operator-framework/operator-sdk/cmd/operator-sdk/main.go:84\nruntime.main\n\t/opt/rh/go-toolset-1.12/root/usr/lib/go-toolset-1.12-golang/src/runtime/proc.go:200"}

Error: failed to create or get service for metrics: services "kubevirt-ssp-operator-metrics" is forbidden: User "system:serviceaccount:openshift-cnv:kubevirt-ssp-operator" cannot update resource "services" in API group "" in the namespace "openshift-cnv"

Usage:
  operator-sdk run ansible [flags]

Flags:
      --ansible-verbosity int            Ansible verbosity. Overridden by environment variable. (default 2)
  -h, --help                             help for ansible
      --inject-owner-ref                 The ansible operator will inject owner references unless this flag is false (default true)
      --max-workers int                  Maximum number of workers to use. Overridden by environment variable. (default 1)
      --reconcile-period duration        Default reconcile period for controllers (default 1m0s)
      --watches-file string              Path to the watches file to use (default "./watches.yaml")
      --zap-devel                        Enable zap development mode (changes defaults to console encoder, debug log level, and disables sampling)
      --zap-encoder encoder              Zap log encoding ('json' or 'console')
      --zap-level level                  Zap log level (one of 'debug', 'info', 'error' or any integer value > 0) (default info)
      --zap-sample sample                Enable zap log sampling. Sampling will be disabled for integer log levels > 1
      --zap-time-encoding timeEncoding   Sets the zap time format ('epoch', 'millis', 'nano', or 'iso8601') (default )

Global Flags:
      --verbose   Enable verbose logging
The fix is in kubevirt-ssp-operator-container-v2.2.0-11.
All pods are up and running, and the SSP logs are clean. All logs are attached in the previous comment.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:0307