Created attachment 1662654: full cdi-operator log

Description of problem:
I see the following error in the CDI operator log:

{"level":"error","ts":1581507990.0780861,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"cdi-operator-controller","request":"/cdi-kubevirt-hyperconverged","error":"ClusterRole.rbac.authorization.k8s.io \"cdi-apiserver\" is invalid: [metadata.labels: Invalid value: \"sha256:2ca28952fb8da452da9fefc2db9a22bd813c9831ac5e93cd6f5489bbc795b32f\": must be no more than 63 characters, metadata.labels: Invalid value: \"sha256:2ca28952fb8da452da9fefc2db9a22bd813c9831ac5e93cd6f5489bbc795b32f\": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')]","stacktrace":"kubevirt.io/containerized-data-importer/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/github.com/go-logr/zapr/zapr.go:128\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:258\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nkubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nkubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nkubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nkubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Version-Release number of selected component (if applicable):
virt-cdi-operator-container-v2.3.0-31

How reproducible:
100%

Steps to Reproduce:
1. Deploy cnv-2.3 from rh-osbs-operators

Actual results:
CDI fails to deploy.

Expected results:
CDI is ready.

Additional info:
I did some digging, and it turns out we are passing the sha of the container image as part of the operator deployment. Here is the example from upstream:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cdi-operator
  namespace: cdi
spec:
  replicas: 1
  selector:
    matchLabels:
      name: cdi-operator
      operator.cdi.kubevirt.io: ""
  strategy: {}
  template:
    metadata:
      labels:
        name: cdi-operator
        operator.cdi.kubevirt.io: ""
    spec:
      containers:
      - env:
        - name: DEPLOY_CLUSTER_RESOURCES
          value: "true"
        - name: OPERATOR_VERSION
          value: v1.12.0   <---------------- this is the one that contains the sha, and it shouldn't
        - name: CONTROLLER_IMAGE
          value: kubevirt/cdi-controller:v1.12.0
        - name: IMPORTER_IMAGE
          value: kubevirt/cdi-importer:v1.12.0
        - name: CLONER_IMAGE
          value: kubevirt/cdi-cloner:v1.12.0
        - name: APISERVER_IMAGE
          value: kubevirt/cdi-apiserver:v1.12.0
        - name: UPLOAD_SERVER_IMAGE
          value: kubevirt/cdi-uploadserver:v1.12.0
        - name: UPLOAD_PROXY_IMAGE
          value: kubevirt/cdi-uploadproxy:v1.12.0
        - name: VERBOSITY
          value: "1"
        - name: PULL_POLICY
          value: IfNotPresent
        image: kubevirt/cdi-operator:v1.12.0
        imagePullPolicy: IfNotPresent
        name: cdi-operator
        ports:
        - containerPort: 60000
          name: metrics
          protocol: TCP
        resources: {}
        securityContext:
          runAsNonRoot: true
      serviceAccountName: cdi-operator

I see how that happened: the rest of the values are container images, and the tags are the same. However, for CNV it should be v2.3.x.
(In reply to Alexander Wels from comment #1)
> I did some digging, and it turns out we are passing the sha of the container
> image as part of the operator deployment. Here is the example from upstream:

Yes, correct: we are supposed to use only sha digests and not plain tags. This is required by the mirroring mechanism for disconnected installs since OCP 4.3.1.

A very similar issue has already been fixed on the CNAO side, see:
https://github.com/kubevirt/cluster-network-addons-operator/pull/288
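To illustrate the difference between a plain tag and a sha digest in an image reference, here is a minimal Go sketch. The function name `isDigestPinned` is hypothetical and the check is deliberately simple (it only recognizes the `name@sha256:<64 hex characters>` form); it is not the parser used by any of the operators or by the mirroring tooling.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matches a sha256 digest: the algorithm name plus 64 lowercase hex characters.
var sha256Digest = regexp.MustCompile(`^sha256:[a-f0-9]{64}$`)

// isDigestPinned is a hypothetical helper that reports whether ref has the
// form name@sha256:<digest> rather than name:tag.
func isDigestPinned(ref string) bool {
	i := strings.LastIndex(ref, "@")
	if i < 0 {
		return false // no digest separator, so this is a tag or a bare name
	}
	return sha256Digest.MatchString(ref[i+1:])
}

func main() {
	refs := []string{
		// Upstream style: plain tag.
		"kubevirt/cdi-controller:v1.12.0",
		// Downstream style: pinned by digest.
		"registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-operator@sha256:2ca28952fb8da452da9fefc2db9a22bd813c9831ac5e93cd6f5489bbc795b32f",
	}
	for _, r := range refs {
		fmt.Printf("%t  %s\n", isDigestPinned(r), r)
	}
}
```

The part after the `@` is exactly the string that showed up as the invalid label value, which is why a value derived from the image reference no longer works once tags are replaced by digests.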
So this particular variable tells us which version the operator is. It needs to be semver compatible and simply cannot contain shas. It has NOTHING to do with container images; it's the version of the operator. We don't release a CNV version called sha256:something, do we? It should be the version of CNV.
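A quick way to see why a digest cannot serve as OPERATOR_VERSION is to test it against a semver pattern. The sketch below is an assumption-laden illustration: the regular expression is a simplified form of Semantic Versioning 2.0.0 that also tolerates a leading "v", and `isSemverLike` is a hypothetical name, not the check the CDI operator performs.

```go
package main

import (
	"fmt"
	"regexp"
)

// Simplified semver pattern (assumption): optional "v", MAJOR.MINOR.PATCH
// without leading zeros, optional pre-release and build metadata suffixes.
var semverRegexp = regexp.MustCompile(
	`^v?(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// isSemverLike is a hypothetical helper that reports whether version looks
// like a semantic version under the simplified pattern above.
func isSemverLike(version string) bool {
	return semverRegexp.MatchString(version)
}

func main() {
	candidates := []string{
		"v1.12.0", // upstream operator version
		"v2.3.0",  // CNV version
		"sha256:2ca28952fb8da452da9fefc2db9a22bd813c9831ac5e93cd6f5489bbc795b32f",
	}
	for _, c := range candidates {
		fmt.Printf("%t  %s\n", isSemverLike(c), c)
	}
}
```

A check along these lines at CSV generation time would reject the digest before it reaches the operator deployment.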
OK, in that case I can easily fix it by passing the CNV version to the CSV generator.
Simone, can you please link the PR that fixes this into the bug?
I can confirm that CDI was deployed and that the CDI operator logs contain no errors.

cdi-apiserver-6cbb564b9c-tt2nx     1/1   Running   0   73s
cdi-deployment-6d78f4db86-cx54m    1/1   Running   0   72s
cdi-operator-68466ff7f8-js8b5      1/1   Running   0   2m8s
cdi-uploadproxy-675d59b8b-kw8gm    1/1   Running   0   73s

        - name: DEPLOY_CLUSTER_RESOURCES
          value: "true"
        - name: OPERATOR_VERSION
          value: v2.3.0
        - name: CONTROLLER_IMAGE
          value: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-controller@sha256:7b37b5b3b664839f448b300f1293b3d1d5a1d8d8aa1c23d129917dbc40985e2b
        - name: IMPORTER_IMAGE
          value: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-importer@sha256:f7bd104d73331a4338075311030083b81a6fa9f697cf15280f35bc50c8a73d11
        - name: CLONER_IMAGE
          value: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-cloner@sha256:f28a7cf083a7f81bb6f538cb2415b5369f5280f5b12ea30c8764f9a8bd05ba67
        - name: APISERVER_IMAGE
          value: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-apiserver@sha256:efc69f852668ac9535718b68f59a9a89955834c973699cc61fe364fb4e7e7565
        - name: UPLOAD_SERVER_IMAGE
          value: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-uploadserver@sha256:ab094d96049cd3cbe42644f8022c543d9ac7aa584f82ba23587045d661a02f25
        - name: UPLOAD_PROXY_IMAGE
          value: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-uploadproxy@sha256:ddb4db6080442d0b87d7ed537f9b804e82f815ff59b33626709354dcfcabbfc0
        - name: VERBOSITY
          value: "1"
        - name: PULL_POLICY
          value: IfNotPresent
        image: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-cdi-operator@sha256:2ca28952fb8da452da9fefc2db9a22bd813c9831ac5e93cd6f5489bbc795b32f
So the problem seems to manifest itself when CNV is deployed in the openshift-operators namespace. That this is possible at all is a separate issue and is being fixed elsewhere, but to verify that this problem is fixed we should test that deployment still works even when we deploy in openshift-operators.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:2011