Description of problem:
------------------------
When the HCO CR is deleted and then created again, the HCO CR does not reach a stable condition.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
CNV 4.11 ( bundle - v4.11.0-360 )
Index Image - registry-proxy.engineering.redhat.com/rh-osbs/iib:233912
hyperconverged-cluster-operator - v4.11.0-65
OCP 4.11 nightly ( 4.11.0-0.nightly-2022-05-11-054135 )

How reproducible:
-----------------
Always

Steps to Reproduce:
-------------------
1. Set the 'uninstallStrategy' of HCO CR to 'RemoveWorkloads'
   # oc edit hco kubevirt-hyperconverged -n openshift-cnv
2. Remove the HCO CR
   # oc delete hco kubevirt-hyperconverged -n openshift-cnv
3. After the 'oc delete' command completes successfully, create the HCO CR ( "HyperConverged" ) from the web-console ( Installed Operators -> create HyperConverged )

Actual results:
---------------
HCO CR never reaches a stable condition
<snip>
  {
    "lastTransitionTime": "2022-05-18T05:03:33Z",
    "message": "SSP is progressing: Error: the server could not find the requested resource (post datasources.cdi.kubevirt.io)",
    "observedGeneration": 2,
    "reason": "SSPProgressing",
    "status": "True",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2022-05-18T05:03:33Z",
    "message": "SSP is degraded: Error: the server could not find the requested resource (post datasources.cdi.kubevirt.io)",
    "observedGeneration": 2,
    "reason": "SSPDegraded",
    "status": "True",
    "type": "Degraded"
  },
</snip>

Expected results:
-----------------
HCO CR should reach a stable condition after it is recreated

Additional info:
-----------------
SSP is stuck in the Deploying phase

[ ~]$ oc get ssps ssp-kubevirt-hyperconverged -n openshift-cnv
NAME                          PHASE
ssp-kubevirt-hyperconverged   Deploying

[ ~]$ oc get ssps ssp-kubevirt-hyperconverged -n openshift-cnv
NAME                          PHASE
ssp-kubevirt-hyperconverged   Deploying

[ ~]$ oc get ssps ssp-kubevirt-hyperconverged -n openshift-cnv -o json | jq '.status'
{
  "conditions": [
    {
      "lastHeartbeatTime": "2022-05-18T05:24:40Z",
      "lastTransitionTime": "2022-05-18T05:03:32Z",
      "message": "Error: the server could not find the requested resource (post datasources.cdi.kubevirt.io)",
      "reason": "Available",
      "status": "False",
      "type": "Available"
    },
    {
      "lastHeartbeatTime": "2022-05-18T05:24:40Z",
      "lastTransitionTime": "2022-05-18T05:03:32Z",
      "message": "Error: the server could not find the requested resource (post datasources.cdi.kubevirt.io)",
      "reason": "Progressing",
      "status": "True",
      "type": "Progressing"
    },
    {
      "lastHeartbeatTime": "2022-05-18T05:24:40Z",
      "lastTransitionTime": "2022-05-18T05:03:32Z",
      "message": "Error: the server could not find the requested resource (post datasources.cdi.kubevirt.io)",
      "reason": "Degraded",
      "status": "True",
      "type": "Degraded"
    }
  ],
  "observedGeneration": 1,
  "operatorVersion": "4.11.0",
  "phase": "Deploying",
  "targetVersion": "4.11.0"
}
Created attachment 1880759 [details] CDI operator logs
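For anyone re-checking this, the "stable condition" in the description can be tested mechanically against the `.status` JSON of the HCO or SSP CR. The sketch below is only an illustration: it assumes that "stable" means Available=True, Progressing=False and Degraded=False, which is not spelled out anywhere in this bug, and the helper name `is_stable` is made up. The sample input is the SSP status from the description, trimmed to the fields the check reads.

```python
import json


def is_stable(status: dict) -> bool:
    """Return True when the conditions match the assumed 'stable' definition.

    Assumption (not stated in the bug): stable means Available=True,
    Progressing=False and Degraded=False. A missing condition counts as
    not stable.
    """
    wanted = {"Available": "True", "Progressing": "False", "Degraded": "False"}
    seen = {c["type"]: c["status"] for c in status.get("conditions", [])}
    return all(seen.get(ctype) == value for ctype, value in wanted.items())


# SSP status from the description, reduced to type/status per condition.
ssp_status = json.loads("""
{
  "conditions": [
    {"type": "Available", "status": "False"},
    {"type": "Progressing", "status": "True"},
    {"type": "Degraded", "status": "True"}
  ],
  "phase": "Deploying"
}
""")

print(is_stable(ssp_status))  # the stuck SSP above is reported as not stable
```

The same function could be fed the output of `oc get ssps ssp-kubevirt-hyperconverged -n openshift-cnv -o json | jq '.status'` saved to a file.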
The underlying problem is that cdi-operator doesn't remove its secrets and configmaps during the CDI deletion process. Then, on reinstallation, the leftover secrets and configmaps prevent cdi-operator from proceeding, because it treats them as orphan objects, e.g.

{"level":"error","ts":1652795514.7434072,"logger":"cdi-operator","msg":"error getting apiserver ca bundle","error":"ConfigMap \"cdi-apiserver-signer-bundle\" not found","stacktrace":"kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.getAPIServerCABundle\n\t/remote-source/app/pkg/operator/resources/cluster/apiserver.go:542\nkubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createDataImportCronValidatingWebhook\n\t/remote-source/app/pkg/operator/resources/cluster/apiserver.go:244\nkubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createDynamicAPIServerResources\n\t/remote-source/app/pkg/operator/resources/cluster/apiserver.go:57\nkubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createResourceGroup\n\t/remote-source/app/pkg/operator/resources/cluster/factory.go:102\nkubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createAllResources\n\t/remote-source/app/pkg/operator/resources/cluster/factory.go:88\nkubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.CreateAllDynamicResources\n\t/remote-source/app/pkg/operator/resources/cluster/factory.go:77\nkubevirt.io/containerized-data-importer/pkg/operator/controller.(*ReconcileCDI).GetAllResources\n\t/remote-source/app/pkg/operator/controller/cr-manager.go:126\nkubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler.(*Reconciler).CheckForOrphans\n\t/remote-source/app/vendor/kubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler/reconciler.go:363\nkubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler.(*Reconciler).Reconcile\n\t/remote-source/app/vendor/kubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler/reconciler.go:152\nkubevirt.io/containerized-data-importer/pkg/operator/controller.(*ReconcileCDI).Reconcile\n\t/remote-source/app/pkg/operator/controller/controller.go:236\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}

{"level":"info","ts":1652795514.7443233,"logger":"cdi-operator","msg":"Orphan object exists","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","obj":{"apiVersion":"v1","kind":"Secret","namespace":"openshift-cnv","name":"cdi-uploadserver-client-signer"}}

The workaround is to manually delete the CDI secrets and configmaps, so that cdi-operator can complete the reconciliation.

Moving to CDI team.
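To see which leftover objects block the reconcile, the "Orphan object exists" entries can be pulled out of the attached cdi-operator log. This is a minimal sketch, assuming the log is one JSON object per line as in the excerpts above; the function name `orphan_objects` is illustrative and not part of CDI.

```python
import json


def orphan_objects(log_lines):
    """Collect (kind, namespace, name) from 'Orphan object exists' log entries.

    Lines that are empty or not valid JSON objects are skipped, and each
    object is reported once, in order of first appearance.
    """
    found = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict) or entry.get("msg") != "Orphan object exists":
            continue
        obj = entry.get("obj", {})
        item = (obj.get("kind"), obj.get("namespace"), obj.get("name"))
        if item not in found:
            found.append(item)
    return found


# The orphan entry quoted in the comment above.
sample = [
    '{"level":"info","ts":1652795514.7443233,"logger":"cdi-operator",'
    '"msg":"Orphan object exists","Request.Namespace":"",'
    '"Request.Name":"cdi-kubevirt-hyperconverged",'
    '"obj":{"apiVersion":"v1","kind":"Secret","namespace":"openshift-cnv",'
    '"name":"cdi-uploadserver-client-signer"}}',
]

for kind, namespace, name in orphan_objects(sample):
    print(kind, namespace, name)
```

Because the operator appears to stop at the first orphan it finds, a single pass over the log may list only one object at a time.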
Looking at the CDI operator logs, we see that shortly after the deletion of the CDI CR, the CDI operator recreates the secrets and config maps:

{"level":"info","ts":1652864475.2948112,"logger":"cdi-operator","msg":"Reconciling CDI","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged"}
{"level":"info","ts":1652864475.2948842,"logger":"cdi-operator","msg":"Doing reconcile update","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged"}
{"level":"info","ts":1652864476.3855922,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-insecure-registries","type":"*v1.ConfigMap"}
{"level":"info","ts":1652864476.4569244,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadproxy","type":"*v1.ServiceAccount"}
{"level":"info","ts":1652864476.5323896,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadproxy","type":"*v1.RoleBinding"}
{"level":"info","ts":1652864476.5842297,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadproxy","type":"*v1.Role"}
{"level":"info","ts":1652864476.7492573,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-cronjob","type":"*v1.ServiceAccount"}
{"level":"info","ts":1652864476.7851727,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-cronjob","type":"*v1.RoleBinding"}
{"level":"info","ts":1652864476.8056452,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-cronjob","type":"*v1.Role"}
{"level":"info","ts":1652864476.8206687,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"v1beta1.upload.cdi.kubevirt.io","type":"*v1.APIService"}
{"level":"info","ts":1652864476.927518,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"v1alpha1.upload.cdi.kubevirt.io","type":"*v1.APIService"}
{"level":"info","ts":1652864476.9706755,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"cdi-api-datavolume-validate","type":"*v1.ValidatingWebhookConfiguration"}
{"level":"info","ts":1652864477.0325625,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"cdi-api-datavolume-mutate","type":"*v1.MutatingWebhookConfiguration"}
{"level":"info","ts":1652864477.0888171,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"cdi-api-validate","type":"*v1.ValidatingWebhookConfiguration"}
{"level":"info","ts":1652864477.1155927,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"objecttransfer-api-validate","type":"*v1.ValidatingWebhookConfiguration"}
{"level":"info","ts":1652864477.1779423,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"","name":"cdi-api-dataimportcron-validate","type":"*v1.ValidatingWebhookConfiguration"}
{"level":"info","ts":1652864477.2343497,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-apiserver-signer","type":"*v1.Secret"}
{"level":"info","ts":1652864477.2936044,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-apiserver-signer-bundle","type":"*v1.ConfigMap"}
{"level":"info","ts":1652864477.3253348,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-apiserver-server-cert","type":"*v1.Secret"}
{"level":"info","ts":1652864477.3650944,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadproxy-signer","type":"*v1.Secret"}
{"level":"info","ts":1652864477.3879237,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadproxy-signer-bundle","type":"*v1.ConfigMap"}
{"level":"info","ts":1652864477.414795,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadproxy-server-cert","type":"*v1.Secret"}
{"level":"info","ts":1652864477.4489758,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadserver-signer","type":"*v1.Secret"}
{"level":"info","ts":1652864477.4908574,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadserver-signer-bundle","type":"*v1.ConfigMap"}
{"level":"info","ts":1652864477.5296195,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadserver-client-signer","type":"*v1.Secret"}
{"level":"info","ts":1652864477.5725944,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadserver-client-signer-bundle","type":"*v1.ConfigMap"}
{"level":"info","ts":1652864477.6052103,"logger":"cdi-operator","msg":"Resource created","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","namespace":"kubevirt-hyperconverged","name":"cdi-uploadserver-client-cert","type":"*v1.Secret"}

and then it starts hot looping on:

{"level":"info","ts":1652864508.1824028,"logger":"cdi-operator","msg":"CDI CR does not exist","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged"}
{"level":"info","ts":1652865362.5791483,"logger":"cdi-operator","msg":"Reconciling CDI","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged"}
{"level":"info","ts":1652865362.5931349,"logger":"cdi-operator","msg":"Orphan object exists","Request.Namespace":"","Request.Name":"cdi-kubevirt-hyperconverged","obj":{"apiVersion":"v1","kind":"Secret","namespace":"kubevirt-hyperconverged","name":"cdi-apiserver-signer"}}

and it's never going to progress until those objects are explicitly removed (or everything is removed by deleting the namespace).
Tested with the workaround suggested by Simone and Oren of removing the CDI secrets and configmaps.

Removed CDI secrets:
[ ~]$ oc delete secret -n openshift-cnv cdi-apiserver-server-cert cdi-apiserver-signer cdi-uploadproxy-server-cert cdi-uploadproxy-signer cdi-uploadserver-client-cert cdi-uploadserver-client-signer cdi-uploadserver-signer

Removed CDI configmaps:
[ ~]$ oc delete cm -n openshift-cnv cdi-apiserver-signer-bundle cdi-uploadproxy-signer-bundle cdi-uploadserver-client-signer-bundle cdi-uploadserver-signer-bundle

Creating the HCO CR after deleting the secrets and configmaps works well, and it reaches a stable condition.
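If the workaround has to be repeated on several clusters, the two commands can be generated from one list of names. The sketch below only builds and prints the command lines (a dry run); it does not talk to the cluster. The object names and the `openshift-cnv` namespace are the ones used in this comment, while the helper name `cleanup_commands` is made up for the example.

```python
import shlex

# Names taken from the workaround commands in this comment.
SECRETS = [
    "cdi-apiserver-server-cert",
    "cdi-apiserver-signer",
    "cdi-uploadproxy-server-cert",
    "cdi-uploadproxy-signer",
    "cdi-uploadserver-client-cert",
    "cdi-uploadserver-client-signer",
    "cdi-uploadserver-signer",
]
CONFIGMAPS = [
    "cdi-apiserver-signer-bundle",
    "cdi-uploadproxy-signer-bundle",
    "cdi-uploadserver-client-signer-bundle",
    "cdi-uploadserver-signer-bundle",
]


def cleanup_commands(namespace="openshift-cnv"):
    """Build the two 'oc delete' argument lists used by the workaround."""
    commands = []
    for kind, names in (("secret", SECRETS), ("cm", CONFIGMAPS)):
        commands.append(["oc", "delete", kind, "-n", namespace, *names])
    return commands


# Dry run: print the commands instead of executing them.
for argv in cleanup_commands():
    print(shlex.join(argv))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` once the output has been reviewed.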
SSP team, can you take a look? It seems a fix on your end might do the work here.
Do you mean that SSP operator should remove the CDI Secrets and ConfigMaps? It seems unrelated to SSP.
I'm not sure it's the configmaps and the secrets; maybe they are just a side effect. I can see that SSP is still trying to read DataSource and DataImportCron after CDI and its CRDs are removed.
True, the ssp operator does not behave correctly when CRDs are removed. We can fix it, but I'm not sure if it will fix this bug.
Removing target release due to the change of component.
Deferring this due to capacity.
This bug may have been fixed by the same PR as Bug 2122236.
This bug can be verified in 4.12.
I've tried to verify this bug on my cluster with CNV 4.12.0-745, and it is still happening.

The problem is that after recreating HCO, the CDI CRDs are not created, and SSP is waiting for them (datasources.cdi.kubevirt.io, dataimportcrons.cdi.kubevirt.io, datavolumes.cdi.kubevirt.io).

There is an error in the cdi-operator log:
{
  "level": "error",
  "ts": 1669714626.7532284,
  "logger": "cdi-operator",
  "msg": "error getting apiserver ca bundle",
  "error": "ConfigMap \"cdi-apiserver-signer-bundle\" not found",
  "stacktrace": "{STACKTRACE_BELOW}"
}

This is the stacktrace:
kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.getAPIServerCABundle
    /remote-source/app/pkg/operator/resources/cluster/apiserver.go:553
kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createDataVolumeMutatingWebhook
    /remote-source/app/pkg/operator/resources/cluster/apiserver.go:541
kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createDynamicAPIServerResources
    /remote-source/app/pkg/operator/resources/cluster/apiserver.go:54
kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createResourceGroup
    /remote-source/app/pkg/operator/resources/cluster/factory.go:102
kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.createAllResources
    /remote-source/app/pkg/operator/resources/cluster/factory.go:88
kubevirt.io/containerized-data-importer/pkg/operator/resources/cluster.CreateAllDynamicResources
    /remote-source/app/pkg/operator/resources/cluster/factory.go:77
kubevirt.io/containerized-data-importer/pkg/operator/controller.(*ReconcileCDI).GetAllResources
    /remote-source/app/pkg/operator/controller/cr-manager.go:126
kubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler.(*Reconciler).CheckForOrphans
    /remote-source/app/vendor/kubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler/reconciler.go:363
kubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler.(*Reconciler).Reconcile
    /remote-source/app/vendor/kubevirt.io/controller-lifecycle-operator-sdk/pkg/sdk/reconciler/reconciler.go:152
kubevirt.io/containerized-data-importer/pkg/operator/controller.(*ReconcileCDI).Reconcile
    /remote-source/app/pkg/operator/controller/controller.go:236
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
    /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227
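A quick way to confirm the missing-CRD symptom during verification is to compare the three CRDs SSP waits for against the output of `oc get crd -o name`. The following is a rough sketch, assuming that output has the usual `customresourcedefinition.apiextensions.k8s.io/<name>` form, one entry per line; the function name `missing_crds` and the sample output are illustrative and were not captured from the cluster in this comment.

```python
# CRDs that SSP is waiting for, as listed in this comment.
REQUIRED_CRDS = (
    "datasources.cdi.kubevirt.io",
    "dataimportcrons.cdi.kubevirt.io",
    "datavolumes.cdi.kubevirt.io",
)


def missing_crds(get_crd_output, required=REQUIRED_CRDS):
    """Return the required CRD names that do not appear in the given output.

    Each non-empty line is reduced to the part after the last '/', so both
    'customresourcedefinition.apiextensions.k8s.io/<name>' and a bare
    '<name>' are accepted.
    """
    present = {
        line.strip().rsplit("/", 1)[-1]
        for line in get_crd_output.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in present]


# Hypothetical output of 'oc get crd -o name' on an affected cluster.
sample_output = """\
customresourcedefinition.apiextensions.k8s.io/hyperconvergeds.hco.kubevirt.io
customresourcedefinition.apiextensions.k8s.io/ssps.ssp.kubevirt.io
"""

for name in missing_crds(sample_output):
    print("missing:", name)
```

An empty result would suggest the CDI CRDs were recreated and that any remaining SSP failure has a different cause.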