Description of problem:
If any error (such as an etcd timeout) occurs while cdi-controller is scanning the CRDs, it can fail to detect that a Snapshot class is associated with a Storage class, and as a result it does not start the smartclone-controller. Cloned DataVolumes then get stuck in the SnapshotForSmartCloneInProgress phase. Moreover, no log message is recorded when rescanning the CRDs fails, so the issue is hard to identify.

Version-Release number of selected component (if applicable):
OCP 4.8.29
OCP-V 4.8.1
registry.redhat.io/container-native-virtualization/virt-cdi-operator@sha256:a72dc80b1b578a00ecc5043ad3311ec57a2814a6a8d7ea5cda4f08eaa6d6eaf2

How reproducible:
Happened once in the customer's environment. It could be difficult to reproduce, as it requires an etcd timeout while cdi-controller is retrieving the CRDs.

Steps to Reproduce:
Not reproduced yet.

Actual results:
smartclone-controller is not started. The cloned DataVolume is stuck in SnapshotForSmartCloneInProgress.

Expected results:
Smart cloning works properly.

Additional info:
PR already submitted upstream: https://github.com/kubevirt/containerized-data-importer/pull/2265
Please create a 4.11 clone for this as well...
Why? This was merged in main before the 4.11 release branch was created, so it is part of 4.11 already.
Edited wrong bug, can't use 4.9 for target release 4.8.
Didn't make 4.8.6 and no new releases scheduled.