Bug 2083049

Summary: smartclone-controller not started and cloned DataVolumes stuck in SnapshotForSmartCloneInProgress
Product: Container Native Virtualization (CNV)
Reporter: Juan Orti <jortialc>
Component: Storage
Assignee: Alexander Wels <awels>
Status: CLOSED WONTFIX
QA Contact: Yan Du <yadu>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.8.1
CC: alitke, awels, cnv-qe-bugs, mrashish, yadu
Target Milestone: ---
Target Release: 4.8.7
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2085457
Environment:
Last Closed: 2022-06-10 15:04:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2085457, 2085459

Description Juan Orti 2022-05-09 07:45:48 UTC
Description of problem:
If any error (such as an etcd timeout) occurs while cdi-controller is scanning the CRDs, it can fail to detect that a Snapshot class is associated with a Storage class, and as a result it does not start the smartclone-controller.

This will cause problems when cloning DataVolumes, which get stuck in the SnapshotForSmartCloneInProgress phase.

Moreover, no log message is recorded when rescanning the CRDs fails, so the issue is hard to identify.

Version-Release number of selected component (if applicable):
OCP 4.8.29
OCP-V 4.8.1
registry.redhat.io/container-native-virtualization/virt-cdi-operator@sha256:a72dc80b1b578a00ecc5043ad3311ec57a2814a6a8d7ea5cda4f08eaa6d6eaf2

How reproducible:
Happened once in a customer's environment. It could be difficult to reproduce, as it requires an etcd timeout while cdi-controller is retrieving the CRDs.

Steps to Reproduce:
Not reproduced yet.

Actual results:
smartclone-controller not started. Cloned DataVolume stuck in SnapshotForSmartCloneInProgress.

Expected results:
Smart cloning working properly.

Additional info:
PR already submitted upstream:
https://github.com/kubevirt/containerized-data-importer/pull/2265

Comment 2 Adam Litke 2022-05-18 20:20:42 UTC
Please create a 4.11 clone for this as well...

Comment 3 Alexander Wels 2022-05-18 20:30:38 UTC
Why? This was merged in main before the 4.11 release branch was created, so it is part of 4.11 already.

Comment 4 Maya Rashish 2022-06-07 00:27:29 UTC
Edited wrong bug, can't use 4.9 for target release 4.8.

Comment 6 Alexander Wels 2022-06-10 15:04:00 UTC
Didn't make 4.8.6 and no new releases scheduled.

Comment 7 Red Hat Bugzilla 2023-09-15 01:54:35 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days