Bug 2083049 - smartclone-controller not started and cloned DataVolumes stuck in SnapshotForSmartCloneInProgress
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.8.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.8.7
Assignee: Alexander Wels
QA Contact: Yan Du
URL:
Whiteboard:
Depends On:
Blocks: 2085457 2085459
Reported: 2022-05-09 07:45 UTC by Juan Orti
Modified: 2025-12-26 13:01 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2085457
Environment:
Last Closed: 2022-06-10 15:04:00 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt containerized-data-importer pull 2265 | 0 | None | Merged | Start smart clone controller from datavolume controller when needed | 2022-06-07 00:26:23 UTC
Red Hat Knowledge Base (Solution) | 6957008 | 0 | None | None | None | 2022-05-09 07:45:47 UTC

Description Juan Orti 2022-05-09 07:45:48 UTC
Description of problem:
If any error occurs (such as an etcd timeout) while cdi-controller is scanning the CRDs, it can fail to detect that a Snapshot class is associated with a Storage class, and as a result it never starts the smartclone-controller.

This will cause problems when cloning DataVolumes, which get stuck in the SnapshotForSmartCloneInProgress phase.

Moreover, no log message is recorded when rescanning the CRDs fails, so the issue is hard to identify.

Version-Release number of selected component (if applicable):
OCP 4.8.29
OCP-V 4.8.1
registry.redhat.io/container-native-virtualization/virt-cdi-operator@sha256:a72dc80b1b578a00ecc5043ad3311ec57a2814a6a8d7ea5cda4f08eaa6d6eaf2

How reproducible:
Happened once in a customer's environment. It could be difficult to reproduce because it requires an etcd timeout while cdi-controller is retrieving the CRDs.

Steps to Reproduce:
Not reproduced yet.

Actual results:
smartclone-controller not started. Cloned DataVolume stuck in SnapshotForSmartCloneInProgress.

Expected results:
Smart cloning working properly.

Additional info:
PR already submitted upstream:
https://github.com/kubevirt/containerized-data-importer/pull/2265

Comment 2 Adam Litke 2022-05-18 20:20:42 UTC
Please create a 4.11 clone for this as well...

Comment 3 Alexander Wels 2022-05-18 20:30:38 UTC
Why? This was merged in main before the 4.11 release branch was created, so it is part of 4.11 already.

Comment 4 Maya Rashish 2022-06-07 00:27:29 UTC
Edited wrong bug, can't use 4.9 for target release 4.8.

Comment 6 Alexander Wels 2022-06-10 15:04:00 UTC
Didn't make 4.8.6 and no new releases scheduled.

Comment 7 Red Hat Bugzilla 2023-09-15 01:54:35 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days

