Bug 2308550 - ODF should alert when csi-clones exceed clone limits
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: arun kumar mohan
QA Contact: Harish NV Rao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-08-29 19:19 UTC by Jenifer Abrams
Modified: 2024-09-04 15:15 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Links: Red Hat Issue Tracker OCSBZM-8891 (last updated 2024-08-29 19:19:52 UTC)

Description Jenifer Abrams 2024-08-29 19:19:40 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
It is a known issue that creating hundreds of csi-clones from a single source can make clone performance exponentially slower because of clone/flattening limitations. The ODF recommendation is to update the docs to recommend a VolumeSnapshot cloning method at this scale. However, users should also be alerted when the cluster has exceeded these cloning limits, since currently the only symptom may be extremely slow clone performance.

More background here: https://issues.redhat.com/browse/CNV-41845

Version of all relevant components (if applicable):
4.x


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Without an alert, users do not understand why clones may take many hours to complete.

Is there any workaround available to the best of your knowledge?
The docs are being updated to recommend VolumeSnapshot-based cloning, but that is not the default clone strategy.
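For illustration, the VolumeSnapshot workaround can be sketched as follows: snapshot the source PVC once, then create each new PVC from that snapshot rather than from the source PVC. This is a minimal sketch and not the documented procedure. The function names, the default namespace and size, and the class names `ocs-storagecluster-rbdplugin-snapclass` and `ocs-storagecluster-ceph-rbd` are assumptions that should be checked against the cluster.

```python
import json

# Assumed defaults; verify the actual class names on the target cluster.
SNAPSHOT_CLASS = "ocs-storagecluster-rbdplugin-snapclass"
STORAGE_CLASS = "ocs-storagecluster-ceph-rbd"


def snapshot_manifest(source_pvc, snapshot_name, namespace="default"):
    """Build a VolumeSnapshot manifest that snapshots `source_pvc`."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": snapshot_name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": SNAPSHOT_CLASS,
            "source": {"persistentVolumeClaimName": source_pvc},
        },
    }


def restore_manifest(snapshot_name, pvc_name, size="1Gi", namespace="default"):
    """Build a PVC manifest that is provisioned from the snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name, "namespace": namespace},
        "spec": {
            "storageClassName": STORAGE_CLASS,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            # The data source is the snapshot, not the source PVC.
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
        },
    }


if __name__ == "__main__":
    items = [snapshot_manifest("source-pvc", "source-snap")]
    items += [restore_manifest("source-snap", f"restore-{i}") for i in range(1, 4)]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The requested size must be at least the size of the snapshotted volume.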

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?
Y

Can this issue be reproduced from the UI?
Y

If this is a regression, please provide more details to justify this:
N

Steps to Reproduce:
1. Create a source PVC
2. Create 200+ csi-clones of it
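The reproduction steps could be scripted roughly as below. This is a sketch under stated assumptions, not the reporter's actual procedure: the function name `clone_manifests`, the PVC names, the 1Gi size, the `default` namespace, and the storage class `ocs-storagecluster-ceph-rbd` are all illustrative. Each generated PVC names the source PVC as its `dataSource`, which is how a CSI clone is requested.

```python
import json


def clone_manifests(source_pvc, count, size="1Gi",
                    storage_class="ocs-storagecluster-ceph-rbd",  # assumed name
                    namespace="default"):
    """Build `count` PVC manifests that each clone `source_pvc` directly."""
    manifests = []
    for i in range(1, count + 1):
        manifests.append({
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {
                "name": f"{source_pvc}-clone-{i}",
                "namespace": namespace,
            },
            "spec": {
                "storageClassName": storage_class,
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": size}},
                # A dataSource of kind PersistentVolumeClaim requests a clone.
                "dataSource": {
                    "kind": "PersistentVolumeClaim",
                    "name": source_pvc,
                },
            },
        })
    return manifests


if __name__ == "__main__":
    # Emit a single List object containing 200 clone requests.
    items = clone_manifests("source-pvc", 200)
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

Assuming the script is saved as `make_clones.py` (a hypothetical name), its output can be piped to `oc apply -f -`. The source PVC must already exist in the same namespace, and the clone size must be at least the source size.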


Actual results:
Once the clone limit is reached, clones can take exponentially longer to complete.

Expected results:
Users are alerted when this limit is exceeded and advised to use VolumeSnapshot cloning instead.
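As a rough illustration of the kind of check such an alert could be based on, the sketch below counts how many PVCs name each source PVC as their `dataSource` and reports the sources above a threshold. It is not how ODF monitoring is implemented. The function name and the threshold of 200 are hypothetical (the bug states no exact limit; 200 only echoes the reproduction steps), and counting PVC objects only approximates the clone state inside Ceph.

```python
from collections import Counter

# Hypothetical threshold; the actual clone limit is not stated in this bug.
CLONE_LIMIT = 200


def sources_over_limit(pvcs, limit=CLONE_LIMIT):
    """Return {(namespace, source_pvc): clone_count} for sources above `limit`.

    `pvcs` is a list of PVC objects as dicts, for example the "items" list
    from `oc get pvc -A -o json`.
    """
    counts = Counter()
    for pvc in pvcs:
        data_source = pvc.get("spec", {}).get("dataSource") or {}
        # Only direct PVC-to-PVC clones count; snapshot restores are skipped.
        if data_source.get("kind") == "PersistentVolumeClaim":
            namespace = pvc.get("metadata", {}).get("namespace", "")
            counts[(namespace, data_source.get("name"))] += 1
    return {source: n for source, n in counts.items() if n > limit}
```

A caller would load the PVC list as JSON, pass its `items` to `sources_over_limit`, and warn for each returned source.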

Additional info:
Doc recommendation draft: https://docs.google.com/document/d/1_I5ayeVHtvP5Has1dpdGNkpPOUESZL0Gb-IfSi6O8sE/edit

