Bug 2195989
| Summary: | timeout while waiting for condition: "error preparing volumesnapshots" | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | David Vaanunu <dvaanunu> | 
| Component: | csi-driver | Assignee: | Nobody <nobody> | 
| Status: | CLOSED ERRATA | QA Contact: | krishnaram Karthick <kramdoss> | 
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.12 | CC: | bniver, kbg, muagarwa, ocs-bugs, odf-bz-bot, sostapov | 
| Target Milestone: | --- | ||
| Target Release: | ODF 4.12.4 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.12.4-1 | Doc Type: | Bug Fix | 
| Doc Text: | Previously, stale RADOS block device (RBD) images were left in the cluster because deleting the RBD image failed with a "numerical result out of range" error. With this fix, the number of trash entries that go-ceph can list is increased, so stale RBD images are no longer left behind in the Ceph cluster (a sketch of the retry pattern follows the table). | Story Points: | --- | 
| Clone Of: | | Environment: | |
| Last Closed: | 2023-06-14 21:20:41 UTC | Type: | Bug | 
| Regression: | --- | Mount Type: | --- | 
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
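The Doc Text above refers to the size of the trash entries list that go-ceph retrieves. The sketch below is a hypothetical illustration (not the actual go-ceph or ceph-csi code) of the underlying pattern: when a C-style list call reports ERANGE ("numerical result out of range") because the supplied buffer is too small, retry with the size the call reported instead of failing. `listTrash`, `trashEntry`, and the counts are made-up placeholders; the real fix has the same effect of letting the caller see all trash entries so stale RBD images can be cleaned up instead of aborting on the error.

```go
// Hypothetical sketch of the "retry with a larger buffer on ERANGE" pattern.
package main

import (
	"errors"
	"fmt"
)

// errRange stands in for the ERANGE error surfaced as
// "numerical result out of range".
var errRange = errors.New("numerical result out of range")

// trashEntry is a placeholder for an RBD trash record.
type trashEntry struct{ ID, Name string }

// listTrash is a hypothetical wrapper around a C-style API: it fills buf,
// returns how many entries exist in total, and errRange if buf was too small.
func listTrash(buf []trashEntry) (int, error) {
	const total = 2500 // pretend the trash holds 2500 images
	if len(buf) < total {
		return total, errRange
	}
	for i := 0; i < total; i++ {
		buf[i] = trashEntry{ID: fmt.Sprintf("img-%d", i)}
	}
	return total, nil
}

// getTrashList grows its buffer until the whole trash list fits, instead of
// failing when the initial guess turns out to be too small.
func getTrashList() ([]trashEntry, error) {
	size := 1000 // initial guess
	for {
		buf := make([]trashEntry, size)
		n, err := listTrash(buf)
		if errors.Is(err, errRange) {
			size = n // the call reported how many entries there really are
			continue
		}
		if err != nil {
			return nil, err
		}
		return buf[:n], nil
	}
}

func main() {
	entries, err := getTrashList()
	if err != nil {
		fmt.Println("trash list failed:", err)
		return
	}
	fmt.Println("trash entries:", len(entries))
}
```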
Not a 4.13 blocker.

Moving the bug to verified based on the regression run on 4.12.4-1: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/7951/

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.12.4 security and Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3609
Description of problem (please be detailed as possible and provide log snippets):

During OADP testing (backup & restore), the restore flow fails with errors in the OADP logs about VolumeSnapshots:

time="2023-05-04T05:32:17Z" level=error msg="Namespace perf-busy-data-cephrbd-50pods, resource restore error: error preparing volumesnapshots.snapshot.storage.k8s.io/perf-busy-data-cephrbd-50pods/velero-pvc-busy-data-rbd-50pods-1-szwqr: rpc error: code = Unknown desc = timed out waiting for the condition" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:498" restore=openshift-adp/dm-restore-rbd-50pvs-cc50-iter4

time="2023-05-04T05:29:07Z" level=error msg="Timed out awaiting reconciliation of volumesnapshotrestoreList" cmd=/plugins/velero-plugin-for-vsm logSource="/remote-source/app/internal/util/util.go:393" pluginName=velero-plugin-for-vsm restore=openshift-adp/dm-restore-rbd-50pvs-cc50-iter4

Version of all relevant components (if applicable):
OCP 4.12.9
ODF 4.12.2
OADP 1.2.0-63

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes. The tests fail and cannot complete a full cycle of backup & restore.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?
Yes

Can this issue be reproduced from the UI?
No

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Create a namespace with a few PVs (plus data).
2. Run an OADP backup; it ends with 'Completed' status.
3. Delete the namespace.
4. Run an OADP restore.

Actual results:
The restore fails with 'PartiallyFailed' status.

Expected results:
The restore should succeed ('Completed' status).

Additional info:
A sketch for manually checking the VolumeSnapshot readiness condition that times out here follows below.
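For reference, the "timed out waiting for the condition" errors above correspond to the restore waiting for the restored VolumeSnapshot to become ready. Below is a minimal sketch, assuming the external-snapshotter v6 typed client, of how one might poll that condition while debugging; the kubeconfig path is a placeholder and the namespace/snapshot names are taken from the log lines above.

```go
// Poll a VolumeSnapshot until status.readyToUse is true or a timeout expires.
package main

import (
	"context"
	"fmt"
	"time"

	snapclient "github.com/kubernetes-csi/external-snapshotter/client/v6/clientset/versioned"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Placeholder inputs matching the failing restore in this report.
	kubeconfig := "/path/to/kubeconfig"
	namespace := "perf-busy-data-cephrbd-50pods"
	name := "velero-pvc-busy-data-rbd-50pods-1-szwqr"

	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	cs, err := snapclient.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
	defer cancel()

	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()

	for {
		vs, err := cs.SnapshotV1().VolumeSnapshots(namespace).Get(ctx, name, metav1.GetOptions{})
		switch {
		case err != nil:
			fmt.Println("get VolumeSnapshot:", err)
		case vs.Status != nil && vs.Status.ReadyToUse != nil && *vs.Status.ReadyToUse:
			fmt.Println("VolumeSnapshot is ready to use")
			return
		case vs.Status != nil && vs.Status.Error != nil && vs.Status.Error.Message != nil:
			fmt.Println("snapshot reports error:", *vs.Status.Error.Message)
		}

		select {
		case <-ctx.Done():
			fmt.Println("timed out waiting for the condition")
			return
		case <-ticker.C:
		}
	}
}
```

Running something like this against the restore namespace while the restore is in progress shows whether the snapshot ever reaches ReadyToUse or stays stuck with an error in its status.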