Bug 2195989 - timeout during waiting for condition. "error preparing volumesnapshots"
Summary: timeout during waiting for condition. "error preparing volumesnapshots"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.12.4
Assignee: Nobody
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-05-07 09:31 UTC by David Vaanunu
Modified: 2023-08-09 16:37 UTC
CC List: 6 users

Fixed In Version: 4.12.4-1
Doc Type: Bug Fix
Doc Text:
Previously, stale RADOS block device (RBD) images were left in the cluster because deleting the RBD image failed with a "numerical result is out of range" error. With this fix, the number of trash entries that go-ceph can list is increased, so stale RBD images are no longer left in the Ceph cluster. (A go-ceph sketch follows the Links list below.)
Clone Of:
Environment:
Last Closed: 2023-06-14 21:20:41 UTC
Embargoed:


Attachments


Links
Github red-hat-storage/ceph-csi pull 162 (open): BUG 2195989: rebase: update go-ceph to v0.19.0 - last updated 2023-05-22 11:10:26 UTC
Red Hat Product Errata RHSA-2023:3609 - last updated 2023-06-14 21:21:03 UTC
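
The Doc Text above and the linked ceph-csi PR (rebase to go-ceph v0.19.0) both point at RBD trash listing in go-ceph. Below is a minimal, hypothetical Go sketch of that call path, not the actual ceph-csi code; the pool name and config file location are assumptions. With the older go-ceph, a trash list larger than what the library could return is where a "numerical result is out of range" (-ERANGE) style failure would surface.

// Hypothetical sketch: list RBD trash entries in a pool with go-ceph.
// The pool name "ocs-storagecluster-cephblockpool" and the default ceph.conf
// location are assumptions, not taken from this bug report.
package main

import (
	"fmt"
	"log"

	"github.com/ceph/go-ceph/rados"
	"github.com/ceph/go-ceph/rbd"
)

func main() {
	conn, err := rados.NewConn()
	if err != nil {
		log.Fatal(err)
	}
	// Read /etc/ceph/ceph.conf (or CEPH_CONF) for monitors and keyring.
	if err := conn.ReadDefaultConfigFile(); err != nil {
		log.Fatal(err)
	}
	if err := conn.Connect(); err != nil {
		log.Fatal(err)
	}
	defer conn.Shutdown()

	ioctx, err := conn.OpenIOContext("ocs-storagecluster-cephblockpool")
	if err != nil {
		log.Fatal(err)
	}
	defer ioctx.Destroy()

	// List images that were moved to the RBD trash instead of being removed.
	// This is the kind of call where a too-small trash listing limit in older
	// go-ceph could come back as an out-of-range error.
	trash, err := rbd.GetTrashList(ioctx)
	if err != nil {
		log.Fatalf("listing RBD trash failed: %v", err)
	}
	fmt.Printf("%d trash entries\n", len(trash))
	for _, entry := range trash {
		fmt.Printf("  %+v\n", entry)
	}
}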

Description David Vaanunu 2023-05-07 09:31:12 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

During OADP testing (backup & restore), the restore flow produces errors in
the OADP logs regarding VolumeSnapshots.

Errors:

time="2023-05-04T05:32:17Z" level=error msg="Namespace perf-busy-data-cephrbd-50pods, resource restore error: error preparing volumesnapshots.snapshot.storage.k8s.io/perf-busy-data-cephrbd-50pods/velero-pvc-busy-data-rbd-50pods-1-szwqr: rpc error: code = Unknown desc = timed out waiting for the condition" logSource="/remote-source/velero/app/pkg/controller/restore_controller.go:498" restore=openshift-adp/dm-restore-rbd-50pvs-cc50-iter4


time="2023-05-04T05:29:07Z" level=error msg="Timed out awaiting reconciliation of volumesnapshotrestoreList" cmd=/plugins/velero-plugin-for-vsm logSource="/remote-source/app/internal/util/util.go:393" pluginName=velero-plugin-for-vsm restore=openshift-adp/dm-restore-rbd-50pvs-cc50-iter4


Version of all relevant components (if applicable):

OCP 4.12.9
ODF 4.12.2
OADP 1.2.0-63

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Yes. The tests fail and a full cycle of backup & restore cannot be completed.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
no

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a namespace with a few PVs (+ data).
2. Run an OADP backup; it ends with 'Completed' status.
3. Delete the namespace.
4. Run an OADP restore (a Go sketch of these steps follows below).
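
As a rough illustration of steps 2 and 4, here is a hypothetical Go sketch that creates the Velero Backup and Restore custom resources that OADP acts on (group velero.io/v1, in the openshift-adp namespace seen in the logs). The object names, the backed-up namespace, and the kubeconfig handling are assumptions; in practice you wait for the Backup to reach 'Completed' and delete the namespace before creating the Restore.

// Hypothetical reproduction driver using the Kubernetes dynamic client.
// Names such as "dm-backup-rbd-50pvs" are examples, not taken from this bug.
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

var (
	backupGVR  = schema.GroupVersionResource{Group: "velero.io", Version: "v1", Resource: "backups"}
	restoreGVR = schema.GroupVersionResource{Group: "velero.io", Version: "v1", Resource: "restores"}
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	dyn, err := dynamic.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()

	// Step 2: back up the namespace holding the CephRBD-backed PVCs.
	backup := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "velero.io/v1",
		"kind":       "Backup",
		"metadata":   map[string]interface{}{"name": "dm-backup-rbd-50pvs", "namespace": "openshift-adp"},
		"spec": map[string]interface{}{
			"includedNamespaces": []interface{}{"perf-busy-data-cephrbd-50pods"},
		},
	}}
	if _, err := dyn.Resource(backupGVR).Namespace("openshift-adp").Create(ctx, backup, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	fmt.Println("backup created; wait for phase Completed, delete the namespace, then restore")

	// Step 4: restore from that backup once the namespace has been deleted.
	restore := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "velero.io/v1",
		"kind":       "Restore",
		"metadata":   map[string]interface{}{"name": "dm-restore-rbd-50pvs", "namespace": "openshift-adp"},
		"spec": map[string]interface{}{
			"backupName": "dm-backup-rbd-50pvs",
		},
	}}
	if _, err := dyn.Resource(restoreGVR).Namespace("openshift-adp").Create(ctx, restore, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
}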


Actual results:
The restore fails with 'PartiallyFailed' status.


Expected results:
The restore should succeed ('Completed' status).


Additional info:

Comment 2 Mudit Agarwal 2023-05-10 03:17:01 UTC
Not a 4.13 blocker

Comment 12 krishnaram Karthick 2023-06-01 11:54:33 UTC
Moving the bug to verified based on the regression run on 4.12.4-1 - https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/7951/

Comment 20 errata-xmlrpc 2023-06-14 21:20:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.12.4 security and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3609

