Description of problem:
The default limits permit three mirror snapshots per image. Once that limit is reached, the "limit - 1" mirror snapshot (in oldest-to-newest ordering) is removed. Normally the rbd-mirror daemon deletes all but the most recent snapshot once it has performed its sync. However, if the limit is reached while rbd-mirror is syncing between the oldest and next-oldest snapshots, the next-oldest snapshot is removed while it is in use, potentially leading to data corruption.

Version-Release number of selected component (if applicable):
4.2

How reproducible:
100% under a loaded system with new snapshots being generated

Steps to Reproduce:
1. Load the system so that snapshot pruning is occurring.

Actual results:
Potential for data corruption if the OSDs can act on the removed snapshot before the delta-sync completes. In upstream, it can lead to an assertion failure due to other bug fixes.

Expected results:
An in-use snapshot will not be removed.

Additional info:
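The pruning race described above can be illustrated with a small toy model. This is only a sketch, not the librbd or rbd-mirror implementation: the function names, the `in_use` set, and the choice to defer pruning are all hypothetical, and reading "limit - 1" as a 1-based position in oldest-to-newest order is an assumption taken from the description (it selects the next-oldest snapshot when the limit is three).

```python
from typing import List, Optional, Set


def pick_prune_candidate(snaps: List[str], limit: int = 3) -> Optional[str]:
    """Unguarded selection, as described in the report (hypothetical model).

    `snaps` is ordered oldest -> newest. Once `limit` snapshots exist, the
    snapshot at 1-based position `limit - 1` is chosen for removal.
    """
    if len(snaps) < limit:
        return None
    return snaps[limit - 2]  # 1-based position (limit - 1) -> 0-based index


def pick_prune_candidate_guarded(
    snaps: List[str], in_use: Set[str], limit: int = 3
) -> Optional[str]:
    """Guarded selection matching the expected result (hypothetical model).

    `in_use` holds the snapshots a sync is currently reading from or
    writing towards. If the candidate is in that set, pruning is deferred
    instead of removing a snapshot that is still needed.
    """
    candidate = pick_prune_candidate(snaps, limit)
    if candidate is not None and candidate in in_use:
        return None  # defer: never remove an in-use snapshot
    return candidate


if __name__ == "__main__":
    snaps = ["snap1", "snap2", "snap3"]  # oldest -> newest, limit reached
    syncing = {"snap1", "snap2"}         # delta-sync from snap1 to snap2
    print("unguarded:", pick_prune_candidate(snaps))
    print("guarded:  ", pick_prune_candidate_guarded(snaps, syncing))
```

In this model the unguarded function selects the next-oldest snapshot even though a sync is still using it, which is the condition the report describes. The guarded variant simply declines to prune until the sync has finished; other policies, such as picking a different candidate, would also satisfy the expected result.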
Thanks for the comments, Ilya. Will move to QA verified. QA did not see any backlogged snapshots while creating hundreds of snapshots and checking the status. Hence moving to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2445