Bug 2106575

Summary: [rbd-mirror] : snap ls --all : Lot of non_primary snapshots stuck in trash namespace since days
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Vasishta <vashastr>
Component: RBD-Mirror Assignee: Ilya Dryomov <idryomov>
Status: NEW --- QA Contact: Sunil Angadi <sangadi>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 5.2 CC: ceph-eng-bugs, cephqe-warriors, nibalach, sangadi
Target Milestone: --- Flags: sangadi: needinfo+
Target Release: 7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vasishta 2022-07-13 04:06:42 UTC
Description of problem:
1) Created two clusters and created 26 images on both sides with a 2-minute snapshot schedule, then ran some I/O.
Created 76 more images on one of the sites (site-a).

2) site-b had laggy OSDs and some PGs undersized by 1 replica. Scaled up and upgraded twice to the latest build of RHCS 5.2 (just 2-3 builds ahead).

4) On site-b, which had 102 non_primary images, many images had very old non_primary mirror snapshots stuck in the trash namespace.
http://pastebin.test.redhat.com/1064322
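
To quantify the observation above per image, something like the following could be run on site-b. This is only a sketch: it assumes the JSON output of `rbd snap ls --all --format json` is a list of snapshot objects, each with a "namespace" object that has a "type" key (for example "trash" or "mirror"). That layout is an assumption and should be checked against the installed build. The image spec in the usage note is hypothetical.

```python
import json
import subprocess
from collections import Counter


def list_snapshots(image_spec):
    """Return the parsed output of `rbd snap ls --all --format json` for one image."""
    out = subprocess.run(
        ["rbd", "snap", "ls", "--all", "--format", "json", image_spec],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def count_by_namespace(snaps):
    """Count snapshots per namespace type.

    ASSUMPTION: each entry carries a "namespace" object with a "type" key;
    entries without one are counted as "user".
    """
    counts = Counter()
    for snap in snaps:
        ns = snap.get("namespace") or {}
        counts[ns.get("type", "user")] += 1
    return counts
```

For example, `count_by_namespace(list_snapshots("mirror_pool/image1"))["trash"]` would give the number of trash-namespace snapshots for that (hypothetical) image.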

Version-Release number of selected component (if applicable):
16.2.8-65.el8cp - 16.2.8-71.el8cp

How reproducible:
Tried once

Steps to Reproduce:
Mentioned in description
(Steps did not involve failover-failback scenario)
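
The exact commands used were not recorded in this report. As a rough sketch of the setup in step 1, the helper below builds the `rbd` command lines for creating images, enabling snapshot-based mirroring on each, and adding a 2-minute mirror snapshot schedule. The pool name, image names, and image size are hypothetical, and the sketch assumes the pool already has mirroring enabled in image mode with the peer bootstrapped. Whether the schedule was set per image or per pool is also an assumption.

```python
def setup_commands(pool, image_names, interval="2m", size="1G"):
    """Build rbd argv lists for image creation, mirroring, and scheduling.

    ASSUMPTION: pool, image names, size, and a per-image schedule are
    illustrative; they are not taken from the original reproduction.
    """
    cmds = []
    for name in image_names:
        spec = f"{pool}/{name}"
        # Create the image.
        cmds.append(["rbd", "create", "--size", size, spec])
        # Enable snapshot-based mirroring for this image.
        cmds.append(["rbd", "mirror", "image", "enable", spec, "snapshot"])
        # Schedule a mirror snapshot at the given interval.
        cmds.append(["rbd", "mirror", "snapshot", "schedule", "add",
                     "--pool", pool, "--image", name, interval])
    return cmds
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on the primary site, for example for 26 images named `img0` through `img25`.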

Actual results:
Many snaps stuck in trash namespace.

Expected results:
All such snaps should be deleted once copying completes and a new snap has been created.

Additional info: