Bug 2237304

Summary: [rbd-mirror]: primary 'demoted' snapshots piling up after consecutive planned failovers (relocation) [7.0]
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Ilya Dryomov <idryomov>
Component: RBD-Mirror
Assignee: Ilya Dryomov <idryomov>
Status: CLOSED ERRATA
QA Contact: Sunil Angadi <sangadi>
Severity: urgent
Docs Contact: Rivka Pollack <rpollack>
Priority: unspecified
Version: 6.1
CC: akraj, ceph-eng-bugs, cephqe-warriors, idryomov, sangadi, tserlin, vashastr, vereddy
Target Milestone: ---
Target Release: 7.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-18.2.0-10.el9cp
Doc Type: Bug Fix
Doc Text:
.Demoted mirror snapshot is removed following the promotion of the image
Previously, due to an implementation defect, the demoted mirror snapshots would not be removed following the promotion of the image, whether on the secondary image or on the primary image. As a result, demoted mirror snapshots would pile up and consume storage space. With this fix, the appropriate demoted mirror snapshot is removed following the promotion of the image.
Story Points: ---
Clone Of: 2214278
Environment:
Last Closed: 2023-12-13 15:22:45 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2237662    

Description Ilya Dryomov 2023-09-04 16:24:50 UTC
+++ This bug was initially created as a clone of Bug #2214278 +++

Description of problem:
Snapshots with "state": "demoted" are piling up (i.e., not being removed) after consecutive planned failover (relocation) operations.
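
For reference, the mirror snapshot state quoted above can be inspected with the stock rbd CLI; the pool/image names below (mirror_pool/image1) are placeholders, not the ones used in this run:

# Per-image mirroring status
rbd mirror image status mirror_pool/image1

# List all snapshots of the image, including mirror snapshots with their
# state (primary / non-primary / demoted) and peer UUIDs
rbd snap ls --all mirror_pool/image1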

Version-Release number of selected component (if applicable):
17.2.6-70.el9cp

How reproducible:
Tried three times in a cluster; observed the issue all three times.

Steps to Reproduce:
1. Configure snapshot-based two-way mirroring between two clusters
2. Perform planned failover-failback more than twice in a row and observe the mirror snapshots of the images (see the sketch below)
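
A rough sketch of a single planned failover-failback (relocation) iteration, assuming snapshot-based mirroring is already enabled for the image and using placeholder names (pool/image mirror_pool/image1, sites site-a/site-b):

# On site-a (current primary): demote the image
rbd mirror image demote mirror_pool/image1

# Wait for the demotion to be synced to the peer before promoting,
# e.g. by polling the image status
rbd mirror image status mirror_pool/image1

# On site-b (current secondary): promote the image
rbd mirror image promote mirror_pool/image1

# Failback is the same sequence in the opposite direction:
# demote on site-b, wait for the sync, then promote on site-a.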

Actual results:
1374066  .mirror.primary.62ae3f4c-653e-4c07-b413-f1ea026324f4.84fda36e-b16a-40b6-a779-402ac39484e4      100 GiB             Mon Jun 12 03:18:00 2023  mirror (demoted peer_uuids:[])
1379742  .mirror.primary.62ae3f4c-653e-4c07-b413-f1ea026324f4.06fc3331-79e2-4dcf-93e8-625f4a66671c      100 GiB             Mon Jun 12 06:34:55 2023  mirror (demoted peer_uuids:[])
1385799  .mirror.primary.62ae3f4c-653e-4c07-b413-f1ea026324f4.998234c6-63bd-459f-b8dc-cf344ca4c157      100 GiB             Mon Jun 12 09:28:43 2023  mirror (demoted peer_uuids:[])
1388488  .mirror.non_primary.62ae3f4c-653e-4c07-b413-f1ea026324f4.f85b00b8-068e-4751-804b-28615e2106ab  100 GiB             Mon Jun 12 10:38:25 2023  mirror (demoted peer_uuids:[] 68226a72-f0e3-451d-9048-cd35e7d0d0c6:1350197 copied)
1388588  .mirror.primary.62ae3f4c-653e-4c07-b413-f1ea026324f4.309c2860-76a6-4277-a8e3-4792c87069f6      100 GiB             Mon Jun 12 10:39:10 2023  mirror (primary peer_uuids:[])
1388612  .mirror.primary.62ae3f4c-653e-4c07-b413-f1ea026324f4.390655c4-c139-461a-9224-7897902802cd      100 GiB             Mon Jun 12 10:40:01 2023  mirror (primary peer_uuids:[68a6c766-37ed-4d54-97c3-67556467bb89])
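
The listing above is rbd snap ls --all output. A rough way to count leftover demoted mirror snapshots per image is something like the following (jq availability and the exact JSON field names, namespace.state in particular, are assumptions and may differ between releases):

# Count mirror snapshots whose mirror namespace state is "demoted"
rbd snap ls --all --format json mirror_pool/image1 | jq '[.[] | select(.namespace.state == "demoted")] | length'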

Expected results:
Only the snapshots required for the mirroring functionality are preserved; demoted mirror snapshots left over from earlier failovers are removed once the image is promoted again.

--- Additional comment from Vasishta on 2023-06-12 13:10:58 UTC ---

Additional info:
Sequence of operations - http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/event_time

Snap ls
1) after initial failover-failback 
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/snap_ls_primary_cali020.log
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/snap_ls_secondary_pluto009.log

2) after second failover-failback
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/snap_ls_second_failover_cali020.log
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/snap_ls_second_failover_pluto009.log

3) after third failover-failback
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/after_third_failback_primary.log
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/after_third_failback_secondary.log

Daemon logs
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/initial_primary_logs/ (cali020)
http://magna002.ceph.redhat.com/ceph-qe-logs/vasishta/2023/extra_snap/initial_secondary_logs/ (pluto009)

--- Additional comment from Ilya Dryomov on 2023-09-04 15:54:22 UTC ---

Pushed to ceph-6.1-rhel-patches.

--- Additional comment from Ilya Dryomov on 2023-09-04 16:22:20 UTC ---

Hi Sunil,

Could you please provide qa_ack?

Comment 8 errata-xmlrpc 2023-12-13 15:22:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780