Bug 2210790

Summary: [rbd-mirror] : reusage of thick-provisioned image did not mirror properly despite copied snap id matches : snapshot-based mirroring doesn't propagate discards
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: RBD-Mirror
Assignee: Ilya Dryomov <idryomov>
Status: NEW
QA Contact: Sunil Angadi <sangadi>
Severity: high
Priority: unspecified
Version: 6.1
CC: ceph-eng-bugs, cephqe-warriors, idryomov, jdurgin, nibalach, sangadi, tserlin
Target Milestone: ---
Target Release: 9.0
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Vasishta 2023-05-29 13:52:33 UTC
Description of problem:

We have two clusters with two-way RBD mirroring configured between the mirror_pool pool on both clusters (mirror pool info and status from both sites: http://pastebin.test.redhat.com/1101267).

I created a **thick-provisioned** 10G image - mirror_pool/cr4m_p9_b2_1 - on the site - 1d1a43ac-f3f5-11ed-9e26-b49691cee2a0
image info on both sites - http://pastebin.test.redhat.com/1101269

Snapshot-based mirroring was enabled on the image, and a mirror snapshot schedule of 3 minutes was added at the image level.
mirror image status at both clusters - http://pastebin.test.redhat.com/1101271

I mapped the image, created an XFS filesystem on it, and wrote a 100 MB file.
Although the primary snapshot ID matched the copied snapshot ID on the secondary,
rbd du output differed between the sites.
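The setup and write steps above can be sketched roughly as follows; the schedule syntax, device variable, mount point, and dd parameters are assumptions for illustration, not taken from the report:

```shell
# Sketch of the reproduction steps; run on the primary site.
rbd create mirror_pool/cr4m_p9_b2_1 --size 10G --thick-provision
rbd mirror image enable mirror_pool/cr4m_p9_b2_1 snapshot
rbd mirror snapshot schedule add --pool mirror_pool --image cr4m_p9_b2_1 3m
dev=$(rbd map mirror_pool/cr4m_p9_b2_1)   # prints the mapped /dev/rbdX path
mkfs.xfs "$dev"
mount "$dev" /mnt
dd if=/dev/urandom of=/mnt/file1 bs=1M count=100
```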
Primary
#  rbd du mirror_pool/cr4m_p9_b2_1  --debug-rbd 0
NAME          PROVISIONED  USED
cr4m_p9_b2_1       10 GiB  168 MiB
Secondary
~]#  rbd du mirror_pool/cr4m_p9_b2_1  --debug-rbd 0
NAME          PROVISIONED  USED
cr4m_p9_b2_1       10 GiB  10 GiB
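For context, the USED column reflects allocated extents. A minimal sketch on a local filesystem (not Ceph, file names invented) of the same provisioned-vs-used distinction between a thin and a fully written "thick" file:

```shell
# Not Ceph: illustrate provisioned vs. used space with sparse files.
truncate -s 10M thin.img                                 # provision 10 MiB, allocate nothing
dd if=/dev/zero of=thick.img bs=1M count=10 status=none  # actually write every block
du -k --apparent-size thin.img thick.img                 # "provisioned": 10240 KiB each
du -k thin.img thick.img                                 # "used": ~0 for thin, ~10240 for thick
```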

Tried exporting the image on both clusters and observed that the primary exported a 168 MB file, while the secondary was exporting a 10 GiB file.
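The export comparison can be sketched as below (output path assumed); rbd export writes zero extents sparsely, so the file's allocated size on disk differs between sites even though the apparent size is the full 10 GiB:

```shell
# Run on each site, then compare the resulting file's allocated size.
rbd export mirror_pool/cr4m_p9_b2_1 /tmp/cr4m_p9_b2_1.img
ls -ls /tmp/cr4m_p9_b2_1.img   # first column: allocated 1K blocks; size column: provisioned
```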

Version-Release number of selected component (if applicable):
 "rbd-mirror": {
        "ceph version 17.2.6-69.el9cp (d62b1a5d46b7355ca8b5056f78b7ebe3581e0d53) quincy (stable)": 1
    },

How reproducible:
Observed once

Steps to Reproduce:
(Clearly mentioned in description)

Actual results:
Discards (deletions of data) on the primary image do not appear to be propagated to the secondary, leaving the secondary fully allocated.

Expected results:
Mirroring should ensure that the primary and secondary images are identical, including deallocated (discarded) extents.

Additional info: