Bug 1906246

Summary: health changes to warning when an rbd is removed from the cluster
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Paul Cuzner <pcuzner>
Component: RBD-Mirror Assignee: Ilya Dryomov <idryomov>
Status: NEW --- QA Contact: Preethi <pnataraj>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.2 CC: ceph-eng-bugs, ekuric, pnataraj
Target Milestone: ---   
Target Release: 7.0   
Hardware: All   
OS: Linux   
Last Closed: Type: Bug

Description Paul Cuzner 2020-12-10 02:01:51 UTC
Description of problem:
When an rbd image is removed from a pool where snapshot-based rbd-mirror is configured, there is a period during which both the primary and the secondary clusters report a health status of warning.

In itself this is a blip that soon rectifies itself, but it presents a problem for:
1. bulk rbd removal
2. the Kubernetes use case, where removing a PV that was being replicated could raise an alert
3. any alerting based on replication health, which would trigger unexpectedly
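
To illustrate point 3, here is a minimal sketch of how a monitoring script might avoid firing on a short-lived warning by requiring several consecutive non-OK samples. This is only an illustration of the alerting problem, not a fix for the bug. The function name `should_alert`, the `threshold` parameter, and the health strings "OK" and "WARNING" are assumptions made for this example.

```python
def should_alert(samples, threshold=3):
    """Return True only when the last `threshold` samples are all non-OK.

    `samples` is a list of health strings, oldest first. The values
    "OK" and "WARNING" are assumed here for illustration.
    """
    if len(samples) < threshold:
        # Not enough history yet to tell a blip from a real problem.
        return False
    return all(s != "OK" for s in samples[-threshold:])


if __name__ == "__main__":
    # A single warning surrounded by OK samples is treated as a blip.
    print(should_alert(["OK", "WARNING", "OK", "OK"]))
    # A warning that persists across the whole window raises an alert.
    print(should_alert(["OK", "WARNING", "WARNING", "WARNING"]))
```

The drawback of this approach is that a genuine replication failure is reported later, which is why the report asks for the health status itself not to change on removal.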

Version-Release number of selected component (if applicable):
rhcs5
ceph-16 (pacific)

How reproducible:
100%


Steps to Reproduce:
1. Establish rbd mirror between two clusters
2. Create 50 rbd images and enable them for replication
3. Remove 10 images, and observe the state of the relationship with rbd mirror pool status on both sides
(rbd -p <pool> mirror pool status --format json)
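
The observation in step 3 can be scripted. The sketch below runs the same command and pulls out the two fields named in this report, health and image_health. It assumes these fields sit under a top-level "summary" object in the JSON output and falls back to the top level if that key is absent. The function names are hypothetical and the exact JSON layout may differ between releases.

```python
import json
import subprocess


def extract_health(status):
    """Pick health and image_health out of a parsed status document.

    Assumption: the fields live under a top-level "summary" object;
    if that key is missing, look at the top level instead.
    """
    summary = status.get("summary", status)
    return {key: summary.get(key) for key in ("health", "image_health")}


def read_pool_health(pool):
    """Run the status command from step 3 and return its health fields."""
    result = subprocess.run(
        ["rbd", "-p", pool, "mirror", "pool", "status", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return extract_health(json.loads(result.stdout))
```

Calling `read_pool_health("<pool>")` in a loop on both clusters while the images are being removed should show when the reported state changes and when it returns to normal.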

Actual results:
health (and image_health) goes into a warning state

Expected results:
The rbd rm command should not trigger a health status change. The removal has been requested, so the overall state should remain healthy.

Alternatively, block the rbd rm command if the image is in a replicated state.

Additional info:
This issue has already been discussed with JasonD.