.rbd-mirror daemon no longer enters the `UNKNOWN` state when a user group snapshot name is reused
Previously, when a group snapshot was deleted on the primary cluster and another was created with the same name before the deletion of the original snapshot had been mirrored to the secondary cluster, the rbd-mirror daemon would enter an `UNKNOWN` state.
With this fix, group snapshot IDs are compared instead of the names. As a result, a remote group snapshot is preserved only if its ID matches, not merely its name. Sync issues caused by reused group snapshot names are now resolved, and rbd-mirror daemons no longer enter the `UNKNOWN` state due to stale remote group snapshots.
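The effect of the fix can be illustrated with a small sketch (hypothetical Python, not the actual rbd-mirror C++ code; the IDs are taken from the listings below). Matching by name wrongly preserves the stale remote snapshot after the name is reused on the primary, while matching by ID correctly flags it as stale:

```python
# (id, name) pairs for group snapshots on each cluster.
local_snaps = [("407a981c5c65", "snap_1")]   # snap_1 recreated on primary, new ID
remote_snaps = [("4056f9d2de72", "snap_1")]  # stale copy of the original on secondary

def preserved_by_name(local, remote):
    # Old behavior (sketch): keep remote snapshots whose name still exists locally.
    names = {name for _, name in local}
    return [snap for snap in remote if snap[1] in names]

def preserved_by_id(local, remote):
    # Fixed behavior (sketch): keep remote snapshots whose ID still exists locally.
    ids = {snap_id for snap_id, _ in local}
    return [snap for snap in remote if snap[0] in ids]

# Name matching keeps the stale snapshot; ID matching marks it for cleanup.
print(preserved_by_name(local_snaps, remote_snaps))  # [('4056f9d2de72', 'snap_1')]
print(preserved_by_id(local_snaps, remote_snaps))    # []
```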
Description of problem:
This ad hoc scenario exercises snapshot sequencing with both user-created group snapshots and system-created mirror group snapshots.
Test Steps:
1. Create an RBD image
2. Add it to the group
3. Create group snapshot 'snap_1'
4. Enable mirroring on the group
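Steps 1-4 can be sketched as follows (the pool, group, and image names are assumptions based on the listings below, and the exact `rbd mirror group enable` syntax may differ by build):

```shell
# Setup sketch for steps 1-4; assumes pool_1 already exists with the
# rbd application enabled and snapshot-based mirroring configured.
rbd create pool_1/image_1 --size 1G                # 1. create an RBD image
rbd group create pool_1/group_1                    # 2. create the group...
rbd group image add pool_1/group_1 pool_1/image_1  #    ...and add the image to it
rbd group snap create pool_1/group_1@snap_1        # 3. create group snapshot snap_1
rbd mirror group enable pool_1/group_1 snapshot    # 4. enable mirroring on the group
```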
The group now has two snapshots (one user-created group snapshot and one system-created mirror group snapshot):
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd group snap list --pool pool_1 --group group_1 --debug_rbd 0
ID NAME STATE NAMESPACE
4056f9d2de72 snap_1 complete user
4062a3fd2d59 .mirror.primary.59744227-253d-4352-a195-042c8b544655.4062a3fd2d59 complete mirror (primary peer_uuids:[fd3be1a4-06a4-4c0d-9f24-59f93dff3fdb])
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]#
5. Wait for both snapshots to reach the complete state on site-b
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# rbd group snap list --pool pool_1 --group group_1 --debug_rbd 0
ID NAME STATE NAMESPACE
4056f9d2de72 snap_1 complete user
4062a3fd2d59 .mirror.non-primary.59744227-253d-4352-a195-042c8b544655.4062a3fd2d59 complete mirror (non-primary peer_uuids:[] 51835de5-c34b-4658-874d-8eee505d2681:4062a3fd2d59)
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
6. Delete user group snapshot snap_1 from site-a
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd group snap remove --pool pool_1 --group group_1 --snap snap_1 --debug_rbd 0
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd group snap list --pool pool_1 --group group_1 --debug_rbd 0
ID NAME STATE NAMESPACE
4062a3fd2d59 .mirror.primary.59744227-253d-4352-a195-042c8b544655.4062a3fd2d59 complete mirror (primary peer_uuids:[fd3be1a4-06a4-4c0d-9f24-59f93dff3fdb])
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]#
7. Create user group snapshot snap_1 again on site-a
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd group snap create --pool pool_1 --group group_1 --snap snap_1 --debug_rbd 0
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd group snap list --pool pool_1 --group group_1 --debug_rbd 0
ID NAME STATE NAMESPACE
4062a3fd2d59 .mirror.primary.59744227-253d-4352-a195-042c8b544655.4062a3fd2d59 complete mirror (primary peer_uuids:[fd3be1a4-06a4-4c0d-9f24-59f93dff3fdb])
407a981c5c65 snap_1 complete user
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]#
8. Create another mirror group snapshot
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd mirror group snapshot --pool pool_1 --group group_1 --debug_rbd 0
Snapshot ID: 408c15feb27f
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# rbd group snap list --pool pool_1 --group group_1 --debug_rbd 0
ID NAME STATE NAMESPACE
4062a3fd2d59 .mirror.primary.59744227-253d-4352-a195-042c8b544655.4062a3fd2d59 complete mirror (primary peer_uuids:[fd3be1a4-06a4-4c0d-9f24-59f93dff3fdb])
407a981c5c65 snap_1 complete user
408c15feb27f .mirror.primary.59744227-253d-4352-a195-042c8b544655.408c15feb27f complete mirror (primary peer_uuids:[fd3be1a4-06a4-4c0d-9f24-59f93dff3fdb])
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]#
9. The mirror group snapshot created in step 8 is not propagated to site-b, and the group status on site-b toggles between "up+starting_replay" and "down+starting_replay". After some time, the rbd-mirror daemon goes down into the UNKNOWN state:
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# rbd group snap list --pool pool_1 --group group_1 --debug_rbd 0
ID NAME STATE NAMESPACE
4056f9d2de72 snap_1 complete user
4062a3fd2d59 .mirror.non-primary.59744227-253d-4352-a195-042c8b544655.4062a3fd2d59 complete mirror (non-primary peer_uuids:[] 51835de5-c34b-4658-874d-8eee505d2681:4062a3fd2d59)
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# rbd mirror group status --pool pool_1 --group group_1 --debug_rbd 0
group_1:
global_id: 59744227-253d-4352-a195-042c8b544655
state: up+starting_replay
description: starting replay
service: ceph-rbd2-aarti-1-z8kn5p-node5.mfqqkl on ceph-rbd2-aarti-1-z8kn5p-node5
last_update: 2025-05-26 17:20:21
images:
image: 6/80ac99a8-3fd9-4990-8065-32576e15ac5a
state: up+replaying
description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"last_snapshot_bytes":0,"last_snapshot_sync_seconds":0,"local_snapshot_timestamp":1748279761,"remote_snapshot_timestamp":1748279916,"replay_state":"idle"}
peer_sites:
name: site-a
state: up+stopped
description: local group is primary
last_update: 2025-05-26 17:20:35
images:
image: 6/80ac99a8-3fd9-4990-8065-32576e15ac5a
state: up+stopped
description: local image is primary
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# rbd mirror group status --pool pool_1 --group group_1 --debug_rbd 0
group_1:
global_id: 59744227-253d-4352-a195-042c8b544655
state: down+starting_replay
description: starting replay
last_update: 2025-05-26 17:20:21
images:
image: 6/80ac99a8-3fd9-4990-8065-32576e15ac5a
state: down+replaying
description: replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"last_snapshot_bytes":0,"last_snapshot_sync_seconds":0,"local_snapshot_timestamp":1748279761,"remote_snapshot_timestamp":1748279916,"replay_state":"idle"}
peer_sites:
name: site-a
state: up+stopped
description: local group is primary
last_update: 2025-05-26 17:20:35
images:
image: 6/80ac99a8-3fd9-4990-8065-32576e15ac5a
state: up+stopped
description: local image is primary
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# rbd mirror pool status pool_1
2025-05-26T17:19:34.962+0000 7f7c179653c0 20 librbd::api::mirror: mode_get:
2025-05-26T17:19:34.967+0000 7f7c179653c0 20 librbd::api::mirror: group_status_summary:
health: WARNING
daemon health: OK
image health: OK
group health: WARNING
images: 1 total
1 replaying
groups: 1 total
1 starting_replay
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
mgr.ceph-rbd2-aarti-1-z8kn5p-node1-installer.vilieq ceph-rbd2-aarti-1-z8kn5p-node1-installer *:9283,8765 running (3h) 3m ago 3h 526M - 19.2.1-210.el9cp 56436d51c94e 66010229e45d
mgr.ceph-rbd2-aarti-1-z8kn5p-node3.fkxxkx ceph-rbd2-aarti-1-z8kn5p-node3 *:8443,8765 running (3h) 3m ago 3h 451M - 19.2.1-210.el9cp 56436d51c94e 427c83a169cd
mon.ceph-rbd2-aarti-1-z8kn5p-node1-installer ceph-rbd2-aarti-1-z8kn5p-node1-installer running (3h) 3m ago 3h 129M 2048M 19.2.1-210.el9cp 56436d51c94e 28c1ea5e0168
mon.ceph-rbd2-aarti-1-z8kn5p-node3 ceph-rbd2-aarti-1-z8kn5p-node3 running (3h) 3m ago 3h 123M 2048M 19.2.1-210.el9cp 56436d51c94e 73ae26336379
mon.ceph-rbd2-aarti-1-z8kn5p-node4 ceph-rbd2-aarti-1-z8kn5p-node4 running (3h) 3m ago 3h 122M 2048M 19.2.1-210.el9cp 56436d51c94e 8cb6c6f761c4
osd.0 ceph-rbd2-aarti-1-z8kn5p-node3 running (3h) 3m ago 3h 901M 4096M 19.2.1-210.el9cp 56436d51c94e d031a0724f27
osd.1 ceph-rbd2-aarti-1-z8kn5p-node4 running (3h) 3m ago 3h 692M 1088M 19.2.1-210.el9cp 56436d51c94e df2827a8729d
osd.2 ceph-rbd2-aarti-1-z8kn5p-node5 running (3h) 2m ago 3h 400M 1088M 19.2.1-210.el9cp 56436d51c94e b48fece77aca
osd.3 ceph-rbd2-aarti-1-z8kn5p-node3 running (3h) 3m ago 3h 603M 4096M 19.2.1-210.el9cp 56436d51c94e a6a1a59520cd
osd.4 ceph-rbd2-aarti-1-z8kn5p-node4 running (3h) 3m ago 3h 547M 1088M 19.2.1-210.el9cp 56436d51c94e 687350318d43
osd.5 ceph-rbd2-aarti-1-z8kn5p-node5 running (3h) 2m ago 3h 715M 1088M 19.2.1-210.el9cp 56436d51c94e 9c8d23226e0a
osd.6 ceph-rbd2-aarti-1-z8kn5p-node3 running (3h) 3m ago 3h 290M 4096M 19.2.1-210.el9cp 56436d51c94e 7207903165e5
osd.7 ceph-rbd2-aarti-1-z8kn5p-node4 running (3h) 3m ago 3h 536M 1088M 19.2.1-210.el9cp 56436d51c94e 1c575003c0c0
osd.8 ceph-rbd2-aarti-1-z8kn5p-node5 running (3h) 2m ago 3h 329M 1088M 19.2.1-210.el9cp 56436d51c94e e173691d4dd0
osd.9 ceph-rbd2-aarti-1-z8kn5p-node3 running (3h) 3m ago 3h 607M 4096M 19.2.1-210.el9cp 56436d51c94e 8d29da8e785c
osd.10 ceph-rbd2-aarti-1-z8kn5p-node4 running (3h) 3m ago 3h 764M 1088M 19.2.1-210.el9cp 56436d51c94e 51839ad897ce
osd.11 ceph-rbd2-aarti-1-z8kn5p-node5 running (3h) 2m ago 3h 633M 1088M 19.2.1-210.el9cp 56436d51c94e 38f1e0758edd
rbd-mirror.ceph-rbd2-aarti-1-z8kn5p-node5.mfqqkl ceph-rbd2-aarti-1-z8kn5p-node5 error 2m ago 3h - - <unknown> <unknown> <unknown>
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
Version-Release number of selected component (if applicable):
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]# ceph version
ceph version 19.2.1-210.el9cp (d7ac9a7e698531c972a3567a67da1f0a9a266075) squid (stable)
[ceph: root@ceph-rbd1-aarti-1-z8kn5p-node1-installer /]#
Note: This issue was seen in 8.1, but the BZ is being raised against 9.0 as we are near RC. Kindly assess and move the target release if required and applicable.
How reproducible: Always
Steps to Reproduce: As above
Actual results: The rbd-mirror daemon goes into an error state.
Expected results: The rbd-mirror daemon should not error out.
Additional info:
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
daemon rbd-mirror.ceph-rbd2-aarti-1-z8kn5p-node5.mfqqkl on ceph-rbd2-aarti-1-z8kn5p-node5 is in error state
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]# ceph orch ls
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
mgr 2/2 9m ago 4h label:mgr
mon 3/3 9m ago 4h label:mon
osd.all-available-devices 12 9m ago 4h *
rbd-mirror 0/1 8m ago 4h ceph-rbd2-aarti-1-z8kn5p-node5
[root@ceph-rbd2-aarti-1-z8kn5p-node1-installer ~]#
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Red Hat Ceph Storage 8.1 security and bug fix updates), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2025:14015