Bug 2361747

Summary: [Scale] Mirror group operation failed with : librbd::mirror::GroupGetInfoRequest: 0x5560070f3690 handle_get_last_mirror_snapshot_state: failed to list group snapshots of group 'group_29': (22) Invalid argument
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: aarsharm
Component: RBD-Mirror
Assignee: N Balachandran <nibalach>
Status: CLOSED ERRATA
QA Contact: aarsharm
Severity: high
Docs Contact:
Priority: unspecified
Version: 8.1
CC: ceph-eng-bugs, cephqe-warriors, idryomov, nibalach, tserlin
Target Milestone: ---   
Target Release: 8.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-19.2.1-166.el9cp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2025-06-26 12:30:56 UTC
Type: Bug

Description aarsharm 2025-04-22 19:05:17 UTC
Description of problem:
In a scale setup with 100 groups, each containing 1 image (image size 22G),
only one group out of 100, group_29, failed with the error below.

site-a:

[ceph: root@tala001 /]# rbd mirror group status --pool pool_2 --group group_29  --debug_rbd 0
rbd: failed to get mirror info for group: 2025-04-22T18:56:42.890+0000 7f39dd57b640 -1 librbd::mirror::GroupGetInfoRequest: 0x5619341d7680 handle_get_last_mirror_snapshot_state: failed to list group snapshots of group 'group_29': (22) Invalid argument
(22) Invalid argument
[ceph: root@tala001 /]#

[ceph: root@tala001 /]# rbd group snap list --pool pool_2 --group group_29  --debug_rbd 0
ID             NAME                                                                STATE     NAMESPACE
16e814b317b28  .mirror.primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.16e814b317b28  complete  mirror (primary peer_uuids:[])
18db94735015b  .mirror.primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.18db94735015b  complete  mirror (primary peer_uuids:[e2d30a1f-3c7d-45bc-b91e-ca81cc428230])
1b8dfe216f73d  .mirror.primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.1b8dfe216f73d  complete  mirror (demoted peer_uuids:[e2d30a1f-3c7d-45bc-b91e-ca81cc428230])
[ceph: root@tala001 /]#

site-b:
[ceph: root@tala005 /]# rbd mirror group status --pool pool_2 --group group_29  --debug_rbd 0
group_29:
  global_id:   6f803dc2-25d4-43de-985e-f1f3ea32e8bc
  state:       up+error
  description: bootstrap failed
  service:     tala005.jwqeia on tala005
  last_update: 2025-04-22 18:41:42
  images:
  peer_sites:
    name: site-a
    state: up+error
    description: bootstrap failed
    last_update: 2025-04-22 18:42:01
    images:
  snapshots:
    .mirror.non-primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.1b8dfe216f73d
    .mirror.primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.1330b1824f430
[ceph: root@tala005 /]# rbd group snap list --pool pool_2 --group group_29  --debug_rbd 0
ID             NAME                                                                    STATE     NAMESPACE                                                         
1b8dfe216f73d  .mirror.non-primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.1b8dfe216f73d  complete  mirror (demoted peer_uuids:[a3fa4ccf-91c5-4cf2-b717-470c30908b23] 359d7808-3dbb-4e59-8b66-ad1cfe162441:1b8dfe216f73d)
1330b1824f430  .mirror.primary.6f803dc2-25d4-43de-985e-f1f3ea32e8bc.1330b1824f430      complete  mirror (primary peer_uuids:[a3fa4ccf-91c5-4cf2-b717-470c30908b23])
[ceph: root@tala005 /]#
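
The per-group status check above can be swept across every group to find which ones fail. The following is a minimal sketch, not part of the original report: the function name list_failing_groups and the RBD override variable are hypothetical, the group_1..group_N naming is taken from this setup, and it assumes that rbd exits non-zero when it prints "failed to get mirror info for group".

```shell
# Hypothetical helper (not from the report): print every group whose
# "rbd mirror group status" call fails.
# RBD defaults to the real CLI; it can be overridden for testing.
RBD="${RBD:-rbd}"

list_failing_groups() {
    pool="$1"    # pool name, e.g. pool_2
    count="$2"   # number of groups, named group_1..group_<count>
    i=1
    while [ "$i" -le "$count" ]; do
        # Assumption: a non-zero exit status means the status query failed.
        if ! $RBD mirror group status --pool "$pool" --group "group_$i" \
                --debug_rbd 0 >/dev/null 2>&1; then
            echo "group_$i"
        fi
        i=$((i + 1))
    done
}
```

For example, list_failing_groups pool_2 100 would be expected to print only group_29 on site-a in the state described here.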


Version-Release number of selected component (if applicable):
[ceph: root@tala005 /]# ceph version
ceph version 19.2.1-151.el9cp (642684c3e6e5556e1132f363e49d5e26b605e8ef) squid (stable)
[ceph: root@tala005 /]#


Steps to Reproduce:
1. Demote all 100 groups on site-a.
2. Promote all 100 groups on site-b:
[ceph: root@tala005 /]# rbd mirror group promote --pool pool_2 --group group_1 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_2 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_3 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_4 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_5 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_6 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_7 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_8 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_9 --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_10  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_11  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_12  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_13  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_14  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_15  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_16  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_17  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_18  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_19  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_20  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_21  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_22  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_23  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_24  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_25  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_26  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_27  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_28  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_29  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_30  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_31  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_32  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_33  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_34  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_35  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_36  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_37  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_38  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_39  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_40  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_41  --debug_rbd 0
rbd mirror group promote --pool pool_2 --group group_100 --debug_rbd 0
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
rbd: error promoting group to primary
2025-04-22T18:08:07.262+0000 7f696547b380 -1 librbd::api::Mirror: group_promote: group group_5 is still primary within a remote cluster
rbd: error promoting group to primary
2025-04-22T18:08:07.300+0000 7fa0b411b380 -1 librbd::api::Mirror: group_promote: group group_6 is still primary within a remote cluster
rbd: error promoting group to primary
2025-04-22T18:08:07.338+0000 7fb601dfd380 -1 librbd::api::Mirror: group_promote: group group_7 is still primary within a remote cluster
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
rbd: error promoting group to primary
2025-04-22T18:08:50.183+0000 7f17de8e1380 -1 librbd::api::Mirror: group_promote: group group_32 is still primary within a remote cluster
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
rbd: error promoting group to primary
2025-04-22T18:09:26.233+0000 7f6e55252380 -1 librbd::api::Mirror: group_promote: group group_62 is still primary within a remote cluster
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
Group promoted to primary
[ceph: root@tala005 /]#

3. Force promote all groups on site-a.
4. Demote all 100 groups on site-a:
[ceph: root@tala001 /]# rbd mirror group demote --pool pool_2 --group group_1 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_2 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_3 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_4 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_5 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_6 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_7 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_8 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_9 --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_10  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_11  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_12  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_13  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_14  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_15  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_16  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_17  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_18  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_19  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_20  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_21  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_22  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_23  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_24  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_25  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_26  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_27  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_28  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_29  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_30  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_31  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_32  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_33  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_34  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_35  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_36  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_37  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_38  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_39  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_40  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_41  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_42  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_43  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_44  --debug_rbd 0
rbd mirror group demote --pool pool_2 --group group_100 --debug_rbd 0
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
2025-04-22T18:35:02.443+0000 7f249effd640 -1 librbd::mirror::snapshot::GroupUnlinkPeerRequest: 0x55fb88cbb5a0 handle_remove_peer_uuid: failed to remove group snapshot mirror peer: (17) File exists
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
2025-04-22T18:35:17.447+0000 7ff0ef7fe640 -1 librbd::mirror::GroupGetInfoRequest: 0x55f9f1258090 handle_get_last_mirror_snapshot_state: failed to list group snapshots of group 'group_29': (22) Invalid argument
rbd: failed to get mirror info for group: (22) Invalid argument
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
2025-04-22T18:35:32.395+0000 7fe426ffd640 -1 librbd::mirror::snapshot::GroupUnlinkPeerRequest: 0x55c89fd8ffc0 handle_remove_peer_uuid: failed to remove group snapshot mirror peer: (17) File exists
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
2025-04-22T18:36:02.419+0000 7f410affd640 -1 librbd::mirror::snapshot::GroupUnlinkPeerRequest: 0x55b774c685a0 handle_remove_peer_uuid: failed to remove group snapshot mirror peer: (17) File exists
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
Group demoted to non-primary
2025-04-22T18:36:32.427+0000 7f9899ffb640 -1 librbd::mirror::snapshot::GroupUnlinkPeerRequest: 0x55c6ea1f3390 handle_remove_peer_uuid: failed to remove group snapshot mirror peer: (17) File exists
Group demoted to non-primary
[ceph: root@tala001 /]#
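
The bulk promote and demote steps above can be driven by a loop instead of pasted commands. This is a rough sketch under stated assumptions, not the reporter's actual procedure: the function name mirror_group_all and the RBD override variable are hypothetical, groups are assumed to be named group_1..group_N, and a non-zero exit status from rbd is assumed to indicate a failed operation.

```shell
# Hypothetical helper (not from the report): run one mirror group action
# (promote or demote) over group_1..group_<count> and report failures.
# RBD defaults to the real CLI; set RBD=echo for a dry run.
RBD="${RBD:-rbd}"

mirror_group_all() {
    action="$1"  # promote or demote
    pool="$2"    # pool name, e.g. pool_2
    count="$3"   # number of groups
    i=1
    failed=""
    while [ "$i" -le "$count" ]; do
        # Same invocation as in the steps above, one group at a time.
        if ! $RBD mirror group "$action" --pool "$pool" \
                --group "group_$i" --debug_rbd 0; then
            failed="$failed group_$i"
        fi
        i=$((i + 1))
    done
    # The last line of output lists the groups whose command failed.
    echo "failed:$failed"
}
```

Running mirror_group_all promote pool_2 100 on site-b corresponds to step 2, and mirror_group_all demote pool_2 100 on site-a corresponds to steps 1 and 4. Collecting the failed group names makes errors such as "is still primary within a remote cluster" easier to retry per group.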


Actual results: On site-a, rbd mirror group status for group_29 fails with (22) Invalid argument, so the mirror group info is not displayed.


Expected results: group_29 mirror group info should be displayed on site-a


Additional info: NA

Comment 10 errata-xmlrpc 2025-06-26 12:30:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 8.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:9775