Bug 2004013 - [DR] After performing failover mirroringStatus reports image_health: ERROR
Summary: [DR] After performing failover mirroringStatus reports image_health: ERROR
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.9.0
Assignee: Shyamsundar
QA Contact: Pratik Surve
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-09-14 10:23 UTC by Pratik Surve
Modified: 2023-08-09 17:03 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-12-13 17:46:17 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:5086 0 None None None 2021-12-13 17:46:45 UTC

Comment 3 Santosh Pillai 2021-09-14 10:59:44 UTC
Failing to import Bootstrap token:

From Rook Logs: 

2021-09-08T14:57:23.080194855Z 2021-09-08 14:57:23.080053 I | cephclient: add rbd-mirror bootstrap peer token for pool "ocs-storagecluster-cephblockpool"
2021-09-08T15:47:23.145091892Z 2021-09-08 15:47:23.144587 E | ceph-block-pool-controller: failed to reconcile. failed to add ceph rbd mirror peer: failed to import bootstrap peer token: failed to add rbd-mirror peer token for pool "ocs-storagecluster-cephblockpool". . 2021-09-08T14:57:23.128+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T14:57:23.129+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T14:57:23.130+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:02:23.130+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:02:23.130+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:07:23.131+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:07:23.131+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:12:23.132+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:12:23.132+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:17:23.132+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:17:23.132+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:22:23.133+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:22:23.133+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:27:23.133+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:27:23.133+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:32:23.134+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:32:23.134+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:37:23.136+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:37:23.136+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:42:23.135+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:42:23.135+0000 7ff97b7ca2c0 -1 auth: unable to find a keyring on /etc/ceph/..keyring,/etc/ceph/.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2021-09-08T15:47:23.145091892Z 2021-09-08T15:47:23.136+0000 7ff97b7ca2c0 -1 librbd::api::Mirror: peer_bootstrap_import: failed to connect to peer cluster: (110) Connection timed out
2021-09-08T15:47:23.145091892Z rbd: failed to import peer bootstrap token: exit status 110

@Prateek Can you please share the token details inside this secret (ad6a06422d73c536059c82d8ce48b27f585fd60)? I need to check the contents.

Comment 4 Santosh Pillai 2021-09-14 12:25:21 UTC
(In reply to Santosh Pillai from comment #3)

The bootstrap token import failure does not look like the root cause. The clusters are able to connect to each other: I tried bootstrapping the peer manually and it worked.

Comment 5 Sébastien Han 2021-09-14 13:43:26 UTC
(In reply to Santosh Pillai from comment #4)
> Does not look like the root cause. The clusters are able to connect to each
> other. Tried manually bootstrapping the peer and it worked.

Santosh, it's the same error as when no ceph.conf is present, but I thought you had fixed that in your PR.

Comment 6 Santosh Pillai 2021-09-14 14:21:22 UTC
(In reply to Sébastien Han from comment #5)
> Santosh, it's the same error than when no ceph.conf is present but I thought
> you had it fixed in your PR.

This looks like a different error, one that happens when we import the bootstrap token (here: https://github.com/rook/rook/blob/b3567f34a7f0ea1dace8830fbf0b1c393ede813b/pkg/daemon/ceph/client/mirror.go#L81). That step does not use ceph.conf, only the file containing the token details of the peer cluster.

Comment 7 Shyamsundar 2021-09-14 14:44:42 UTC
The downstream ramen code was refreshed post build #125, we are looking to see when the fix [1] made it to the builds to determine if that is the root cause for the problem.

Background:

On checking Pratik's setup the error in VolumeReplication that was noted was:

{"level":"error","timestamp":"2021-09-13T20:08:25.276Z","logger":"controller-runtime.manager.controller.volumereplication","caller":"controller/controller.go:253","msg":"Reconciler error","reconciler group":"replication.storage.openshift.io","reconciler kind":"VolumeReplication","name":"busybox-pvc","namespace":"busybox-sample","error":"rpc error: code = InvalidArgument desc = secondary image status is up=true and state=error"}
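
For triage, the relevant fields can be pulled out of a controller log line like the one above with a few lines of Python. This is only a sketch: it assumes each log line is a single JSON object and that the error text ends with "up=<bool> and state=<word>" as seen here. The function name `secondary_status` and the regular expression are illustrative, not part of any tool.

```python
import json
import re

# Log line copied verbatim from the VolumeReplication controller output above.
LOG_LINE = (
    '{"level":"error","timestamp":"2021-09-13T20:08:25.276Z",'
    '"logger":"controller-runtime.manager.controller.volumereplication",'
    '"caller":"controller/controller.go:253","msg":"Reconciler error",'
    '"reconciler group":"replication.storage.openshift.io",'
    '"reconciler kind":"VolumeReplication","name":"busybox-pvc",'
    '"namespace":"busybox-sample",'
    '"error":"rpc error: code = InvalidArgument desc = '
    'secondary image status is up=true and state=error"}'
)

# Assumed pattern for the "up=<bool> and state=<word>" suffix of the error text.
STATUS_RE = re.compile(r"up=(true|false) and state=(\w+)")


def secondary_status(line):
    """Return the resource name, namespace, and reported image status,
    or None if the line carries no secondary image status."""
    record = json.loads(line)
    match = STATUS_RE.search(record.get("error", ""))
    if match is None:
        return None
    return {
        "name": record.get("name"),
        "namespace": record.get("namespace"),
        "up": match.group(1) == "true",
        "state": match.group(2),
    }


print(secondary_status(LOG_LINE))
```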

The mirror daemon was reporting:

debug 2021-09-13T20:10:31.714+0000 7f530277c700 -1 rbd::mirror::image_replayer::snapshot::Replayer: 0x5616ea6a8800 scan_remote_mirror_snapshots: split-brain detected: failed to find matching non-primary snapshot in remote image: local_snap_id_start=11, local_snap_ns=[mirror state=primary (demoted), complete=1, mirror_peer_uuids=6c9c8ed3-09cb-4d62-ab2d-d87d1ca8ae09, primary_mirror_uuid=, primary_snap_id=head, last_copied_object_number=0, snap_seqs={}]

The image mirroring info was:

sh-4.4$ rbd mirror image status ocs-storagecluster-cephblockpool/csi-vol-ce53bf1f-14c3-11ec-a809-0a580a80024a
csi-vol-ce53bf1f-14c3-11ec-a809-0a580a80024a:
  global_id:   737e6bfd-3c41-499d-b5bc-9b8fdd564dd0
  state:       up+error
  description: split-brain
  last_update: 2021-09-13 20:14:31
  peer_sites:
    name: 51f32320-86bd-40c1-a621-668a17036c48
    state: up+stopped
    description: local image is primary
    last_update: 2021-09-13 20:14:55
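
The same check can be scripted against the text output shown above. The sketch below assumes the plain-text layout printed here, with a single entry under peer_sites; the helper names `parse_image_status` and `is_split_brain` are hypothetical.

```python
# Status text copied from the "rbd mirror image status" output above.
STATUS_TEXT = """\
csi-vol-ce53bf1f-14c3-11ec-a809-0a580a80024a:
  global_id:   737e6bfd-3c41-499d-b5bc-9b8fdd564dd0
  state:       up+error
  description: split-brain
  last_update: 2021-09-13 20:14:31
  peer_sites:
    name: 51f32320-86bd-40c1-a621-668a17036c48
    state: up+stopped
    description: local image is primary
    last_update: 2021-09-13 20:14:55
"""


def parse_image_status(text):
    """Split the output into (local, peer) dictionaries.

    Assumes one peer site; with several peers, later entries would
    overwrite earlier ones in this simple sketch.
    """
    local, peer = {}, {}
    target = local
    for raw in text.splitlines()[1:]:  # skip the image name header line
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so timestamps stay intact.
        key, _, value = line.partition(":")
        if key == "peer_sites":
            target = peer
            continue
        target[key] = value.strip()
    return local, peer


def is_split_brain(local):
    """True when the local image reports the split-brain state seen above."""
    return (
        local.get("state") == "up+error"
        and local.get("description") == "split-brain"
    )


local, peer = parse_image_status(STATUS_TEXT)
print(is_split_brain(local), peer.get("state"))
```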

The fix [1] makes the VRG reconciler in Ramen wait until the VR reports Resyncing as true, so that we do not delete the VR while the image is in split-brain. Hence we are checking whether the fix is included in the builds after the one in use (#125).

[1] "Add more condition checks to ensure healthy secondary": https://github.com/red-hat-storage/ramen/commit/36f797df57775232199de23000727b0d0b1c2a94

Comment 8 Shyamsundar 2021-09-14 14:54:26 UTC
> The fix [1] was for VRG reconciler in Ramen to wait till VR reported
> Resyncing as true, to ensure we do not delete VR when image is in a
> split-brain. Hence checking the builds to ensure the fix is in builds post
> the one in use (#125)
> 
> [1] "Add more condition checks to ensure healthy secondary":
> https://github.com/red-hat-storage/ramen/commit/36f797df57775232199de23000727b0d0b1c2a94

We are unable to do this as we force pushed to the release branch in between.

Although, before the fix the root cause seems to be the same.

Pratik, requesting a retest with the latest builds to confirm that this is the root cause.

Comment 9 Pratik Surve 2021-09-14 15:04:41 UTC
(In reply to Shyamsundar from comment #8)
> We are unable to do this as we force pushed to the release branch in between.
> 
> Although, before the fix the root cause seems to be the same.
> 
> Pratik requesting a retest with the latest builds to ensure this is the root
> cause for the same.

I have started testing with the new build `4.9.0-138.ci`. I will update with the result once testing is done.

Comment 12 Michael Adam 2021-09-14 15:36:25 UTC
Moved to ON_QA based on Shyam's comment.

Please fail QA if it still happens.

Comment 19 errata-xmlrpc 2021-12-13 17:46:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086

