Submariner was reinstalled on an existing setup. When Submariner was removed, it deleted all the exported services. When the globalnet operator started again, it recreated the exported services, but this time with different global IPs. For example, cluster c1 had the exported IP for a service as 242.1.255.*, but after globalnet was reinstalled the exported services were recreated with 242.0.255.*. Rook is still using 242.1.255.* saved in the ConfigMap and has no knowledge of 242.0.255.*.

In Rook we don't really allow mon IPs to change; that's not a supported case. For the global-IP scenario we allow it by failing over one mon at a time, but this is a different situation: all of the mon (global) IPs changed at once when Submariner was reinstalled. @tnielsen, do you think this is a scenario that Rook should support?
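To see the mismatch, one way is to compare the global IPs currently assigned to the exported mon services with the endpoints Rook has cached. A minimal sketch, assuming the default `openshift-storage` namespace and the standard `rook-ceph-mon-endpoints` ConfigMap layout:

```
# Current global IPs assigned to the exported services (Submariner side).
oc -n openshift-storage get service | grep submariner

# Mon endpoints Rook still has cached; expect entries like "e=242.1.255.x:6789,...".
oc -n openshift-storage get configmap rook-ceph-mon-endpoints -o jsonpath='{.data.data}{"\n"}'
```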
Used the following steps to add the mons back to quorum. Here we edit only one mon and let the Rook operator fail over the other mons.

Obtain the following information from the cluster:
- fsid
- mon-e exported IP: this can be obtained from `oc get service | grep submariner`. Let's say the exported IP for mon-e is 242.0.255.251 in this case.

- Scale down the OCS operator and Rook operator deployments:
  oc scale deployment ocs-operator --replicas=0 -n openshift-storage
  oc scale deployment rook-ceph-operator --replicas=0 -n openshift-storage

- Update the mon-e deployment to use the correct exported IP in the mon container args (`spec.template.spec.containers[0].args`):
  --public-addr=242.0.255.251

- Save a copy of the mon-e deployment:
  oc get deployment rook-ceph-mon-e -o yaml > rook-ceph-mon-e-deployment-c1.yaml

- Edit the rook-ceph-mon-endpoints ConfigMap to use the correct exported IP for mon-e.

- Patch the rook-ceph-mon-e deployment so the mon process stops running without the mon pod being deleted:
  kubectl patch deployment rook-ceph-mon-e --type='json' -p '[{"op":"remove", "path":"/spec/template/spec/containers/0/livenessProbe"}]'
  kubectl patch deployment rook-ceph-mon-e -p '{"spec": {"template": {"spec": {"containers": [{"name": "mon", "command": ["sleep", "infinity"], "args": []}]}}}}'

- Connect to the mon-e pod:
  oc exec -it <rook-ceph-mon-e> sh

- Inside the mon-e pod:
  - Create a temporary monmap:
    monmaptool --create --add e 242.0.255.251 --set-min-mon-release <release> --enable-all-features --clobber /tmp/monmap --fsid <ceph fsid>
  - Remove the mon-e entry that was just added (it is re-added with the v2 address next):
    monmaptool --rm e /tmp/monmap
  - Add the v2 address (add the v1 address as well if the cluster supports both protocols):
    monmaptool --addv e [v2:242.0.255.251:3300] /tmp/monmap
  - Inject this monmap into mon-e (see the verification sketch after this list):
    ceph-mon -i e --inject-monmap /tmp/monmap
  - Exit the mon-e pod.

- Scale the OCS and Rook operator deployments back up:
  oc scale deployment ocs-operator --replicas=1 -n openshift-storage
  oc scale deployment rook-ceph-operator --replicas=1 -n openshift-storage

- Wait for the Rook operator to fail over the other mons.
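Before exiting the mon-e pod, and again after scaling the operators back up, it helps to confirm the monmap actually carries the new address. A minimal verification sketch, assuming the same mon name (e) and paths used above; the cluster-level checks are run from the toolbox pod:

```
# Inside the mon-e pod: print the monmap that was just built and injected.
monmaptool --print /tmp/monmap

# From the toolbox pod, after the operators are scaled back up and quorum recovers:
ceph -s
ceph mon dump   # mon addresses should now show the 242.0.255.* endpoints
```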
c1 cluster is `healthy` now after using the above workaround:

```
sh-5.1$ ceph status
  cluster:
    id:     6bee5946-d3e4-4999-8110-24ed4325fbe2
    health: HEALTH_OK

  services:
    mon:        3 daemons, quorum e,g,h (age 21m)
    mgr:        a(active, since 24m)
    mds:        1/1 daemons up, 1 hot standby
    osd:        3 osds: 3 up (since 21m), 3 in (since 10d)
    rbd-mirror: 1 daemon active (1 hosts)
    rgw:        1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   12 pools, 169 pgs
    objects: 3.05k objects, 3.3 GiB
    usage:   5.9 GiB used, 1.5 TiB / 1.5 TiB avail
    pgs:     169 active+clean

  io:
    client:   31 KiB/s rd, 1.5 MiB/s wr, 36 op/s rd, 322 op/s wr
```

c2 has daemons crashing, but the mons are up now:

```
sh-5.1$ ceph status
  cluster:
    id:     c2c61349-f7b5-47c5-8fd6-f687ea46b450
    health: HEALTH_WARN
            1599 daemons have recently crashed

  services:
    mon:        3 daemons, quorum e,h,i (age 32m)
    mgr:        a(active, since 34m)
    mds:        1/1 daemons up, 1 hot standby
    osd:        3 osds: 3 up (since 33m), 3 in (since 10d)
    rbd-mirror: 1 daemon active (1 hosts)
    rgw:        1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   12 pools, 169 pgs
    objects: 2.89k objects, 4.2 GiB
    usage:   12 GiB used, 1.5 TiB / 1.5 TiB avail
    pgs:     169 active+clean

  io:
    client:   17 KiB/s rd, 64 KiB/s wr, 21 op/s rd, 6 op/s wr
```
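For the `1599 daemons have recently crashed` warning on c2, the crash reports can be reviewed and, once confirmed to be fallout from the mon IP change, archived so the warning clears. A minimal sketch run from the toolbox pod, where `<crash-id>` is a placeholder for an entry from the list:

```
# List the recent crash reports and inspect one of them.
ceph crash ls
ceph crash info <crash-id>

# Archive all crash reports so HEALTH_WARN clears (only after confirming they are expected).
ceph crash archive-all
```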
(In reply to Santosh Pillai from comment #5)

One last step is to restart the rbd-mirror pods on both clusters. Mirroring health is OK on both clusters now:

```
oc get cephblockpool ocs-storagecluster-cephblockpool -n openshift-storage -o jsonpath='{.status.mirroringStatus.summary}{"\n"}'
{"daemon_health":"OK","health":"OK","image_health":"OK","states":{"replaying":20}}
```
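For reference, a minimal sketch of that rbd-mirror restart, run on both clusters. The `app=rook-ceph-rbd-mirror` label selector is an assumption based on Rook's usual labeling, so verify it against the actual pods first:

```
# Confirm the rbd-mirror pods and their labels, then delete them so they restart.
oc -n openshift-storage get pods -l app=rook-ceph-rbd-mirror
oc -n openshift-storage delete pods -l app=rook-ceph-rbd-mirror
# Re-run the mirroringStatus check above once the pods are back up.
```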
Santosh, great to see the workaround for getting the cluster back up in this scenario of reinstalling Submariner.

This scenario is very disruptive. Ceph requires immutable IP addresses for the mons, so we cannot support this scenario automatically in Rook. The only way we can hope to support it is that if/when it happens in production, the customer will need to contact the support team to step through these complicated recovery steps. Even better if we can get this recovery working with the krew plugin, which would just need an addition to the existing --restore-quorum command to support the changed IPs. Then there is the separate question of the best way for the support team to use the krew plugin (or an alternative) that is fully tested by QE.
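If the krew plugin route is pursued, the existing restore-quorum command would be the starting point. A rough sketch of how it is invoked today, with the exact syntax treated as an assumption to be checked against the installed plugin version; an option for rewriting the mon IPs does not exist yet, which is the addition being discussed:

```
# Assumed current syntax of the kubectl-rook-ceph plugin: restore quorum from one healthy mon.
kubectl rook-ceph -n openshift-storage mons restore-quorum e

# Handling the changed global IPs would require an extension to this command;
# until then, the manual monmap injection from comment #5 is the fallback.
```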
Based on the previous comments, moving this out of 4.13, since we can't support anything beyond the customer working with the support team on the disaster recovery steps.
Hi Vikhyat,

Did you get a chance to check the last comment by Travis regarding doc support?
(In reply to Santosh Pillai from comment #29)
> Hi Vikhyat,
>
> Did you get a chance to check the last comment by Travis regarding doc
> support?

Hi Santosh,

Yes, updating the IP should be easy; this is documented in https://access.redhat.com/solutions/3093781 for standalone clusters. I think the basic steps should be the same for ODF. Adding @assingh, who can help from the ODF side.
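For the ODF variant of that KCS procedure, the before/after state of the mon addresses can be checked the same way as on a standalone cluster. A minimal sketch from the toolbox pod, assuming the goal is just to confirm which IPs the mons currently advertise:

```
# Current mon addresses as seen by the cluster.
ceph mon dump

# The same information from the quorum status (includes the v1/v2 addresses per mon).
ceph quorum_status -f json-pretty
```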
(In reply to Vikhyat Umrao from comment #30)
> Yes, updating the IP should be easy; this is documented in
> https://access.redhat.com/solutions/3093781 for standalone clusters. I think
> the basic steps should be the same for ODF. Adding @assingh, who can help
> from the ODF side.

Ahh, I see in comment #5 you were already able to achieve it, and the question is whether we need to document it or not. I think yes, we should document it @
@
@bkunal and Ashish - can you please check from the KCS point of view?
Thanks for the doc, Bipin. I'll take a look at it tomorrow.