During the encrypted OSD initialization sequence, we check for the presence of the encrypted container. If it already exists, we don't try to open it again, since doing so would result in an error. However, there is another case we need to handle: when the underlying device is gone. For instance, if the pod/PV pair was drained and moved back, an orphan dm mapping is left behind. Once the pod comes back, the dm mapping is still present and its name matches, but the underlying disk is different, so the dm must be removed and the disk re-opened.
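For illustration only, here is a minimal Go sketch of that check-remove-reopen flow. The function name, device path and key-file path are hypothetical assumptions for the example, not Rook's actual code; it simply shells out to cryptsetup and dmsetup to show the intended behavior.

package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// ensureEncryptedBlockOpen (illustrative, not Rook's implementation): if a
// dm-crypt mapping with the expected name already exists but is backed by a
// different device than the one we are about to use, remove the orphan
// mapping and re-open the LUKS container on the current device.
func ensureEncryptedBlockOpen(dmName, device, keyFile string) error {
	out, err := exec.Command("cryptsetup", "status", dmName).CombinedOutput()
	if err == nil {
		// The mapping exists; find which block device backs it.
		for _, line := range strings.Split(string(out), "\n") {
			line = strings.TrimSpace(line)
			if strings.HasPrefix(line, "device:") &&
				strings.TrimSpace(strings.TrimPrefix(line, "device:")) == device {
				// Mapping already points at the expected disk: nothing to do.
				return nil
			}
		}
		// Orphan mapping left behind after the pod/PV moved: remove it first.
		if rmOut, rmErr := exec.Command("dmsetup", "remove", "--force", dmName).CombinedOutput(); rmErr != nil {
			return fmt.Errorf("failed to remove orphan dm %q: %v (%s)", dmName, rmErr, rmOut)
		}
	}
	// Open (or re-open) the LUKS container on the current underlying device.
	if opOut, opErr := exec.Command("cryptsetup", "luksOpen", "--key-file", keyFile, device, dmName).CombinedOutput(); opErr != nil {
		return fmt.Errorf("failed to open %s as %s: %v (%s)", device, dmName, opErr, opOut)
	}
	return nil
}

func main() {
	// Example values following the naming scheme seen in this report; purely illustrative.
	if err := ensureEncryptedBlockOpen("ocs-deviceset-2-data-0-br2xj-block-dmcrypt", "/dev/sdb", "/etc/ceph/luks.key"); err != nil {
		fmt.Println(err)
	}
}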
*** Bug 1885666 has been marked as a duplicate of this bug. ***
Setup:
Provider: VMware
OCP version: 4.6.0-0.nightly-2020-10-22-034051
OCS version: ocs-operator.v4.6.0-141.ci

Test process:
1. Verify the 3 OSDs are encrypted:
   `-ocs-deviceset-0-data-0-skmxl-block-dmcrypt 253:1 0 256G 0 crypt
   `-ocs-deviceset-1-data-0-vmnvt-block-dmcrypt 253:1 0 256G 0 crypt
   `-ocs-deviceset-2-data-0-br2xj-block-dmcrypt 253:1 0 256G 0 crypt
2. Scale one of the OSDs to 0.
   $ oc -n openshift-storage scale --replicas=0 deployment/rook-ceph-osd-2
3. Get the OSD pods; osd-2 no longer exists.
   $ oc get pods -n openshift-storage | grep -i osd
   rook-ceph-osd-0-7ffffbdf78-qkn9v   1/1   Running   0   8h
   rook-ceph-osd-1-66678f8bcc-pbxlm   1/1   Running   0   8h
4. Wait 15 minutes.
5. Scale the OSD back to 1.
   $ oc -n openshift-storage scale --replicas=1 deployment/rook-ceph-osd-2
6. Check the osd-2 pod status.
   $ oc get pods rook-ceph-osd-2-69555cd8cc-blqmb -n openshift-storage
   NAME                               READY   STATUS    RESTARTS   AGE
   rook-ceph-osd-2-69555cd8cc-blqmb   1/1     Running   0          90s
7. Check Ceph health.
   $ oc -n openshift-storage exec rook-ceph-tools-6c7c4c65d9-q5xs2 -- ceph health
   HEALTH_OK
Bug not reproduced. The encrypted OSD comes up when scaled from 0 to 1.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5605