Bug 1885175 - Handle disappeared underlying device for encrypted OSD
Summary: Handle disappeared underlying device for encrypted OSD
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: rook
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: OCS 4.6.0
Assignee: Sébastien Han
QA Contact: Oded
URL:
Whiteboard:
Duplicates: 1885666
Depends On:
Blocks:
 
Reported: 2020-10-05 10:46 UTC by Sébastien Han
Modified: 2020-12-17 06:25 UTC
CC List: 7 users

Fixed In Version: 4.6.0-116.ci
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-17 06:24:44 UTC
Embargoed:




Links
System: Github openshift/rook, ID: pull 130, Status: closed, Summary: Bug 1885175: ceph: check underlaying block status, Last Updated: 2020-11-22 10:49:40 UTC
System: Github rook/rook, ID: pull 6367, Status: closed, Summary: ceph: check underlaying block status, Last Updated: 2020-11-22 10:49:40 UTC
System: Red Hat Product Errata, ID: RHSA-2020:5605, Last Updated: 2020-12-17 06:25:26 UTC

Description Sébastien Han 2020-10-05 10:46:53 UTC
During the encrypted OSD initialization sequence, we check for the presence
of the encrypted container. If it already exists, we don't try to open it
again, since doing so would result in an error.
However, there is another case we need to handle: the underlying device is
gone. For instance, if the pod/PV pair was drained and then moved back, an
orphan dm is left behind. Once the pod comes back, the dm is still present
and appears to match perfectly. Unfortunately, the underlying disk is
different, so the dm must be removed and the disk re-opened.
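
To illustrate the intended handling, here is a minimal Go sketch of such a check, assuming cryptsetup is available on the host PATH. The function name ensureEncryptedBlockOpen, the key-file path, and the device paths are hypothetical examples; this is not the code merged in the pull requests linked above. The point is only to distinguish "mapping exists and its disk is still there" (reuse it) from "mapping exists but its disk is gone" (close the stale mapping, then open the new disk).

// Minimal sketch only; not the merged rook code. Names and paths below are
// hypothetical examples.
package main

import (
    "fmt"
    "os"
    "os/exec"
    "strings"
)

// backingDevice parses the "device:" line from `cryptsetup status <name>` and
// returns the block device currently backing the dm-crypt mapping, or "" when
// the mapping does not exist.
func backingDevice(dmName string) string {
    out, err := exec.Command("cryptsetup", "status", dmName).CombinedOutput()
    if err != nil {
        return ""
    }
    for _, line := range strings.Split(string(out), "\n") {
        fields := strings.Fields(line)
        if len(fields) == 2 && fields[0] == "device:" {
            return fields[1]
        }
    }
    return ""
}

// ensureEncryptedBlockOpen reuses an existing mapping when its backing disk is
// still present, removes an orphaned mapping whose disk disappeared, and then
// (re-)opens the encrypted device.
func ensureEncryptedBlockOpen(dmName, blockPath, keyFile string) error {
    if dev := backingDevice(dmName); dev != "" {
        if _, err := os.Stat(dev); err == nil {
            // Mapping exists and its underlying device is still there:
            // opening it again would only fail, so leave it alone.
            return nil
        }
        // Orphaned mapping: the dm survived the pod/PV move but the disk
        // behind it is gone, so drop the mapping before re-opening.
        if out, err := exec.Command("cryptsetup", "luksClose", dmName).CombinedOutput(); err != nil {
            return fmt.Errorf("failed to remove stale mapping %q: %v (%s)", dmName, err, out)
        }
    }
    // Open (or re-open) the encrypted device on the current disk.
    if out, err := exec.Command("cryptsetup", "luksOpen", "--key-file", keyFile, blockPath, dmName).CombinedOutput(); err != nil {
        return fmt.Errorf("failed to open %q as %q: %v (%s)", blockPath, dmName, err, out)
    }
    return nil
}

func main() {
    // Hypothetical values for illustration only.
    err := ensureEncryptedBlockOpen(
        "ocs-deviceset-0-data-0-example-block-dmcrypt",
        "/dev/sdb",
        "/etc/ceph/luks.key",
    )
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}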

Comment 4 Sébastien Han 2020-10-08 08:27:44 UTC
*** Bug 1885666 has been marked as a duplicate of this bug. ***

Comment 5 Oded 2020-10-26 08:59:33 UTC
Setup:
Provider: VMware
OCP version: 4.6.0-0.nightly-2020-10-22-034051
OCS version: ocs-operator.v4.6.0-141.ci

Test Process:
1. Verify that the 3 OSDs are encrypted:
`-ocs-deviceset-0-data-0-skmxl-block-dmcrypt
         253:1    0   256G  0 crypt 

`-ocs-deviceset-1-data-0-vmnvt-block-dmcrypt
         253:1    0   256G  0 crypt 

`-ocs-deviceset-2-data-0-br2xj-block-dmcrypt
          253:1    0   256G  0 crypt 


2. Scale one of the OSD deployments down to 0 replicas.
$ oc -n openshift-storage scale --replicas=0 deployment/rook-ceph-osd-2

3. Get the OSD pods; osd-2 no longer exists:
$ oc get pods -n openshift-storage | grep -i osd
rook-ceph-osd-0-7ffffbdf78-qkn9v                                  1/1     Running     0          8h
rook-ceph-osd-1-66678f8bcc-pbxlm                                  1/1     Running     0          8h

4. Wait 15 minutes.

5. Scale the OSD deployment back up to 1 replica.
$ oc -n openshift-storage scale --replicas=1 deployment/rook-ceph-osd-2

6. Check the osd-2 pod status:
$ oc get pods rook-ceph-osd-2-69555cd8cc-blqmb -n openshift-storage 
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-osd-2-69555cd8cc-blqmb   1/1     Running   0          90s

7. Check ceph health.
$ oc -n openshift-storage exec rook-ceph-tools-6c7c4c65d9-q5xs2 -- ceph health
HEALTH_OK

Comment 6 Oded 2020-10-26 09:06:24 UTC
Bug not reproduced.
The encrypted OSD comes back up when scaled from 0 to 1.

Comment 9 errata-xmlrpc 2020-12-17 06:24:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5605

