Bug 1906002 - [Cloned in OCS as tracker] [RFE]There is no option to replace failed storage device via UI on encrypted cluster in LSO
Summary: [Cloned in OCS as tracker] [RFE]There is no option to replace failed storage device via UI on encrypted cluster in LSO
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: management-console
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Afreen
QA Contact: Elad
URL:
Whiteboard:
Depends On: 1905963
Blocks: 1882359
 
Reported: 2020-12-09 14:12 UTC by Neha Berry
Modified: 2021-06-03 18:14 UTC
CC List: 12 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.Device replacement action cannot be performed via UI for an encrypted OpenShift Container Storage cluster
On an encrypted OpenShift Container Storage cluster, the discovery result CR reports the device backing a Ceph OSD (Object Storage Daemon) differently from the device reported in the Ceph alerts. When clicking the alert, the user is presented with a `Disk not found` message. Due to this mismatch, the console UI cannot enable the disk replacement option for an OCS user. To work around this issue, use the CLI procedure for failed device replacement.
Clone Of: 1905963
Environment:
Last Closed: 2021-06-03 18:14:18 UTC
Embargoed:


Attachments:

Description Neha Berry 2020-12-09 14:12:26 UTC
Cloning this BZ to OCS to track the inclusion of the KNOWN ISSUE in the OCS 4.6 Release Notes.



+++ This bug was initially created as a clone of Bug #1905963 +++

Description of problem:
There is no option to replace a failed storage device via the UI on an encrypted cluster.

Version-Release number of selected component (if applicable):
Provider: VMware LSO
OCP Version: 4.6.0-0.nightly-2020-12-08-021151
OCS Version: ocs-operator.v4.6.0-189.ci

How reproducible:


Steps to Reproduce:
1. Check cluster status (OCS and OCP).
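(For reference, a minimal set of CLI checks for OCP and OCS health; these are the usual commands and default namespaces, not part of the original report:)

$ oc get clusterversion                       # OCP version and cluster status
$ oc get csv -n openshift-storage             # OCS operator CSV phase should be Succeeded
$ oc get cephcluster -n openshift-storage     # Ceph cluster health as reported by rook
$ oc get pods -n openshift-storage -o wide    # all OCS pods Running/Ready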

2. Check Ceph status:
sh-4.4# ceph health
HEALTH_OK

3. Verify that the OSDs are encrypted:
[root@compute-1 /]# lsblk
NAME                                                    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0                                                     7:0    0   100G  0 loop  
sda                                                       8:0    0   120G  0 disk  
|-sda1                                                    8:1    0   384M  0 part  /boot
|-sda2                                                    8:2    0   127M  0 part  /boot/efi
|-sda3                                                    8:3    0     1M  0 part  
`-sda4                                                    8:4    0 119.5G  0 part  
  `-coreos-luks-root-nocrypt                            253:0    0 119.5G  0 dm    /sysroot
sdb                                                       8:16   0   100G  0 disk  
`-ocs-deviceset-localblock-1-data-0-5mdwg-block-dmcrypt 253:1    0   100G  0 crypt 
[root@compute-1 /]# dmsetup ls
ocs-deviceset-localblock-1-data-0-5mdwg-block-dmcrypt	(253:1)
coreos-luks-root-nocrypt	(253:0)


4. Scale down the OSD deployment for the OSD to be replaced (via the CLI).
[odedviner@localhost ~]$ oc get -n openshift-storage pods -l app=rook-ceph-osd -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
rook-ceph-osd-0-8649856684-bvbn8   1/1     Running   0          5h52m   10.129.2.63   compute-1   <none>           <none>
rook-ceph-osd-1-84c75fd56c-5vhx5   1/1     Running   0          5h52m   10.128.2.36   compute-0   <none>           <none>
rook-ceph-osd-2-559c675859-8cbdl   1/1     Running   0          5h52m   10.131.0.30   compute-2   <none>           <none>

Choose OSD-0:
$ oc scale -n openshift-storage deployment rook-ceph-osd-0 --replicas=0
deployment.apps/rook-ceph-osd-0 scaled

$ oc get -n openshift-storage pods -l ceph-osd-id=0
No resources found in openshift-storage namespace.
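
(Optionally, one way to confirm from the Ceph side that osd.0 is now down, assuming the rook-ceph-tools toolbox pod is deployed; this is an illustrative sketch, not part of the original run:)

$ TOOLS_POD=$(oc get pod -n openshift-storage -l app=rook-ceph-tools -o name)
$ oc rsh -n openshift-storage $TOOLS_POD
sh-4.4# ceph osd tree      # osd.0 should now show as down
sh-4.4# ceph status        # HEALTH_WARN with 1 osd down is expected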


5. Check the Persistent Storage dashboard; the Disk not responding alert is shown.
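
(Per the Doc Text above, the console correlates the alert's device with the LocalVolumeDiscoveryResult CR created by LSO auto-discovery. One way to inspect that CR and compare its device paths/IDs with the device named in the Ceph alert, assuming auto-discovery is enabled and LSO runs in the default openshift-local-storage namespace:)

$ oc get localvolumediscoveryresults -n openshift-local-storage
$ oc get localvolumediscoveryresults -n openshift-local-storage -o yaml   # compare discovered device paths/IDs with the device in the alert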

6. Add an HDD disk to node compute-1 via vCenter.

7. Click Troubleshoot in the Disk not responding alert (in the UI).

8. Click the Disks tab. From the Action (:) menu of the failed disk, click Start Disk Replacement:
There is no option to replace the device via the UI. [Failed]
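
(Workaround sketch: the CLI procedure for failed device replacement, roughly as documented for OCS 4.6; the ocs-osd-removal template and the resulting job name are the documented defaults and may vary by version. osd 0 is used to match the step above:)

$ osd_id_to_remove=0
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=${osd_id_to_remove} | oc create -f -
$ oc get pod -l job-name=ocs-osd-removal-${osd_id_to_remove} -n openshift-storage    # wait for Completed
$ oc logs -l job-name=ocs-osd-removal-${osd_id_to_remove} -n openshift-storage --tail=-1
Afterwards, delete the released PV backing the failed OSD, replace or add the disk on the node, and let LSO rediscover it.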


Detailed test procedure:
https://docs.google.com/document/d/1rIGJ3lFh7yXpVQ6rR4rNAqby11MxQRvuQHVUnunwa9s/edit

Actual results:


Expected results:


Additional info:

