Bug 1905963 - [RFE] There is no option to replace a failed storage device via the UI on an encrypted cluster
Summary: [RFE] There is no option to replace a failed storage device via the UI on an encrypted cluster
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Console Storage Plugin
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Sanjal Katiyar
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks: 1906002
 
Reported: 2020-12-09 12:11 UTC by Oded
Modified: 2021-06-04 05:34 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1906002 (view as bug list)
Environment:
Last Closed: 2021-06-03 18:14:41 UTC
Target Upstream Version:
Embargoed:
skatiyar: needinfo-



Description Oded 2020-12-09 12:11:54 UTC
Description of problem:
There is no option to replace a failed storage device via the UI on an encrypted cluster.

Version-Release number of selected component (if applicable):
Provider: VMware (LSO)
OCP Version: 4.6.0-0.nightly-2020-12-08-021151
OCS Version: ocs-operator.v4.6.0-189.ci

How reproducible:


Steps to Reproduce:
1. Check cluster status (OCS + OCP):
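A minimal sketch of the checks (standard oc commands; openshift-storage is the default OCS install namespace):
$ oc get clusterversion                    # OCP version and Available=True
$ oc get csv -n openshift-storage          # OCS operator CSV should be Succeeded
$ oc get cephcluster -n openshift-storage  # HEALTH_OK expected
$ oc get pods -n openshift-storage         # all pods Running/Completed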

2. Check Ceph status:
sh-4.4# ceph health
HEALTH_OK
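
The ceph command above runs inside the rook-ceph toolbox pod; a sketch of one common way to reach it (assuming the toolbox is deployed with the label app=rook-ceph-tools):
$ oc rsh -n openshift-storage $(oc get pod -n openshift-storage -l app=rook-ceph-tools -o name)
sh-4.4# ceph health   # or: ceph status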

3. Verify the OSDs are encrypted (a node-shell sketch follows the output below):
[root@compute-1 /]# lsblk
NAME                                                    MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0                                                     7:0    0   100G  0 loop  
sda                                                       8:0    0   120G  0 disk  
|-sda1                                                    8:1    0   384M  0 part  /boot
|-sda2                                                    8:2    0   127M  0 part  /boot/efi
|-sda3                                                    8:3    0     1M  0 part  
`-sda4                                                    8:4    0 119.5G  0 part  
  `-coreos-luks-root-nocrypt                            253:0    0 119.5G  0 dm    /sysroot
sdb                                                       8:16   0   100G  0 disk  
`-ocs-deviceset-localblock-1-data-0-5mdwg-block-dmcrypt 253:1    0   100G  0 crypt 
[root@compute-1 /]# dmsetup ls
ocs-deviceset-localblock-1-data-0-5mdwg-block-dmcrypt	(253:1)
coreos-luks-root-nocrypt	(253:0)
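
For reference, the node-level lsblk / dmsetup output above can be collected from a debug shell (a sketch; oc debug is the standard way to get a node shell):
$ oc debug node/compute-1
sh-4.4# chroot /host
sh-4.4# lsblk        # the OSD device should show a child of TYPE "crypt"
sh-4.4# dmsetup ls   # lists the dm-crypt mapping for the OCS deviceset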


4. Scale down the OSD deployment for the OSD to be replaced (via CLI).
[odedviner@localhost ~]$ oc get -n openshift-storage pods -l app=rook-ceph-osd -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP            NODE        NOMINATED NODE   READINESS GATES
rook-ceph-osd-0-8649856684-bvbn8   1/1     Running   0          5h52m   10.129.2.63   compute-1   <none>           <none>
rook-ceph-osd-1-84c75fd56c-5vhx5   1/1     Running   0          5h52m   10.128.2.36   compute-0   <none>           <none>
rook-ceph-osd-2-559c675859-8cbdl   1/1     Running   0          5h52m   10.131.0.30   compute-2   <none>           <none>

Choose OSD-0:
$ oc scale -n openshift-storage deployment rook-ceph-osd-0 --replicas=0
deployment.apps/rook-ceph-osd-0 scaled

$ oc get -n openshift-storage pods -l ceph-osd-id=0
No resources found in openshift-storage namespace.
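
Optionally, confirm from the toolbox that the OSD is reported down (ceph osd tree is a standard Ceph command):
sh-4.4# ceph osd tree   # osd.0 is expected to show as "down" after the scale-down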


5. Open the Persistent Storage dashboard and verify the disk alert is shown.

6. Add a new HDD to node compute-1 via vCenter.
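After attaching the disk, it can be verified from the node and via local-storage discovery (a sketch; the device name sdc and the openshift-local-storage namespace are assumptions):
$ oc debug node/compute-1 -- chroot /host lsblk   # the new disk (e.g. sdc) should appear
$ oc get pods -n openshift-local-storage          # diskmaker/provisioner pods discover the device
$ oc get pv | grep localblock                     # a new Available local PV is expected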

7. In the UI, click Troubleshoot in the "Disk not responding" alert.

8. Click the Disks tab. From the Action (:) menu of the failed disk, click Start Disk Replacement:
The Start Disk Replacement option is not present, so the device cannot be replaced via the UI. [Failed]
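
For comparison, the documented CLI path for the same operation in OCS 4.6 looks roughly like this (a sketch; the ocs-osd-removal template name and FAILED_OSD_ID parameter are taken from the OCS 4.6 device-replacement docs and should be treated as assumptions here):
$ oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_ID=0 | oc create -n openshift-storage -f -
$ oc get jobs -n openshift-storage   # wait for the removal job to complete before re-adding the device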


Detailed test procedure:
https://docs.google.com/document/d/1rIGJ3lFh7yXpVQ6rR4rNAqby11MxQRvuQHVUnunwa9s/edit

Actual results:
The Start Disk Replacement action is not offered in the UI for the failed OSD on the encrypted cluster.

Expected results:
The failed storage device can be replaced via the UI (Disks tab -> Start Disk Replacement) on an encrypted cluster.

Additional info:

Comment 1 Oded 2020-12-09 16:11:18 UTC
must-gather OCS+OCP
http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/bz-1905963/

Comment 3 Servesha 2020-12-11 09:09:53 UTC
@afreen Sure. I will look at the logs.

Comment 4 Ankush Behl 2021-01-19 06:47:42 UTC
@servesha any updates?

Comment 7 Nishanth Thomas 2021-02-13 07:22:51 UTC
RFE; moving this out of 4.7.

