Bug 1958875 - [OCS tracker for OCP bug 1958873]: Device Replacement UI, The status of the disk is "replacement ready" before I clicked on "start replacement"
Summary: [OCS tracker for OCP bug 1958873]: Device Replacement UI, The status of the disk is "replacement ready" before I clicked on "start replacement"
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: management-console
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Afreen
QA Contact: Elad
URL:
Whiteboard:
Depends On: 1957756 1958873
Blocks: 1967628
 
Reported: 2021-05-10 11:05 UTC by Neha Berry
Modified: 2021-09-07 13:53 UTC
CC: 12 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
.The status of the disk is `replacement ready` before `start replacement` is clicked

The user interface cannot differentiate between a new disk failure on a different or the same node and the previously failed disk if both disks have the same name. Because of this name collision, disk replacement is not allowed, as the user interface considers the newly failed disk to be already replaced. To work around this issue, follow these steps:

1. In the OpenShift Container Platform web console, click *Administrator*.
2. Click *Home* --> *Search*.
3. In the *Resources* dropdown, search for `TemplateInstance`.
4. Select `TemplateInstance` and make sure to choose the `openshift-storage` namespace.
5. Delete all template instances.
Clone Of: 1958873
Clones: 1967628
Environment:
Last Closed: 2021-09-07 13:53:41 UTC
Embargoed:


Attachments

Description Neha Berry 2021-05-10 11:05:00 UTC
Created a clone to track in OCS 4.7 

+++ This bug was initially created as a clone of Bug #1958873 +++

+++ This bug was initially created as a clone of Bug #1957756 +++

Description of problem:
After a successful disk replacement from the UI for a particular node, I re-initiated the procedure (section 4.2) for a different OSD on another node, and observed the following:
The status of the disk is "replacement ready" before I clicked on "start replacement".

Version-Release number of selected component (if applicable):
OCP Version: 4.7.0-0.nightly-2021-04-29-101239
OCS Version: 4.7.0-372.ci
Provider: VMware
Setup: LSO cluster


How reproducible:


Steps to Reproduce:
1. Replace OSD-1 on the compute-1 node via the UI [pass]
2. Replace OSD-3 on the compute-3 node via the UI
a. Scale down osd-3 (a verification sketch follows these steps):
$ oc scale -n openshift-storage deployment rook-ceph-osd-3 --replicas=0
deployment.apps/rook-ceph-osd-3 scaled
b.Click Troubleshoot in the Disk <disk1> not responding or the Disk <disk1> not accessible alert.
c.From the Action (⋮) menu of the failed disk, click Start Disk Replacement.
The status of the disk is "replacement ready" before I clicked on "start replacement" [Failed!!]
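
For reference, a minimal sketch to verify that the scale-down in step 2.a took effect before checking the alert (the deployment name comes from the steps above; the expected READY value assumes the usual single-replica OSD deployment):

$ oc get deployment -n openshift-storage rook-ceph-osd-3
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
rook-ceph-osd-3   0/1     0            0           ...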

*Attached screenshot

for more details:
https://docs.google.com/document/d/1KuUjbP25i9vBR7Wp_dKwLAULi5LDFLeIXi4kZY4wLjc/edit

Actual results:
The status of the disk is "replacement ready" before I clicked on "start replacement"

Expected results:
The status of the disk is "Not Responding" before I clicked on "start replacement"

Additional info:

--- Additional comment from  on 2021-05-06 12:10:08 UTC ---

missing severity @Oded

--- Additional comment from OpenShift Automated Release Tooling on 2021-05-06 23:25:20 UTC ---

Elliott changed bug status from MODIFIED to ON_QA.

--- Additional comment from  on 2021-05-07 12:56:28 UTC ---

It does not allow initiating OSD replacement due to this message.
Hence, disk replacement cannot be performed.

--- Additional comment from  on 2021-05-07 13:03:29 UTC ---

Hi Neha,

This is the latest z-stream information for backporting this BZ: the 4.6.29 (May 20) and 4.7.11 (May 19) windows have opened.

--- Additional comment from  on 2021-05-10 10:41:20 UTC ---

Workaround for now:

Run `oc delete templateinstance -n openshift-storage --all` when this issue is encountered.
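
A slightly fuller sketch of the same workaround, listing the stale TemplateInstance resources before and after removal (only the delete command above comes from this bug; the surrounding listing steps and the final output line are assumptions about standard `oc` behavior):

$ oc get templateinstance -n openshift-storage
$ oc delete templateinstance -n openshift-storage --all
$ oc get templateinstance -n openshift-storage
No resources found in openshift-storage namespace.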

Comment 7 Oded 2021-06-16 20:30:57 UTC
Bug Fixed

Setup:
OCP Version: 4.7.0-0.nightly-2021-06-12-151209
OCS Version: ocs-operator.v4.7.1-410.ci
LSO Version: 4.7.0-202105210300.p0
Provider: VMware
Type: LSO cluster


Test Procedure:
1. Replace OSD-0 on the compute-0 node via the UI [pass]
2. Replace OSD-0 on the compute-0 node via the UI [pass]
3. Replace OSD-1 on the compute-1 node via the UI [pass]
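
As a sanity check between the replacements above, something like the following can confirm that the replaced OSD is back in a running state before starting the next one (a minimal sketch; pod naming follows the rook-ceph-osd-<id> convention used in this bug):

$ oc get pods -n openshift-storage | grep rook-ceph-osd
(each rook-ceph-osd-<id> pod should be Running with READY 1/1)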


for more details:
https://docs.google.com/document/d/1KuUjbP25i9vBR7Wp_dKwLAULi5LDFLeIXi4kZY4wLjc/edit

