Bug 1762698 - After deleting ocsinit-cephfilesystem and rook-ceph-mds pods, in the dashboard, it shows: `rook-ceph is not available`
Summary: After deleting ocsinit-cephfilesystem and rook-ceph-mds pods, in the dashboar...
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Console Storage Plugin
Version: 4.3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.3.0
Assignee: umanga
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-10-17 09:31 UTC by Servesha
Modified: 2019-11-06 12:10 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-06 12:10:42 UTC
Target Upstream Version:
Embargoed:



Description Servesha 2019-10-17 09:31:48 UTC
Description of problem: After deleting ocsinit-cephfilesystem and rook-ceph-mds pods, in the dashboard, it shows: `rook-ceph is not available`.


Version-Release number of selected component (if applicable):


How reproducible: Always


Steps to Reproduce:
1. Delete ocsinit-cephfilesystem
2. Remove rook-ceph-mds pods
3. Check the UI
4. Also, check `ceph -s` inside ceph toolbox pod
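
For reference, a minimal shell sketch of these steps (assuming the default OCS namespace `openshift-storage` and the standard rook-ceph app labels; object names are taken from this report and may differ on other clusters):

+++++
# 1. Delete the CephFilesystem custom resource
oc delete cephfilesystem ocsinit-cephfilesystem -n openshift-storage

# 2. Remove the rook-ceph-mds pods by label
oc delete pod -l app=rook-ceph-mds -n openshift-storage

# 3. Check the storage dashboard in the OpenShift Console UI

# 4. Check ceph status from inside the ceph toolbox pod
TOOLBOX=$(oc get pod -n openshift-storage -l app=rook-ceph-tools -o name)
oc rsh -n openshift-storage "$TOOLBOX" ceph -s
+++++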

Actual results: After the deletion of ocsinit-cephfilesystem and the rook-ceph-mds pods, the UI shows: `rook-ceph is not available`. However, the ceph toolbox shows the correct ceph health - HEALTH_OK.


Expected results: After the deletion of ocsinit-cephfilesystem and the rook-ceph-mds pods, the UI should show the correct rook-ceph status, because all pods other than rook-ceph-mds are present, up, and running fine.


Additional info:

Two rook-ceph-mds pods were in a Pending state. When the MDS pod was described, it showed the warning message `0/3 nodes are available: 3 Insufficient cpu`. At that time the ceph health was:

+++++
[root@master-1 /]# ceph health detail
HEALTH_ERR 1 filesystem is offline; 1 filesystem is online with fewer MDS than max_mds
MDS_ALL_DOWN 1 filesystem is offline
    fs ocsinit-cephfilesystem is offline because no MDS is active for it.
MDS_UP_LESS_THAN_MAX 1 filesystem is online with fewer MDS than max_mds
    fs ocsinit-cephfilesystem has 0 MDS online, but wants 1
+++++
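
The `Insufficient cpu` scheduling warning mentioned above would normally show up in the Pending pod's events; a hedged way to surface it (namespace and label as assumed in the sketch above):

+++++
# Show the scheduling events for the Pending MDS pods
oc describe pod -l app=rook-ceph-mds -n openshift-storage | grep -A 5 Events
+++++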

It could have been solved by changing the `limits` parameter in the storage cluster YAML file, but I wanted to clean up the rook-ceph-mds pods, so, based on the output of `ceph health detail`, ocsinit-cephfilesystem was deleted.
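
For context, one hedged way to change that `limits` parameter without editing the YAML file by hand is to patch the StorageCluster resource. The resource name `ocsinit` is inferred from the `ocsinit-cephfilesystem` name, the `spec.resources` map keys follow the OCS operator conventions (which may differ by version), and the CPU/memory values are purely illustrative:

+++++
# Hypothetical example: raise the MDS resource limits on the StorageCluster CR
oc -n openshift-storage patch storagecluster ocsinit --type merge \
  -p '{"spec":{"resources":{"mds":{"limits":{"cpu":"3","memory":"8Gi"}}}}}'
+++++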

In the toolbox pod, ceph health shows HEALTH_OK, but on the dashboard it says rook-ceph is not available.

Comment 3 Nishanth Thomas 2019-10-25 10:43:54 UTC
@Servesha, can you provide the requested info?

Comment 4 Servesha 2019-11-06 07:06:08 UTC
@Nishanth, here is the needed info:

> What was the Health Status before deleting? (Maybe it was already broken?)

- Before deleting `ocsinit-cephfilesystem`, all pods other than the ceph-mds pods (which were Pending) were up and running.
So, the ceph health status was `HEALTH_WARN`.

> Please check if your rook-ceph-mgr pod is running. Also, provide rook-operator logs.

- The ceph-mds pods were not running at that time; they were in a Pending state. Unfortunately, I do not have the rook-operator logs at this point since that setup has been deleted.

> Did deleting the said resources cause deletion of any other resources?
- The notable resources deleted after removing `ocsinit-cephfilesystem` were the two ceph-mds pods. When the dashboard was then checked, it showed `rook-ceph unavailable`. Apart from that, everything else was fine.

Comment 5 umanga 2019-11-06 07:57:06 UTC
I am unable to reproduce this.
Deleting or recreating the cephfilesystem did not affect monitoring at all; ceph-mgr is actively talking to Prometheus.
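
If anyone wants to double-check that path, the mgr/Prometheus side can be inspected from the toolbox pod with standard Ceph commands (a sketch, not tied to this particular cluster):

+++++
# Confirm the prometheus mgr module is enabled and that mgr exposes its endpoint
ceph mgr module ls | grep -i prometheus
ceph mgr services
+++++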

Comment 8 umanga 2019-11-06 12:10:42 UTC
Works for me, and there are no further instructions to replicate the issue. Closing this.

