Description of problem:
After deleting ocsinit-cephfilesystem and the rook-ceph-mds pods, the dashboard shows: `rook-ceph is not available`.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Delete ocsinit-cephfilesystem
2. Remove the rook-ceph-mds pods
3. Check the UI
4. Also check `ceph -s` inside the ceph toolbox pod
(A CLI sketch of these steps follows below.)

Actual results:
After the deletion of ocsinit-cephfilesystem and the rook-ceph-mds pods, the UI shows: `rook-ceph is not available`. However, the ceph toolbox reports the correct ceph health: HEALTH_OK.

Expected results:
After the deletion of ocsinit-cephfilesystem and the rook-ceph-mds pods, the UI should show the correct rook-ceph status, because all pods other than rook-ceph-mds are present, up, and running fine.

Additional info:
Two rook-ceph-mds pods were in a Pending state. Describing an MDS pod showed the warning: 0/3 nodes are available: 3 Insufficient cpu. At that time the ceph health was:
+++++
[root@master-1 /]# ceph health detail
HEALTH_ERR 1 filesystem is offline; 1 filesystem is online with fewer MDS than max_mds
MDS_ALL_DOWN 1 filesystem is offline
    fs ocsinit-cephfilesystem is offline because no MDS is active for it.
MDS_UP_LESS_THAN_MAX 1 filesystem is online with fewer MDS than max_mds
    fs ocsinit-cephfilesystem has 0 MDS online, but wants 1
+++++
The scheduling failure could have been resolved by changing the `limits` parameter in the StorageCluster YAML (a sketch of that change also follows below), but I wanted to clean up the rook-ceph-mds pods, so, per the output of `ceph health detail`, ocsinit-cephfilesystem was deleted. In the toolbox pod, ceph health shows HEALTH_OK, but the dashboard says rook-ceph is not available.
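For reference, a rough CLI sketch of the reproduction steps. The openshift-storage namespace, the app=rook-ceph-mds label, and the rook-ceph-tools toolbox deployment name are assumptions, not confirmed by this report:
+++++
# Hypothetical reproduction; namespace and names are assumptions
oc -n openshift-storage delete cephfilesystem ocsinit-cephfilesystem
oc -n openshift-storage delete pod -l app=rook-ceph-mds
# Check the dashboard, then compare with ceph itself from the toolbox pod
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph -s
+++++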
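And a minimal sketch of the `limits` change mentioned in the additional info, assuming the StorageCluster is named ocs-storagecluster and exposes an `mds` key under spec.resources (both assumptions):
+++++
# Hypothetical fix for "0/3 nodes are available: 3 Insufficient cpu":
# lower the MDS CPU request/limit so the pods can be scheduled
oc -n openshift-storage patch storagecluster ocs-storagecluster --type merge \
  -p '{"spec":{"resources":{"mds":{"requests":{"cpu":"1"},"limits":{"cpu":"1"}}}}}'
+++++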
@Servesha, can you provide the requested info?
@Nishanth, here is the needed info:

> What was the Health Status before deleting? (Maybe it was already broken?)

Before deleting `ocsinit-cephfilesystem`, all pods except the ceph-mds pods (which were Pending) were up and running, so the ceph health status was `HEALTH_WARN`.

> Please check if your rook-ceph-mgr pod is running. Also, provide rook-operator logs.

The ceph-mds pods were not running at that time; they were in the Pending state. Unfortunately, I do not have the rook-operator logs at this point, since that setup has been deleted.

> Did deleting the said resources cause deletion of any other resources?

The notable deleted resources were the two ceph-mds pods, after `ocsinit-cephfilesystem` was deleted. The dashboard then showed `rook-ceph unavailable`. Other than that, everything was fine.
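For a future occurrence, a sketch of the commands that would capture the requested info (the openshift-storage namespace, the app=rook-ceph-mgr label, and the rook-ceph-operator deployment name are assumptions):
+++++
oc -n openshift-storage get pods -l app=rook-ceph-mgr
oc -n openshift-storage logs deploy/rook-ceph-operator > rook-operator.log
+++++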
I am unable to reproduce this. Deleting or recreating the CephFilesystem did not affect monitoring at all; ceph-mgr is actively talking to Prometheus.
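For anyone retrying this, one hedged way to confirm the mgr is still exporting metrics: the prometheus mgr module and its default port 9283 are standard ceph-mgr behavior, but the rook-ceph-mgr service name and the namespace are assumptions:
+++++
# List enabled mgr modules (prometheus should be among them)
oc -n openshift-storage rsh deploy/rook-ceph-tools ceph mgr module ls
# Find the mgr service, then curl http://<cluster-ip>:9283/metrics from inside the cluster
oc -n openshift-storage get svc rook-ceph-mgr
+++++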
Works for me, and there are no further instructions to replicate the issue. Closing this.