Bug 2293632

Summary: ODF cephblockpool warning status not informative in UI nor raised in ceph tools
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Sarah Bennert <sbennert>
Component: management-console
Assignee: Nishanth Thomas <nthomas>
Status: NEW
QA Contact: Prasad Desala <tdesala>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.15
CC: amohan, badhikar, nthomas, odf-bz-bot, sbennert, skatiyar
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
Embargoed:
Attachments:
  OCP UI Cluster Overview (flags: none)
  OCP UI Data Foundation dashboard overview (flags: none)
  OCP UI Data Foundation StorageSystem overview dashboard (flags: none)
  OCP UI Data Foundation ocs-storagecluster-cephblockpool details (flags: none)

Description Sarah Bennert 2024-06-21 12:07:44 UTC
Description of problem (please be as detailed as possible and provide log snippets):

While testing a CNV upgrade, I found that ODF did not raise a warning at a high enough visibility level, leaving the user to dig through the UI and CLI to determine the cause.


From the UI:

Storage shows green on the cluster overview.
The Data Foundation dashboard shows green for both Data Foundation and StorageSystem.
The StorageSystem overview shows green.
The CephBlockPool details page shows 'Ready', but displays a warning triangle with no other feedback (a CLI view of the same pool state is sketched below).
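
For reference, a minimal sketch of pulling the pool state out of the CephBlockPool CR from the CLI, assuming the standard openshift-storage namespace and the status.phase field exposed by the Rook CephBlockPool CRD:

$ oc -n openshift-storage get cephblockpool ocs-storagecluster-cephblockpool -o jsonpath='{.status.phase}{"\n"}'
$ oc -n openshift-storage describe cephblockpool ocs-storagecluster-cephblockpool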

From ceph-tools:

sh-5.1$ ceph -s
  cluster:
    id:     97e1d345-532a-490e-8cec-16b51ce7d36d
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum b,c,d (age 4d)
    mgr: a(active, since 4d), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 3 osds: 3 up (since 4d), 3 in (since 5d)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 265 pgs
    objects: 394.11k objects, 1.3 TiB
    usage:   3.0 TiB used, 1.4 TiB / 4.4 TiB avail
    pgs:     265 active+clean
 
  io:
    client:   73 MiB/s rd, 479 MiB/s wr, 19.18k op/s rd, 11.70k op/s wr
 

sh-5.1$ ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
ssd    4.4 TiB  1.4 TiB  3.0 TiB   3.0 TiB      67.85
TOTAL  4.4 TiB  1.4 TiB  3.0 TiB   3.0 TiB      67.85
 
--- POOLS ---
POOL                                                   ID  PGS    STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                                                    1    1   769 KiB        2  2.3 MiB      0    256 GiB
ocs-storagecluster-cephblockpool                        2  128  1007 GiB  393.69k  3.0 TiB  79.77    256 GiB
.rgw.root                                               3    8   5.8 KiB       16  180 KiB      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.meta             4    8   5.3 KiB       17  152 KiB      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec   5    8       0 B        0      0 B      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.control          6    8       0 B        8      0 B      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.otp              7    8       0 B        0      0 B      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.log              8    8   327 KiB      340  2.8 MiB      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.buckets.index    9    8   3.1 KiB       11  9.4 KiB      0    256 GiB
ocs-storagecluster-cephfilesystem-metadata             10   16   303 KiB       22  996 KiB      0    256 GiB
ocs-storagecluster-cephfilesystem-data0                11   32       0 B        0      0 B      0    256 GiB
ocs-storagecluster-cephobjectstore.rgw.buckets.data    12   32     1 KiB        1   12 KiB      0    256 GiB


sh-5.1$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE   DATA      OMAP     META     AVAIL    %USE   VAR   PGS  STATUS
 1    ssd  1.45549   1.00000  1.5 TiB  1011 GiB  1007 GiB  918 KiB  4.1 GiB  479 GiB  67.87  1.00  265      up
 0    ssd  1.45549   1.00000  1.5 TiB  1011 GiB  1007 GiB  918 KiB  3.7 GiB  479 GiB  67.84  1.00  265      up
 2    ssd  1.45549   1.00000  1.5 TiB  1011 GiB  1007 GiB  915 KiB  4.0 GiB  479 GiB  67.86  1.00  265      up
                       TOTAL  4.4 TiB   3.0 TiB   3.0 TiB  2.7 MiB   12 GiB  1.4 TiB  67.86                   
MIN/MAX VAR: 1.00/1.00  STDDEV: 0.01


On a separate test cluster (4.15), I purposely maxed out the storage and brought ODF back from a read-only state, deleting the excess VMs in the process. The warning icon displayed on the CephBlockPool details page in the UI disappeared when the ocs-storagecluster-cephblockpool %USED, as reported by ceph df, came back down to around 75% (75.24%). I imagine the percentage is not calculated the same way in the UI as in ceph. If the warning state is correct, it should be bubbled further up in the UI and in the ceph-tools health status.
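
Based on the ceph df output above, the pool %USED appears to track STORED / (STORED + MAX AVAIL): here 1007 GiB / (1007 GiB + 256 GiB) ≈ 79.7%, matching the reported 79.77%, while %RAW USED is measured against the cluster's raw capacity (67.85%). A sketch for pulling the per-pool figure directly, assuming jq is available next to the ceph CLI and the JSON field names used by recent Ceph releases:

sh-5.1$ ceph df -f json | jq -r '.pools[]
          | select(.name == "ocs-storagecluster-cephblockpool")
          | "\(.stats.stored) \(.stats.max_avail) \(.stats.percent_used)"'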

Version of all relevant components (if applicable):

4.15

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

No. The user impact is a lack of notification of the warning state, either in the UI or via the ceph tools.

Is there any workaround available to the best of your knowledge?

The workaround is to reduce CephBlockPool usage.
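
A sketch of how to find what to trim, run from the ceph tools pod; nothing here is assumed beyond the pool name already shown in this report:

sh-5.1$ rbd du -p ocs-storagecluster-cephblockpool
# Lists provisioned vs. used size per RBD image; the largest images are the obvious cleanup candidates.
# Actual cleanup should go through deleting the owning PVCs/VMs on the OpenShift side, not rbd rm.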


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

1

Is this issue reproducible?

Yes

Can this issue be reproduced from the UI?

Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCP/ODF
2. Raise Ceph block pool usage above 75% and observe that no warning is triggered (one generic way to drive usage up is sketched below)
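
The original scenario filled the pool with CNV VM disks; a generic alternative, assuming a disposable test cluster, is to write benchmark objects straight into the pool from the ceph tools pod:

sh-5.1$ rados -p ocs-storagecluster-cephblockpool bench 600 write --no-cleanup
# Repeat or lengthen the run until `ceph df` shows the pool above ~75% USED.
sh-5.1$ rados -p ocs-storagecluster-cephblockpool cleanup
# Removes the benchmark objects afterwards.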


Actual results:

A single warning icon buried in the UI with no other information, and HEALTH_OK from the ceph tools.

Expected results:

The warning state is obvious to the user, either via the UI or as HEALTH_WARN from the ceph tools.
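
For context on the HEALTH_OK side: Ceph's cluster-level nearfull/full warnings key off raw OSD utilization ratios (nearfull_ratio typically defaults to 0.85), not per-pool %USED, which would explain HEALTH_OK at ~68% raw while the block pool sits around 80%. This is an assumption about the thresholds in play here; the configured values can be checked from the ceph tools pod with:

sh-5.1$ ceph osd dump | grep -i ratio
# Shows full_ratio, backfillfull_ratio and nearfull_ratio.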

Additional info:

Comment 3 Sarah Bennert 2024-06-21 12:08:35 UTC
Created attachment 2037922 [details]
OCP UI Cluster Overview

Comment 4 Sarah Bennert 2024-06-21 12:09:29 UTC
Created attachment 2037923 [details]
OCP UI Data Foundation dashboard overview

Comment 5 Sarah Bennert 2024-06-21 12:10:22 UTC
Created attachment 2037924 [details]
OCP UI Data Foundation StorageSystem overview dashboard

Comment 6 Sarah Bennert 2024-06-21 12:11:17 UTC
Created attachment 2037925 [details]
OCP UI Data Foundation ocs-storagecluster-cephblockpool details

Comment 7 Sunil Kumar Acharya 2024-06-24 08:09:49 UTC
Moving this non-blocker BZ out of ODF-4.16.0. If this is a blocker, feel free to propose it back with a justification note.