Bug 1948378 - Alert 'ClusterObjectStoreState' is not triggered when RGW interface is unavailable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: OCS 4.8.0
Assignee: Anmol Sachan
QA Contact: Filip Balák
URL:
Whiteboard:
Duplicates: 1953615
Depends On:
Blocks: 1962161
Reported: 2021-04-12 06:24 UTC by Sravika
Modified: 2021-08-03 18:16 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.`ClusterObjectStoreState` alert message is generated when the RADOS Object Gateway (RGW) is unavailable or unhealthy
Previously, the `ClusterObjectStoreState` alert message was not generated when the RADOS Object Gateway (RGW) was unavailable or unhealthy. With a fix implemented in the OpenShift Container Storage operator, users now see the `ClusterObjectStoreState` alert when the RGW is unavailable or unhealthy.
Clone Of:
Clones: 1962161
Environment:
Last Closed: 2021-08-03 18:15:56 UTC
Embargoed:


Attachments
ocs-ci testcase log (13.58 KB, application/zip)
2021-04-12 06:24 UTC, Sravika
Must Gather Logs (5.10 MB, application/zip)
2021-04-12 06:56 UTC, Sravika


Links
System ID Private Priority Status Summary Last Updated
Github openshift ocs-operator pull 1174 0 None open fix ClusterObjectStoreState alert not listing in Prometheus Rules 2021-05-02 15:48:52 UTC
Github openshift ocs-operator pull 1196 0 None open Bug 1948378: [release-4.8] fix ClusterObjectStoreState Alert empty spec 2021-06-01 10:45:11 UTC
Red Hat Product Errata RHBA-2021:3003 0 None None None 2021-08-03 18:16:39 UTC

Internal Links: 2144532

Description Sravika 2021-04-12 06:24:20 UTC
Created attachment 1771272 [details]
ocs-ci testcase log

Description of problem (please be as detailed as possible and provide log
snippets):

During the ocs-ci tier4a tests, the following test fails because the "ClusterObjectStoreState" alert is not generated when the RGW interface is unavailable:

"tests/manage/monitoring/prometheus/test_rgw.py::test_rgw_unavailable"

Version of all relevant components (if applicable):

OCP - 4.7.3
OCS - 4.7.0-344.ci
ceph version 14.2.11-143.el8cp (ab503edb1421ce443f12917d9a75d5b56334dfea) nautilus (stable)
OCS-CI : commit 0d371476e5949ecc118ab3fad142889ef4ccb860

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

The tier4a test execution results in failures

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install OCS 4.7.0-344.ci.
2. Scale the deployment rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a down to zero replicas:

oc -n openshift-storage scale --replicas=0 deployment/rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a

3. Check for the "ClusterObjectStoreState" alert, which should be generated when the RGW interface is unavailable.
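
For reference, the check in step 3 could be scripted roughly as follows. This is only a sketch, not the ocs-ci implementation: it assumes the alert list has already been fetched from the Prometheus HTTP API endpoint /api/v1/alerts (how to reach and authenticate against the cluster's Prometheus is left out), and the helper name find_alerts and the sample payload are made up for illustration.

```python
import json


def find_alerts(payload, alertname, states=("pending", "firing")):
    """Return alerts from a Prometheus /api/v1/alerts payload that match
    the given alert name and are in one of the given states."""
    alerts = payload.get("data", {}).get("alerts", [])
    return [
        alert
        for alert in alerts
        if alert.get("labels", {}).get("alertname") == alertname
        and alert.get("state") in states
    ]


# Made-up sample payload for illustration; not taken from the attached logs.
SAMPLE_PAYLOAD = json.loads("""
{
  "status": "success",
  "data": {
    "alerts": [
      {"labels": {"alertname": "ClusterObjectStoreState"}, "state": "firing"},
      {"labels": {"alertname": "SomeOtherAlert"}, "state": "firing"}
    ]
  }
}
""")

if __name__ == "__main__":
    matches = find_alerts(SAMPLE_PAYLOAD, "ClusterObjectStoreState")
    print("alert present" if matches else "alert missing")
```

With the behaviour reported here, the payload returned after scaling down the RGW deployment would contain no ClusterObjectStoreState entry, so the helper would return an empty list.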


Actual results:

No alert is generated.

Expected results:

The alert should be generated when the RGW interface is unavailable.

Additional info:

Comment 2 Sravika 2021-04-12 06:56:27 UTC
Created attachment 1771275 [details]
Must Gather Logs

Comment 3 Sravika 2021-04-26 13:19:46 UTC
One additional note: the Object Gateway is not supported on IBM Z in OCS 4.7; however, this alert was generated in OCS 4.6.2 when the RGW interface was unavailable.

Comment 4 Anmol Sachan 2021-05-04 15:07:59 UTC
*** Bug 1953615 has been marked as a duplicate of this bug. ***

Comment 6 Mudit Agarwal 2021-05-19 11:54:04 UTC
Sure, please create a clone of the BZ

Comment 9 Mudit Agarwal 2021-06-01 10:44:14 UTC
Not backported to 4.8 yet.

Comment 12 Filip Balák 2021-06-11 13:34:54 UTC
Alert is correctly triggered in version ocs-operator.v4.8.0-409.ci --> VERIFIED

Comment 13 Olive Lakra 2021-07-09 04:28:38 UTC
@Mudit - Please review the revised doc text and share feedback

Comment 14 Mudit Agarwal 2021-07-12 06:16:34 UTC
LGTM, thanks

Comment 16 errata-xmlrpc 2021-08-03 18:15:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Container Storage 4.8.0 container images bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3003

