Bug 1888614 - [External] Unreachable monitoring-endpoint used during deployment causes ocs-operator to crash
Alias: None
Product: Red Hat OpenShift Container Storage
Classification: Red Hat
Component: ocs-operator
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: OCS 4.6.0
Assignee: Anmol Sachan
QA Contact: Rachael
Duplicates: 1889604
Depends On:
Blocks: 1889604
Reported: 2020-10-15 11:10 UTC by Rachael
Modified: 2021-06-01 08:43 UTC (History)

Fixed In Version: 4.6.0-142.ci
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2020-12-17 06:24:47 UTC
Target Upstream Version:

System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ocs-operator pull 848 | 0 | None | closed | Bug 1888614: Changed the execution of connection closing, to avoid closing uncreated connection. | 2021-01-14 02:52:20 UTC
Red Hat Product Errata | RHSA-2020:5605 | 0 | None | None | None | 2020-12-17 06:25:07 UTC

Description Rachael 2020-10-15 11:10:36 UTC
Description of problem (please be as detailed as possible and provide log snippets):

When OCS 4.6 is deployed in external mode with a monitoring-endpoint that is not reachable, the ocs-operator crashes and the pod goes into CrashLoopBackOff:

$ oc get pods
NAME                                   READY   STATUS             RESTARTS   AGE
noobaa-operator-58958fb578-82xfw       1/1     Running            0          7m7s
ocs-metrics-exporter-b6ddd6869-n8nll   1/1     Running            0          7m7s
ocs-operator-6f48c6bff5-s48rg          0/1     CrashLoopBackOff   5          7m7s
rook-ceph-operator-8467f78647-4j56z    1/1     Running            0          7m7s

Version of all relevant components (if applicable):
OCP: 4.6.0-0.nightly-2020-10-14-095718

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Yes. Even when the endpoint is not valid, the operator should handle the error instead of crashing.

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Run the exporter script on the RHCS cluster and pass an unreachable IP as the monitoring endpoint
Eg: # python3 exporter46.py --rgw-endpoint --rbd-data-pool-name ocs-cbp-46-rg --run-as-user client.ocs46 --monitoring-endpoint --monitoring-endpoint-port 9283

2. Upload the JSON generated from the script and deploy OCS in external mode
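
To confirm up front that the monitoring endpoint really is unreachable (or to check it before a normal deployment), a plain TCP probe is enough. The following is a minimal sketch, not part of the exporter script or the operator; the function name `endpoint_reachable` and the 3-second default timeout are assumptions made for illustration.

```python
import socket


def endpoint_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unresolvable)
        # on failure, so the with-block only closes a socket that was opened.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the reproducer above, this would be called with the address passed to --monitoring-endpoint and port 9283; a result of False matches the unreachable case described in this bug.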

Actual results:
The ocs-operator crashes

Expected results:
The ocs-operator should handle this error gracefully
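
The linked pull request is summarized as changing the connection closing "to avoid closing uncreated connection". The operator is written in Go and its actual code is not shown in this report, so the following is only a Python analogue of that general pattern, under the assumption that the crash came from closing a connection that was never created. The names `probe_buggy`, `probe_fixed`, and the `dial` parameter are hypothetical.

```python
from typing import Callable, Optional


def probe_buggy(dial: Callable[[], object]) -> None:
    """Illustrative bug: close is attempted even if dial() never succeeded."""
    conn = None
    try:
        conn = dial()
    finally:
        # If dial() raised, conn is still None and this line raises
        # AttributeError, which replaces the original connection error.
        conn.close()


def probe_fixed(dial: Callable[[], object]) -> Optional[str]:
    """Return None on success, or an error message if the dial failed."""
    try:
        conn = dial()
    except OSError as exc:
        # Nothing was created, so there is nothing to close.
        return f"monitoring endpoint unreachable: {exc}"
    # The connection exists from here on, so closing it is safe.
    conn.close()
    return None
```

In the fixed variant the close happens only after the connection is known to exist, and the failure is reported to the caller as an ordinary error value instead of terminating the process.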

Comment 3 Mudit Agarwal 2020-10-15 11:53:08 UTC
This needs to be fixed, providing dev_ack.

Comment 6 Mudit Agarwal 2020-10-20 07:41:28 UTC
Backport PR is not yet merged.

Comment 7 Mudit Agarwal 2020-10-20 07:43:00 UTC
*** Bug 1889604 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2020-12-17 06:24:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.6.0 security, bug fix, enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

