Description of problem
======================
When an unreachable monitoring-endpoint is provided during OCS deployment in external mode, the ocs-operator logs an error message just once.

"level":"error","ts":"2020-10-23T08:03:09.344Z","logger":"controller_storagecluster","msg":"Monitoring Endpoint (1.2.3.4:9283) is not reachable","Request.Namespace":"openshift-storage","Request.Name":"ocs-external-storagecluster","error":"dial tcp 1.2.3.4:9283: i/o timeout","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/remote-source/app/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/openshift/ocs-operator/pkg/controller/storagecluster.validateMonitoringEndpoint\n\t/remote-source/app/pkg/controller/storagecluster/external_resources.go:398\ngithub.com/openshift/ocs-operator/pkg/controller/storagecluster

It would be good to have these error messages logged for each reconcile, to make debugging of the issue easier.

Raising the bug based on: https://bugzilla.redhat.com/show_bug.cgi?id=1888614#c9

Version of all relevant components
==================================
ocs-operator.v4.6.0-142.ci

Does this issue impact your ability to continue to work with the product?
=========================================================================
No

Is there any workaround?
========================
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
========================================
1

Is this issue reproducible?
===========================
Yes

Can this issue reproduce from the UI?
=====================================

If this is a regression
=======================
No

Steps to Reproduce
==================
1. Deploy an external mode cluster using an unreachable monitoring-endpoint
2. Check ocs-operator logs

Actual results
==============
The error message is logged once

Expected results
================
Error messages should be logged for each reconcile

Additional info
===============
Logs available here: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/1888614/verification/
This may make sense, but it is not critical for the product. Moving to OCS 4.7.
This is still not critical for the product, though it should be done soon. Moving to OCS 4.8.
Not critical enough to go into 4.8. Definitely fixing this for OCS 4.9, so providing devel_ack+. The scope of the fix would be clear logs for the monitoring endpoint. A larger refactor is outside the scope of this BZ.
At this point we believe that this has been fixed over the course of a few PRs, and should already be in the DS builds. Moving to ON_QA.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:5086