Bug 1891098 - Configure "ceph health detail" to run periodically and log output to cluster log.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.2
Assignee: Prashant Dhange
QA Contact: Pawan
Docs Contact: Amrita
URL:
Whiteboard:
Depends On:
Blocks: 1890121
 
Reported: 2020-10-23 19:02 UTC by Christina Meno
Modified: 2021-06-09 16:16 UTC
CC List: 17 users

Fixed In Version: ceph-14.2.11-79.el8cp, ceph-14.2.11-79.el7cp
Doc Type: Enhancement
Doc Text:
.Ceph health details are logged in the cluster log
Previously, the cluster log did not include the Ceph health details, so it was difficult to determine the root cause of an issue. With this release, the Ceph health details are logged in the cluster log, which makes it possible to review issues that arise in the cluster.
Clone Of:
Environment:
Last Closed: 2021-01-12 14:58:09 UTC
Embargoed:


Attachments: None


Links
Ceph Project Bug Tracker 48042 - 2020-10-29 16:41:14 UTC
Github ceph/ceph pull 37902 (closed): mon: Log "ceph health detail" periodically in cluster log - 2021-01-25 09:38:38 UTC
Github ceph/ceph pull 38118 (closed): nautilus: mon: Log "ceph health detail" periodically in cluster log - 2021-01-25 09:38:39 UTC
Red Hat Product Errata RHSA-2021:0081 - 2021-01-12 14:58:33 UTC

Description Christina Meno 2020-10-23 19:02:10 UTC
Description of problem:
We don't have detailed cluster health/sanity information available for customers and support to review when problems with the cluster arise.

Version-Release number of selected component (if applicable):
4.1
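
For reference, the detail being requested is what "ceph health detail" already prints on demand; this RFE is about getting that level of detail into the cluster log as well. A minimal illustration (the output below is illustrative only; the actual checks and counts depend on cluster state):

  $ ceph health detail
  HEALTH_WARN 1 osds down; Degraded data redundancy: 32 pgs degraded
  OSD_DOWN 1 osds down
      osd.3 is down since epoch 214
  PG_DEGRADED Degraded data redundancy: 32 pgs degraded
      pg 2.1a is active+undersized+degraded, acting [1,4]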

Comment 1 RHEL Program Management 2020-10-23 19:02:17 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 3 Prashant Dhange 2020-10-30 06:11:56 UTC
Should we log health detail on health check failure as well as every mon_health_to_clog_interval? Logging health detail too frequently does not make sense, as it will make the cluster log grow rapidly when the cluster is unhealthy.
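
For context, periodic health logging to the cluster log is governed by existing mon options. A quick sketch of inspecting and adjusting them at runtime (the values shown are examples, not recommendations):

  $ ceph config get mon mon_health_to_clog
  $ ceph config get mon mon_health_to_clog_interval
  $ ceph config set mon mon_health_to_clog_interval 3600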

Comment 5 Vikhyat Umrao 2020-10-30 11:26:43 UTC
(In reply to Prashant Dhange from comment #3)
> Should we log health detail on health check failure as well as every
> mon_health_to_clog_interval? Logging health detail too frequently does not
> make sense, as it will make the cluster log grow rapidly when the cluster
> is unhealthy.

I think the idea is to log health detail while the cluster is in a health warn/err state. Maybe logging every mon_health_to_clog_interval is not necessary? Let's wait for Neha's input.

Comment 6 Neha Ojha 2020-10-30 16:39:23 UTC
(In reply to Prashant Dhange from comment #3)
> Should we log health detail on health check failure as well as every
> mon_health_to_clog_interval? Logging health detail too frequently does not
> make sense, as it will make the cluster log grow rapidly when the cluster
> is unhealthy.

I don't think we need to log every mon_health_to_clog_interval as well. Let's discuss implementation details in the PR.
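
Once the change lands, one way to verify the behavior is to watch the cluster log while a health warning is active. A sketch, assuming the default mon log location (the exact wording of the logged detail depends on the final implementation in the PR):

  $ ceph -w                        (watch cluster log entries live)
  $ ceph log last 50               (dump the most recent cluster log entries)
  $ less /var/log/ceph/ceph.log    (on a mon host)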

Comment 7 Yaniv Kaul 2020-11-10 20:43:42 UTC
devel-ack+ please?

Comment 18 errata-xmlrpc 2021-01-12 14:58:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0081

