1679609 – RFE: ceph-dashboard should be able to send SNMP trap upon change of cluster status

Keywords:
Status:	CLOSED DUPLICATE of bug 1259160
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	Cephadm
Sub Component:
Version:	3.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	5.1
Assignee:	Paul Cuzner
QA Contact:	Sunil Angadi
Docs Contact:	Karen Norteman
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-02-21 14:41 UTC by Matthias Muench
Modified:	2022-01-27 10:21 UTC
CC List:	14 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-01-18 13:57:21 UTC
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	52708	None	None	None	2021-10-06 16:40:26 UTC
Github	ceph ceph pull 43274	None	open	monitoring:Adding the Ceph MIB	2021-10-06 16:40:26 UTC
Red Hat Issue Tracker	RHCEPH-1676	None	None	None	2021-09-13 07:50:14 UTC

Description Matthias Muench 2019-02-21 14:41:54 UTC

Description of problem:
Integrating Ceph with existing enterprise monitoring tools requires, at a minimum, the ability to send an SNMP trap to an SNMP trap destination server (or, ideally, to a list of several).

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
Integration of Ceph with existing enterprise monitoring systems is not possible because no SNMP traps are generated on status changes.


Expected results:
At a minimum, a change in cluster health from healthy to any other state should generate an SNMP trap that is sent to a list of configured SNMP trap destination servers.


Additional info:
An alternative would be to implement this at the ceph-mon/mgr level; however, that would require individual configuration for every Ceph cluster. Using the dashboard as the central point of monitoring could provide either a single configuration for all clusters (changes for all clusters reported to the same destinations, as an initial solution) or a more sophisticated setup that assigns separate SNMP trap destinations to different clusters, so that traps can be routed according to responsibilities within the organisation.
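
The requested behaviour (a trap only when a cluster leaves the healthy state, with optional per-cluster destinations) could be sketched as follows. This is a minimal illustration, not an existing Ceph or dashboard API: `TrapRequest`, `traps_for_transition`, the `routes` mapping and the host names in the usage note are all hypothetical, and the sketch only decides which traps to send; it does not emit SNMP itself. The status strings are the health states Ceph reports.

```python
from typing import Dict, List, NamedTuple

# Ceph reports overall health as HEALTH_OK, HEALTH_WARN or HEALTH_ERR.
HEALTHY = "HEALTH_OK"


class TrapRequest(NamedTuple):
    """Hypothetical record describing one trap that should be sent."""
    destination: str  # "host:port" of an SNMP trap receiver (assumed format)
    cluster: str
    old_status: str
    new_status: str


def traps_for_transition(
    cluster: str,
    old_status: str,
    new_status: str,
    routes: Dict[str, List[str]],
    default_destinations: List[str],
) -> List[TrapRequest]:
    """Return the traps to send for one health transition.

    A trap is requested only when the cluster moves from healthy to any
    other state. Clusters without their own entry in `routes` fall back
    to `default_destinations` (the "one config for all" case).
    """
    if old_status != HEALTHY or new_status == HEALTHY:
        return []
    destinations = routes.get(cluster, default_destinations)
    return [
        TrapRequest(dest, cluster, old_status, new_status)
        for dest in destinations
    ]
```

For example, with `routes = {"prod": ["nms-a.example.com:162"]}`, a transition of cluster `prod` from `HEALTH_OK` to `HEALTH_WARN` yields a single request for `nms-a.example.com:162`, while any cluster not listed in `routes` uses the default destination list.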

Comment 1 Ernesto Puerta 2019-03-27 17:30:02 UTC

An approach explored in the past consisted of:
-  ceph-mgr ==> Prometheus exporter ==> Prometheus ==> Prometheus AlertManager ==> HTTP Webhook API ==> Prometheus SNMPTrapper Webhook (https://github.com/chrusty/prometheus_webhook_snmptrapper)

However, the latter project has shown no activity for the past two years. On the other hand, another webhook integration (https://github.com/maxwo/snmp_notifier) was released recently. Both rely on the Net-SNMP stack.

That said, Ceph-Dashboard is not strictly required for this. However, the current upstream approach is to expose AlertManager in Dashboard, so technically we could reserve a place there for the UI.

Pros:
- No code changes required in Ceph, as long as all metrics to be sent as 'traps' are already exported to Prometheus.
- Prometheus and Alertmanager are already building blocks.
- No major reliability caveats, given that SNMP traps should not be relied on alone anyway when reliability is a key concern.

Cons:
- Complexity is moved to the deployment/configuration stage.
- No FOSS license assessment has been performed on those projects yet.
- Both projects seem to have marginal community adoption/response (little or no record of issue or bug-fix activity), which leaves a big question mark over code and SNMP implementation quality.
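
The last hop of the pipeline described at the top of this comment (Alertmanager webhook to SNMP trap) could be sketched roughly as below. This is only an illustration of the idea, not the code of either linked project: `alerts_to_traps` and `PLACEHOLDER_TRAP_NAME` are hypothetical, the trap name is a made-up label rather than a registered OID, and the sketch stops at building trap records instead of sending them through an SNMP stack. It assumes the standard Alertmanager webhook JSON body, which carries an `alerts` list whose entries have `status`, `labels` and `annotations`.

```python
import json
from typing import Dict, List

# Hypothetical placeholder label, not a real MIB object or registered OID.
PLACEHOLDER_TRAP_NAME = "EXAMPLE-MIB::exampleCephAlertTrap"


def alerts_to_traps(payload: str) -> List[Dict[str, str]]:
    """Convert an Alertmanager webhook body into a list of trap records.

    Each alert in the payload becomes one record. Missing fields are
    replaced with neutral defaults so that a partial alert still
    produces a trap record.
    """
    body = json.loads(payload)
    traps = []
    for alert in body.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        traps.append({
            "trap": PLACEHOLDER_TRAP_NAME,
            "alert_name": labels.get("alertname", "unknown"),
            "severity": labels.get("severity", "unknown"),
            "status": alert.get("status", "unknown"),
            "description": annotations.get("description", ""),
        })
    return traps
```

A real receiver would expose this behind an HTTP endpoint configured as an Alertmanager webhook receiver and hand each record to an SNMP library or the Net-SNMP tools; both steps are left out here to keep the sketch self-contained.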

Comment 3 Giridhar Ramaraju 2019-08-20 06:58:03 UTC

Level-setting the severity of this defect to "High" with a bulk update. Please refine it to a closer value, as defined by the severity definition in https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity

Comment 12 Sebastian Wagner 2022-01-18 12:47:41 UTC

Backport PR: https://github.com/ceph/ceph/pull/44529

Comment 13 Sebastian Wagner 2022-01-18 13:57:21 UTC


*** This bug has been marked as a duplicate of bug 1259160 ***
