Bug 1745607 - Backport to Nautilus: Add mgr module for kubernetes event integration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Mgr Plugins
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 4.0
Assignee: Boris Ranto
QA Contact: Sidhant Agrawal
URL:
Whiteboard:
Depends On:
Blocks: 1745617
 
Reported: 2019-08-26 13:36 UTC by Anmol Sachan
Modified: 2020-03-11 22:13 UTC (History)

Fixed In Version: ceph-14.2.4-14.el8cp, ceph-14.2.4-2.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-31 12:47:06 UTC
Embargoed:




Links
Ceph Project Bug Tracker 41435 (last updated 2019-08-26 13:55:17 UTC)
Github ceph/ceph pull 29520 (closed): mgr/k8sevents: Add mgr module for kubernetes event integration (last updated 2020-05-11 12:17:50 UTC)
Github ceph/ceph pull 30215 (closed): nautilus: mgr/k8sevents: Initial ceph -> k8s events integration (last updated 2020-05-11 12:17:50 UTC)
Red Hat Product Errata RHBA-2020:0312 (last updated 2020-01-31 12:47:27 UTC)

Internal Links: 1745617

Description Anmol Sachan 2019-08-26 13:36:48 UTC
Description of problem:

mgr/k8sevents: Add new mgr module for kubernetes event integration

This new mgr module provides a means of sending ceph-related events to the kubernetes
events API, as well as retrieving all kubernetes events from the rook-ceph namespace.
Events may be viewed with the ceph k8sevents family of commands, which produce output
at the ceph CLI similar to that of the native kubernetes client's kubectl get events command.

Since the events are cached by the module, it would also be possible to expose them to
other mgr modules, e.g. the dashboard.
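
As an illustration, here is a minimal usage sketch; the exact subcommand names and the rook-ceph namespace are assumptions drawn from the description above rather than from this specific build:

# Enable the module, then view the cached events from the ceph CLI.
ceph mgr module enable k8sevents
ceph k8sevents ls

# Roughly equivalent view from a native kubernetes client.
kubectl get events -n rook-ceph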

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Yaniv Kaul 2019-09-15 13:52:59 UTC
The backport was merged more than a week ago. What's the latest on this BZ?

Comment 5 Boris Ranto 2019-09-16 18:33:38 UTC
The back-port hasn't been merged upstream, yet. In fact, it is currently in a DNM state because it is awaiting a fix for the k8sevents module. I did a partial back-port downstream to make some progress though.

Comment 7 Yaniv Kaul 2019-09-18 06:27:02 UTC
(In reply to Boris Ranto from comment #5)
> The back-port hasn't been merged upstream, yet. In fact, it is currently in
> a DNM state because it is awaiting a fix for the k8sevents module. I did a
> partial back-port downstream to make some progress though.

What fix is it waiting for? Please add the issue here so we can track it in one place (https://bugzilla.redhat.com/show_bug.cgi?id=1745617 is waiting for this BZ, so I'm trying to track the whole chain of deps...)

Comment 8 Boris Ranto 2019-09-18 11:49:06 UTC
The upstream nautilus back-port PR is in the DNM state:

https://github.com/ceph/ceph/pull/30215

It is waiting for a fix for:

https://tracker.ceph.com/issues/41737

i.e. module currently crashes in non-k8s environments.

Comment 9 Yaniv Kaul 2019-09-23 06:44:50 UTC
(In reply to Boris Ranto from comment #8)
> The upstream nautilus back-port PR is in the DNM state:
> 
> https://github.com/ceph/ceph/pull/30215
> 
> It is waiting for a fix for:
> 
> https://tracker.ceph.com/issues/41737

PR - https://github.com/ceph/ceph/pull/30482 (easier to track by PR if available)

> 
> i.e. module currently crashes in non-k8s environments.

Comment 11 Boris Ranto 2019-10-15 19:55:05 UTC
There was some progress upstream, so I have cherry-picked the rest of the necessary commits for this BZ. However, we currently have an issue with our downstream builds: one of the back-ported patches (not for this BZ) introduced a build failure on s390x. The s390x build failure will need to be fixed before we can move this to ON_QA.

Comment 14 Raz Tamir 2019-11-06 07:41:34 UTC
Hi Anmol,

Could you please provide steps for testing this functionality?
(QE needs this in order to qa_ack this BZ)

Comment 15 Anmol Sachan 2019-11-06 08:25:48 UTC
pcuzner can provide the information as he developed the feature.

Comment 16 Paul Cuzner 2019-11-06 21:07:40 UTC
The merge is still pending upstream, but if we have it downstream in our container build, that's great.

Ultimately the module should be enabled by rook, but until that is done, you'll need to enable it manually through the tools pod:
ceph mgr module enable k8sevents

This registers a couple of commands:
ceph k8sevents status | ls | events

So once it's loaded, you can use those commands to check that it's operational. When storage classes get created, OSDs get added or removed, or a health check fails, you should see kubernetes events show up in the OCS dashboard.

Raz, if you can confirm the above, I'll start the process to auto-enable the module within rook.
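
For reference, a minimal verification sketch along the lines described above; the rook-ceph namespace and the rook-ceph-tools deployment name are assumptions about a typical Rook/OCS install:

# Enable the module from the toolbox pod.
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mgr module enable k8sevents

# Check that the module is loaded and tracking events.
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph k8sevents status
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph k8sevents ls

# Cross-check against the native kubernetes view of the same namespace.
kubectl -n rook-ceph get events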

Comment 18 Raz Tamir 2019-11-13 16:20:55 UTC
Hi Paul,

Is it just for testing?
I'm asking because the tools pod is not going to be part of OCS

Comment 24 Paul Cuzner 2019-11-22 00:05:59 UTC
The tools pod is not required for this. k8sevents is a mgr module, so as long as it's in the image we can enable it manually to test it, and then raise a PR for rook to automatically enable the module, like we do for prometheus.

Comment 25 Raz Tamir 2019-11-28 10:48:18 UTC
@Elad, can you please assign this for verification?

Comment 36 Eran Tamir 2020-01-15 08:34:37 UTC
verified

Comment 39 errata-xmlrpc 2020-01-31 12:47:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0312

