Bug 1313305 - Calamari must filter duplicate events before pushing it to salt event bus. [NEEDINFO]
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Calamari
Hardware: Unspecified OS: Unspecified
Priority: unspecified Severity: unspecified
Target Milestone: rc
Target Release: 2.2
Assigned To: Gregory Meno
Depends On:
Blocks: 1291304
Reported: 2016-03-01 06:20 EST by Darshan
Modified: 2017-01-05 13:41 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2017-01-05 13:41:23 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
gmeno: needinfo? (mkarnik)

Attachments: None
Description Darshan 2016-03-01 06:20:00 EST
Description of problem:
Some events related to OSD, mon, and cluster state changes are emitted from all the calamari-lite instances. Since USM will be listening to multiple calamari-lite instances for events, Calamari must send each event (push it to the salt event bus) from only one instance. Which instance sends the event can be decided based on whether it resides on the leader mon node.
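
A minimal sketch of the leader-based gating described above, not the actual Calamari code. It assumes the instance can obtain the JSON output of `ceph quorum_status --format json` (which includes a `quorum_leader_name` field) and knows its own mon name. The function names `is_leader_mon` and `push_if_leader`, the `publish` callable, and the sample event are hypothetical and only for illustration.

```python
import json


def is_leader_mon(quorum_status_json, local_mon_name):
    """Return True if local_mon_name is the quorum leader in the given JSON."""
    status = json.loads(quorum_status_json)
    return status.get("quorum_leader_name") == local_mon_name


def push_if_leader(event, quorum_status_json, local_mon_name, publish):
    """Call publish(event) only on the leader mon; return whether it was sent.

    `publish` stands in for whatever pushes the event to the salt event bus.
    """
    if not is_leader_mon(quorum_status_json, local_mon_name):
        return False
    publish(event)
    return True


if __name__ == "__main__":
    # Hypothetical three-mon cluster in which mon-a is the quorum leader.
    sample = json.dumps({
        "quorum_names": ["mon-a", "mon-b", "mon-c"],
        "quorum_leader_name": "mon-a",
    })
    sent = []
    for name in ("mon-a", "mon-b", "mon-c"):
        push_if_leader({"tag": "ceph/osd/state", "osd": 3}, sample, name, sent.append)
    print(len(sent))
```

With this gate in place, every calamari-lite instance still observes the state change, but only the one on the leader mon publishes it. A leader election between the check and the publish could still cause a missed or duplicated event, so a listener may want to tolerate that.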

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Have a Ceph setup with multiple mon nodes (hence multiple calamari-lite instances).
2. Simulate an OSD state change event.

Actual results:
The same event is sent from multiple calamari-lite instances to the salt event bus.

Expected results:
A single event should be sent from only one calamari-lite instance to the salt event bus.

Additional info:
Comment 2 Gregory Meno 2016-04-06 17:30:18 EDT
I believe that we should address this issue in a different way.
calamari-lite running on all ceph monitors will be a risk to cluster stability and data integrity.

I think that the storage-console should choose a single monitor to enable calamari on in the first release. That way, when the inevitable happens, we only lose management temporarily and do not trigger data loss.

If we organize it this way, event filtering won't be needed until we design calamari to be HA.

Mrugesh, what do you think about this approach as risk mitigation?
