Bug 1028090 - Implement event queue in the Indication manager
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openlmi-providers
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Vitezslav Crhonek
QA Contact: qe-baseos-daemons
Depends On:
Blocks: 1026663 1033026
Reported: 2013-11-07 11:09 EST by Tomáš Bžatek
Modified: 2017-08-03 03:26 EDT
Type: Bug
Description Tomáš Bžatek 2013-11-07 11:09:02 EST
The Indication manager we use as a helper/wrapper needs some kind of queue where incoming events are sent to await further processing. This is to ensure an uninterrupted event flow. With the current consecutive watcher() and gather() parts, there's a high chance of missing some events, because watcher() is not always running.

Additionally, when events are received in batches in some buffer, all of them should be fully processed. With no queue, the rest of the buffer is thrown away, because the watcher needs to end and return success.
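
A minimal sketch of such a queue follows, assuming POSIX threads. The names (event_t, event_queue_t, queue_push_batch, queue_pop) are hypothetical and not taken from the Indication manager; the int payload stands in for whatever data a real provider would carry. The producer pushes every event of a received batch, so nothing in the buffer is dropped, and the consumer blocks until an event is available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical event record; a real provider would carry CMPI data here. */
typedef struct event {
    int id;
    struct event *next;
} event_t;

/* FIFO shared by the producer (watcher side) and consumer (gather side). */
typedef struct {
    event_t *head, *tail;
    size_t len;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
} event_queue_t;

void queue_init(event_queue_t *q)
{
    q->head = q->tail = NULL;
    q->len = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Producer: queue every event of a batch. Returns how many were queued,
 * which is less than n only if memory allocation fails. */
size_t queue_push_batch(event_queue_t *q, const int *ids, size_t n)
{
    size_t queued = 0;

    assert(q != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Allocate outside the lock so the lock is held only briefly. */
        event_t *ev = malloc(sizeof(*ev));
        if (ev == NULL)
            break;
        ev->id = ids[i];
        ev->next = NULL;

        pthread_mutex_lock(&q->lock);
        if (q->tail != NULL)
            q->tail->next = ev;
        else
            q->head = ev;
        q->tail = ev;
        q->len++;
        pthread_cond_signal(&q->nonempty);
        pthread_mutex_unlock(&q->lock);
        queued++;
    }
    return queued;
}

/* Consumer: block until an event is available, then remove and return it. */
int queue_pop(event_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->nonempty, &q->lock);
    event_t *ev = q->head;
    q->head = ev->next;
    if (q->head == NULL)
        q->tail = NULL;
    q->len--;
    pthread_mutex_unlock(&q->lock);

    int id = ev->id;
    free(ev);
    return id;
}
```

With this shape the watcher thread can keep running and only ever calls queue_push_batch(), while a separate thread drains the queue with queue_pop(). Note that in this sketch the consumer frees what the producer allocated, which is fine for plain memory but would not be acceptable for CMPI instances.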

The third requirement is a kind of rate limiting: merging multiple equal events of the same type or for the same object into one when they are sent within a short (user-defined) interval. This will naturally cause a slight delay, which may be desired for some applications; think of it as a settle timeout.
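
One way the settle timeout could work is sketched below. Everything here is an assumption made for illustration: the names (coalescer_t, coalescer_add, coalescer_pop_settled), the fixed-size table, and the choice to treat two events as equal when both type and object match. Each duplicate restarts the timer, and an entry is released only once no duplicate has arrived for settle_ms milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 64

typedef struct {
    int type;              /* hypothetical event type code */
    char object[64];       /* hypothetical object identifier */
    int64_t last_seen_ms;  /* time of the most recent duplicate */
    unsigned merged;       /* number of raw events merged into this entry */
    bool used;
} pending_t;

typedef struct {
    pending_t slots[MAX_PENDING];
    int64_t settle_ms;     /* user-defined settle interval */
} coalescer_t;

void coalescer_init(coalescer_t *c, int64_t settle_ms)
{
    assert(c != NULL && settle_ms >= 0);
    memset(c, 0, sizeof(*c));
    c->settle_ms = settle_ms;
}

/* Record a raw event. An event equal to a pending one is merged into it and
 * restarts its settle timer. Returns false only when the table is full. */
bool coalescer_add(coalescer_t *c, int type, const char *object, int64_t now_ms)
{
    pending_t *free_slot = NULL;

    for (size_t i = 0; i < MAX_PENDING; i++) {
        pending_t *p = &c->slots[i];
        if (p->used && p->type == type &&
            strncmp(p->object, object, sizeof(p->object) - 1) == 0) {
            p->last_seen_ms = now_ms;
            p->merged++;
            return true;
        }
        if (!p->used && free_slot == NULL)
            free_slot = p;
    }
    if (free_slot == NULL)
        return false;

    free_slot->used = true;
    free_slot->type = type;
    strncpy(free_slot->object, object, sizeof(free_slot->object) - 1);
    free_slot->object[sizeof(free_slot->object) - 1] = '\0';
    free_slot->last_seen_ms = now_ms;
    free_slot->merged = 1;
    return true;
}

/* Remove one entry that has been quiet for at least settle_ms and copy it
 * into *out. Returns false when nothing has settled yet. */
bool coalescer_pop_settled(coalescer_t *c, int64_t now_ms, pending_t *out)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        pending_t *p = &c->slots[i];
        if (p->used && now_ms - p->last_seen_ms >= c->settle_ms) {
            *out = *p;
            p->used = false;
            return true;
        }
    }
    return false;
}
```

Restarting the timer on every duplicate matches the "settle" reading, but a continuous stream of equal events would then never be delivered. If that matters, a cap on the total delay per entry could be added.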

As a side effect, the reworked watcher() should deal better with errors when a critical issue prevents it from working. Some kind of self-recovery would be nice too.
Comment 1 Tomáš Bžatek 2013-11-07 11:09:58 EST
This currently blocks, e.g., bug 1026663 from being fully race-free.
Comment 3 Tomáš Bžatek 2013-11-26 06:23:17 EST
Further debugging showed there are more rules we should obey to ensure thread safety and play nicely with memory allocations:

 - any CMPI call that manipulates instances should be made from the same thread, i.e. don't free allocated instances from a thread other than the one that collected them.
 - hold locks for as short a time as possible: operate on local variables, then fill shared memory with a quick operation while the lock is held
 - perform proper thread shutdown and cleanup: free all data, unlock all locks and properly detach threads from the CIMOM. No forced thread cancellation.
 - handle the scenario where new filters are registered while worker threads are already running (for the first poll case)
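
The locking rule above can be sketched as follows, assuming POSIX threads and a single publishing thread. The names (snapshot_t, shared_state_t, publish_snapshot) are hypothetical. All slow work happens on local variables; the lock is held only for a pointer swap, and the old data is freed after the lock is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *items;
    size_t count;
} snapshot_t;

typedef struct {
    pthread_mutex_t lock;
    snapshot_t current;   /* shared between threads, guarded by lock */
} shared_state_t;

void shared_state_init(shared_state_t *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->current.items = NULL;
    s->current.count = 0;
}

/* Slow part, no lock held: build the new snapshot in local memory. */
static int build_snapshot(snapshot_t *out, const int *src, size_t n)
{
    out->items = NULL;
    out->count = 0;
    if (n == 0)
        return 0;
    out->items = malloc(n * sizeof(*out->items));
    if (out->items == NULL)
        return -1;
    memcpy(out->items, src, n * sizeof(*out->items));
    out->count = n;
    return 0;
}

/* Publish a new snapshot. Returns 0 on success, -1 if allocation fails,
 * in which case the shared state is left untouched. */
int publish_snapshot(shared_state_t *s, const int *src, size_t n)
{
    snapshot_t fresh, old;

    assert(s != NULL);
    if (build_snapshot(&fresh, src, n) != 0)
        return -1;

    /* Quick part: only the swap happens while the lock is held. */
    pthread_mutex_lock(&s->lock);
    old = s->current;
    s->current = fresh;
    pthread_mutex_unlock(&s->lock);

    /* Free outside the lock. With a single publisher, the data is freed
     * by the same thread that allocated it. */
    free(old.items);
    return 0;
}
```

Readers would take the same lock only long enough to copy what they need out of current. They must not keep the items pointer after unlocking, since the next publish frees it.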

 - implement a clean way to cancel worker threads from outside. The idea was to use poll()/select() with a side fd acting as a cancellation channel, similar to how GCancellable works. This should be exposed to the watcher/gather callbacks, and indmanager users should integrate it into their code.
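
The side-fd idea could look roughly like this, using a POSIX pipe and poll(). The names (cancel_t, cancel_trigger, wait_for_event) are hypothetical and are not the indmanager API. A worker polls its own event fd together with the read end of the pipe; writing one byte to the pipe wakes it up. The byte is never read back, so the read end stays readable and every worker polling it sees the cancellation.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

typedef struct {
    int fds[2];   /* fds[0]: read end polled by workers, fds[1]: write end */
} cancel_t;

typedef enum { WAIT_EVENT, WAIT_CANCELLED, WAIT_TIMEOUT, WAIT_ERROR } wait_result_t;

int cancel_init(cancel_t *c)
{
    assert(c != NULL);
    return pipe(c->fds);
}

/* Called from outside the worker thread to request cancellation. */
int cancel_trigger(cancel_t *c)
{
    char byte = 1;
    ssize_t n;

    do {
        n = write(c->fds[1], &byte, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}

void cancel_destroy(cancel_t *c)
{
    close(c->fds[0]);
    close(c->fds[1]);
}

/* Wait until event_fd is readable, cancellation is requested, or the
 * timeout expires. A pending cancellation wins over a pending event. */
wait_result_t wait_for_event(int event_fd, const cancel_t *c, int timeout_ms)
{
    struct pollfd pfds[2] = {
        { .fd = event_fd,  .events = POLLIN },
        { .fd = c->fds[0], .events = POLLIN },
    };
    int rc;

    /* Note: retrying after EINTR restarts the full timeout. */
    do {
        rc = poll(pfds, 2, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return WAIT_ERROR;
    if (rc == 0)
        return WAIT_TIMEOUT;
    if (pfds[1].revents & (POLLIN | POLLHUP | POLLERR))
        return WAIT_CANCELLED;
    if (pfds[0].revents & POLLIN)
        return WAIT_EVENT;
    return WAIT_ERROR;
}
```

A worker loop would call wait_for_event() instead of a bare blocking read, and on WAIT_CANCELLED it would free its data, release its locks and return normally, so no forced thread cancellation is needed.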
Comment 4 Tomáš Bžatek 2013-12-03 11:07:45 EST
Note to myself: GCancellable principle explained: http://blog.verbum.org/2013/12/03/cancelling-computation-gcancellable-or-sigint-versus-threads-versus-exceptions/
Comment 5 Tomáš Bžatek 2014-01-06 11:05:31 EST
- be sure to check the error codes and rc statuses of CMPI calls
