Bug 1093948 - Traits and call time measurements stored in Cassandra, RHQ storage node
Status: NEW
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Nobody
Reported: 2014-05-03 19:21 UTC by Elias Ross
Modified: 2022-03-31 04:28 UTC (History)
Doc Type: Bug Fix

Attachments
Patch for master (commit c53bdea0) (219.23 KB, patch)
2014-05-05 14:46 UTC, Elias Ross
Updated patch with some bug fixes (217.30 KB, patch)
2014-05-06 20:31 UTC, Elias Ross

Description Elias Ross 2014-05-03 19:21:25 UTC
This is a placeholder for an upcoming patch, once the lawyers look it over. Here are the notes so far:

---

Support for metric traits and call times in Cassandra storage

Traits are stored in one table, call times in another.

There is a secondary index on call name in Cassandra. (It is unclear whether
this is really needed, but it is required for filtered queries. Unfortunately,
the UI only does partial-match searches, which an equality-only secondary
index cannot answer directly.)

TTL is used for expiry. No more need to purge this data hourly.
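
As a rough illustration of TTL-based expiry, the sketch below builds a CQL insert that carries its own TTL. The table name, the column names (schedule_id, time, value) and the helper itself are assumptions made for this example, not the schema or code from the patch.

```java
import java.util.concurrent.TimeUnit;

final class TraitCql {
    /**
     * Builds a parameterized CQL insert whose rows expire after ttlDays.
     * Table and column names are hypothetical placeholders.
     */
    static String insertWithTtl(String table, int ttlDays) {
        if (ttlDays <= 0) {
            throw new IllegalArgumentException("ttlDays must be positive");
        }
        // CQL expects the TTL in seconds.
        long ttlSeconds = TimeUnit.DAYS.toSeconds(ttlDays);
        return "INSERT INTO " + table
            + " (schedule_id, time, value) VALUES (?, ?, ?) USING TTL " + ttlSeconds;
    }
}
```

Because Cassandra attaches the TTL to each write, rows written under an earlier TTL keep their original expiry even if the configured value changes later.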

Migration tooling added but not really tested.

Things yet to do:

Paging and custom sorting don't work at all. Supporting sorting is a lot of
work; paging is less.

The UI shows the latest trait value along with the timestamp when the data was
last reported, not when it last changed. To display this properly, the entire
history of a schedule needs to be read, but that might be too many round trips
for this one piece of data. The UI could probably drop this field, or simply
retrieve all of the history.
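
If the UI did retrieve the full history, the "last changed" time could be derived on the client side. The sketch below assumes the history for one schedule has already been loaded into a map from timestamp to value; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.Objects;

final class TraitHistory {
    /**
     * Returns the timestamp at which the current (latest) value first
     * appeared in the trailing run of identical values, or null if the
     * history is empty. Keys are report timestamps, values are trait values.
     */
    static Long lastChanged(NavigableMap<Long, String> history) {
        if (history.isEmpty()) {
            return null;
        }
        String current = history.lastEntry().getValue();
        Long changedAt = history.lastKey();
        // Walk backwards from the newest entry until the value differs.
        for (Map.Entry<Long, String> e : history.descendingMap().entrySet()) {
            if (!Objects.equals(e.getValue(), current)) {
                break;
            }
            changedAt = e.getKey();
        }
        return changedAt;
    }
}
```

Note that if older entries have already expired or been purged, the result is only a lower bound on how long the value has been unchanged.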

Possible issues:

Data paging is hard to do in Cassandra. Since custom sorting isn't supported,
the entire result set needs to be retrieved and sorted in memory. (This is what
a 'real database' does in memory.)
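
A minimal sketch of the sort-in-memory approach described above, assuming the full result set has already been fetched into a list. The helper name and its zero-based page numbering are assumptions for this example, not part of RHQ's PageControl API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class InMemoryPager {
    /**
     * Sorts a copy of the full result set and returns one page of it.
     * pageNumber is zero-based; an out-of-range page yields an empty list.
     */
    static <T> List<T> sortAndPage(List<T> all, Comparator<? super T> order,
                                   int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("invalid page request");
        }
        List<T> sorted = new ArrayList<>(all);
        sorted.sort(order);
        // Use long arithmetic so large page numbers cannot overflow.
        int from = (int) Math.min((long) pageNumber * pageSize, sorted.size());
        int to = (int) Math.min((long) from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

The cost is that every page request pays for fetching and sorting the whole result set, which is the main reason this is listed as a possible issue.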

Cassandra does have some support for ranges on hashed fields, but we would need
to build infrastructure around it. PageControl could possibly hold this token.
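
One way token-based paging might look, using the CQL token() function on the partition key. This is only a sketch: the table and key names are placeholders, and the idea of carrying the last seen token in PageControl is the open suggestion from the note above, not existing behavior.

```java
final class TokenPaging {
    /**
     * Builds a CQL query for one page of partitions in token order.
     * For pages after the first, the caller binds the last token seen on
     * the previous page to the first placeholder and the page size to the
     * LIMIT placeholder. Table and key names are hypothetical.
     */
    static String pageQuery(String table, String partitionKey, boolean firstPage) {
        String cql = "SELECT " + partitionKey + ", token(" + partitionKey + ") FROM " + table;
        if (!firstPage) {
            cql += " WHERE token(" + partitionKey + ") > ?";
        }
        return cql + " LIMIT ?";
    }
}
```

Results come back in token order, not in any user-visible sort order, so this helps with paging but not with custom sorting. If a partition holds several rows, a LIMIT that cuts a partition in half would also need handling, since resuming at the next token skips the remainder.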

Duplicate history items are purged weekly. Traits are usually reported every
1 to 24 hours, so even if a trait doesn't change, the history table accumulates
duplicates. With a small look-back window there are few duplicates to process;
the window is about 8 days.
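
The duplicate purge could be sketched roughly as follows, assuming the history for one schedule is available as a map from timestamp to value, sorted by time. The method name and the choice to return timestamps for deletion are assumptions for illustration; the patch may do this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.SortedMap;

final class TraitDedup {
    /**
     * Returns timestamps of entries that repeat the immediately preceding
     * value and fall inside the look-back window (timestamp >= windowStart).
     * The first entry of each run of equal values is always kept.
     */
    static List<Long> duplicates(SortedMap<Long, String> history, long windowStart) {
        List<Long> toDelete = new ArrayList<>();
        boolean havePrevious = false;
        String previous = null;
        for (Map.Entry<Long, String> e : history.entrySet()) {
            boolean sameAsPrevious = havePrevious && Objects.equals(previous, e.getValue());
            if (sameAsPrevious && e.getKey() >= windowStart) {
                toDelete.add(e.getKey());
            }
            previous = e.getValue();
            havePrevious = true;
        }
        return toDelete;
    }
}
```

With a weekly run, windowStart would be roughly "now minus 8 days", so the one-day overlap covers entries near the boundary of the previous run.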

How can a runtime change of the TTL be supported? Currently the server must be
restarted to change the TTL, and older data is not changed. (The same issue
exists for metrics, though.)

TODOs:
* The UI requires extra data pulled from Oracle. Can fields (display name) be duplicated in the traits table?
* Documentation (of course)

Comment 1 Elias Ross 2014-05-05 14:46:16 UTC
Created attachment 892568 [details]
Patch for master (commit c53bdea0)

Please read the commit comments before applying.

It is not a complete fix and still needs more work, especially with regard to paging and sorting of results.

Comment 2 Elias Ross 2014-05-06 20:31:31 UTC
Created attachment 893002 [details]
Updated patch with some bug fixes

Comment 3 Elias Ross 2014-06-30 17:40:52 UTC
The fix was moved to GitHub:
https://github.com/rhq-project/rhq/pull/70

