Bug 1140849 - Make metrics index partitions configurable
Summary: Make metrics index partitions configurable
Keywords:
Status: NEW
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server, Storage Node
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHQ 4.13
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On: 1140905 1140906
Blocks:
 
Reported: 2014-09-11 20:19 UTC by John Sanda
Modified: 2022-03-31 04:28 UTC
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:




Links:
Red Hat Bugzilla 1135629 (ON_QA) - Wide rows in metrics index table can cause read timeouts (last updated 2022-03-31 04:27:59 UTC)

Internal Links: 1135629

Description John Sanda 2014-09-11 20:19:40 UTC
Description of problem:
For bug 1135629 we introduced the metrics_idx table. Previously, index partitions for a time slice could get very large, contributing to read timeouts. With the metrics_idx table, data for a single time slice is now stored across multiple partitions. 

Currently the number of partitions is hard coded. It should be configurable, though, because with a fixed number we can still wind up with "wide rows" that lead to performance problems, while a very large number of partitions chosen to avoid wide rows adds a lot of overhead to reads of the index.
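For illustration, here is a minimal sketch of how a schedule id could be mapped to an index partition, assuming a simple modulo-based assignment. The class and method names are hypothetical, and this is not necessarily how the RHQ code actually computes the partition:

public class IndexPartitionSketch {

    private final int numPartitions;

    public IndexPartitionSketch(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Maps a measurement schedule id to one of the index partitions.
    // Hypothetical; the real assignment function may differ.
    public int partitionFor(int scheduleId) {
        return scheduleId % numPartitions;
    }

    public static void main(String[] args) {
        // With 5 partitions schedule id 7 maps to partition 2, with 2 partitions to partition 1.
        System.out.println(new IndexPartitionSketch(5).partitionFor(7)); // prints 2
        System.out.println(new IndexPartitionSketch(2).partitionFor(7)); // prints 1
    }
}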

There is more work involved beyond exposing this as a configurable setting. When the number of partitions changes, we will need to rebuild the index. This should happen during the DataCalcJob, prior to running data aggregation, to make sure we do not miss any data that needs to be aggregated and to avoid doing duplicate work.

I will illustrate with an example. Suppose we have 5 partitions (0 - 4) and 10 schedule ids, 0 to 9. Here are the partition assignments:

schedule 
id(s)    | partition
---------------------------
0 - 5    | 0
6        | 1
7        | 2
8        | 3
9        | 4

Now let's say we want to reduce the number of partitions to two.

schedule 
id(s)    | partition
---------------------------
0 - 2    | 0
3        | 1
4        | 0
5        | 1
6        | 0
7        | 1
8        | 0
9        | 1

When we insert raw data for schedule id 7 while there are 5 partitions, schedule id 7 lands in partition 2 of the index. After decreasing the number of partitions, it will also show up in partition 1 once we insert more raw data for it. If we do not rebuild the index, we end up duplicating data aggregation for those schedules. By rebuilding the index, I mean making sure that schedule ids exist in the correct partitions before we start data aggregation. Schedule id 7, for example, should be removed from partition 2 and inserted into partition 1. Even though in my example it already exists in partition 1, we still go ahead and insert it, since writes are more efficient than reads in Cassandra. Similar logic is necessary when increasing the number of partitions.
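Here is a rough sketch of what the rebuild could look like, using an in-memory map in place of the metrics_idx table and the same modulo-based assignment assumed above. The types and method names are made up, for illustration only:

import java.util.*;

public class IndexRebuildSketch {

    // Rebuilds the index for a new partition count. oldIndex maps
    // partition -> schedule ids currently stored in that partition.
    public static Map<Integer, Set<Integer>> rebuild(Map<Integer, Set<Integer>> oldIndex,
                                                     int newPartitionCount) {
        Map<Integer, Set<Integer>> newIndex = new HashMap<>();
        for (Set<Integer> scheduleIds : oldIndex.values()) {
            for (Integer scheduleId : scheduleIds) {
                int newPartition = scheduleId % newPartitionCount;
                // Always insert, even if the entry might already be there;
                // in Cassandra a blind write is cheaper than a read-before-write check.
                newIndex.computeIfAbsent(newPartition, p -> new HashSet<>()).add(scheduleId);
            }
            // Against Cassandra, the rows in the old partition would be deleted
            // once their schedule ids have been re-inserted.
        }
        return newIndex;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> oldIndex = new HashMap<>();
        oldIndex.put(2, new HashSet<>(Arrays.asList(2, 7))); // entries written while there were 5 partitions
        // Rebuilding for 2 partitions moves schedule id 7 to partition 1.
        System.out.println(rebuild(oldIndex, 2)); // prints {0=[2], 1=[7]}
    }
}

In the real job this would run against the metrics_idx table rather than an in-memory map, as part of the DataCalcJob before aggregation starts.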

Things get more interesting in an HA environment with multiple servers. When the change is made on one server, there is no mechanism for propagating it to the other servers in the HA cloud. Short of adding a polling mechanism, it would be the user's responsibility to restart each server or reload its config for the change to take effect.
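If we did want to avoid restarts, one option (purely a sketch, with made-up names) would be for each server to periodically re-read the setting and flag an index rebuild when it changes:

import java.util.concurrent.*;
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

public class PartitionCountPoller {

    // readConfiguredPartitions would read the setting from wherever it is stored;
    // onChange would flag that the index needs to be rebuilt on the next DataCalcJob run.
    public static void start(IntSupplier readConfiguredPartitions, IntConsumer onChange) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        final int[] current = { readConfiguredPartitions.getAsInt() };
        scheduler.scheduleAtFixedRate(() -> {
            int latest = readConfiguredPartitions.getAsInt();
            if (latest != current[0]) {
                current[0] = latest;
                onChange.accept(latest);
            }
        }, 1, 1, TimeUnit.MINUTES);
    }
}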


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

