Bug 1126105

Summary: RHQ storage determineMostRecentRawDataSinceLastShutdown sub-optimal
Product: [Other] RHQ Project
Component: Core Server, Storage Node
Version: 4.12
Reporter: Elias Ross <genman>
Assignee: John Sanda <jsanda>
QA Contact: Mike Foley <mfoley>
CC: hrupp, jsanda
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Target Release: RHQ 4.13
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2014-09-11 16:00:07 UTC
Bug Blocks: 1133605

Description Elias Ross 2014-08-01 21:48:25 UTC
Description of problem:

The query 

            "SELECT bucket, day, partition, collection_time_slice, start_schedule_id, insert_time_slice, schedule_ids " +
            "FROM " + MetricsTable.METRICS_CACHE_INDEX + " " +
            "WHERE bucket = ? AND day = ? AND partition = ? AND collection_time_slice < ?");

is used to locate metrics needing compression. Unfortunately, it can potentially query a whole day's worth of data, including the schedule_ids column (is it needed?), and cause timeouts.
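For illustration only, here is a minimal sketch of how the read could be bounded, assuming the same DataStax 1.0.x driver API that appears in the stack trace below. This is not RHQ code: the class, method, and constant names are made up, the column types are inferred from the query above, and walking the day one collection_time_slice at a time with a LIMIT is just one possible way to avoid a single read covering the whole partition.

    import java.util.Date;

    import com.datastax.driver.core.BoundStatement;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    // Hypothetical sketch: query one time slice at a time with a capped LIMIT
    // instead of "collection_time_slice < ?" over the whole day.
    public class BoundedCacheIndexScan {

        private static final int PAGE_SIZE = 5000; // assumed cap per read

        private final PreparedStatement findSlice;

        public BoundedCacheIndexScan(Session session) {
            // Same columns as the original query, minus schedule_ids, restricted
            // to a single time slice and capped so no read spans the whole day.
            findSlice = session.prepare(
                "SELECT bucket, day, partition, collection_time_slice, " +
                "start_schedule_id, insert_time_slice " +
                "FROM metrics_cache_index " +
                "WHERE bucket = ? AND day = ? AND partition = ? " +
                "AND collection_time_slice = ? LIMIT " + PAGE_SIZE);
        }

        /** Visits index entries one time slice at a time since the last shutdown. */
        public void scan(Session session, String bucket, Date day, int partition,
                         Iterable<Date> timeSlicesSinceShutdown) {
            for (Date slice : timeSlicesSinceShutdown) {
                BoundStatement bound = findSlice.bind(bucket, day, partition, slice);
                ResultSet rows = session.execute(bound);
                for (Row row : rows) {
                    int startScheduleId = row.getInt("start_schedule_id");
                    // hand the entry off to aggregation/compression here
                }
            }
        }
    }

Each read is then bounded by PAGE_SIZE rows per time slice rather than by the size of the day's partition, at the cost of more round trips.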

Here are my cfstats, for example:

                Column Family: metrics_cache_index
                SSTable count: 13
...
                Compacted row minimum size: 771
                Compacted row maximum size: 268650950
                Compacted row mean size: 139821861

Unfortunately the rows (compressed) average around 140MB, and there is practically no way to query that much data, or so it would seem.

Version-Release number of selected component (if applicable): 4.12


How reproducible: Depends on size of data


Steps to Reproduce:
1. Have a number of Cassandra nodes
2. Insert ~500 metrics per second and have rows grow to ~100MB in size.
3. Take the server offline for a bit
4. Attempt to start the server

Actual results: timeouts, e.g.

21:47:49,557 WARN  [org.rhq.enterprise.server.storage.StorageClientManager] (pool-6-thread-1) Storage client subsystem wasn't initialized. The RHQ server will be set to MAINTENANCE mode. Please verify  that the storage cluster is operational.: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded)
	at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69) [cassandra-driver-core-1.0.5.jar:]
	at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:269) [cassandra-driver-core-1.0.5.jar:]
	at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:183) [cassandra-driver-core-1.0.5.jar:]
	at org.rhq.server.metrics.StorageResultSetFuture.get(StorageResultSetFuture.java:57) [rhq-server-metrics-4.12.0.jar:4.12.0]
	at org.rhq.server.metrics.MetricsServer.determineMostRecentRawDataSinceLastShutdown(MetricsServer.java:180) [rhq-server-metrics-4.12.0.jar:4.12.0]
	at org.rhq.server.metrics.MetricsServer.init(MetricsServer.java:160) [rhq-server-metrics-4.12.0.jar:4.12.0]
	at org.rhq.enterprise.server.storage.StorageClientManager.initMetricsServer(StorageClientManager.java:567) [rhq-server.jar:4.12.0]
	at org.rhq.enterprise.server.storage.StorageClientManager.init(StorageClientManager.java:186) [rhq-server.jar:4.12.0]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0]
 
Expected results: server functions

Additional info:

https://community.jboss.org/message/883334#883334

Comment 1 John Sanda 2014-08-06 01:50:03 UTC
The metrics_cache_index table does not contain any metric data. It stores schedule ids and two or three different timestamps. It is a replacement of the metrics_index table from pre-4.12. The concern Elias raises about the size of a partition is very valid. It was a concern with the former metrics_index table and even more so with the new metrics_cache_index table.

I think we might need to consider some changes to cap the size of the rows in the index table. Let me explain with an example for aggregating data from the current time slice. This example is applicable to both 4.12 and earlier versions of RHQ. Suppose we have N schedules with data to be aggregated. The index partition (from metrics_cache_index or from metrics_index) will contain N rows. We load all of those rows in a single query. As N gets really big, it can create hot spots on a node and make us more susceptible to read timeouts.

Since schedule ids are monotonically increasing integers, we can easily implement paging to reduce the likelihood of read timeouts. That does not address the issue of partitions being big, i.e., really wide rows. We could break up the single partition into multiple partitions where schedule id offsets are part of the partition key. This means more reads during aggregation but I think it can effectively prevent the problems that Elias is experiencing.
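A purely hypothetical sketch of that second idea follows. The table name metrics_cache_index_v2, the OFFSET_WIDTH value, and the schema in the comment are assumptions for illustration, not an actual RHQ schema change; the point is only to show schedule id offsets folded into the partition key so that aggregation reads several smaller partitions instead of one wide row.

    import java.util.Date;

    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    // Hypothetical sketch: schedule ids are grouped into fixed-width offset
    // buckets and the offset becomes part of the partition key.
    public class OffsetPartitionedIndex {

        // Assumed bucket width: schedules 0-9999 map to offset 0,
        // 10000-19999 to offset 10000, and so on.
        private static final int OFFSET_WIDTH = 10000;

        private final PreparedStatement findByOffset;

        public OffsetPartitionedIndex(Session session) {
            // Illustrative schema (not the real RHQ table):
            //
            //   CREATE TABLE metrics_cache_index_v2 (
            //       bucket text,
            //       day timestamp,
            //       schedule_id_offset int,
            //       collection_time_slice timestamp,
            //       start_schedule_id int,
            //       PRIMARY KEY ((bucket, day, schedule_id_offset),
            //                    collection_time_slice, start_schedule_id)
            //   );
            findByOffset = session.prepare(
                "SELECT collection_time_slice, start_schedule_id " +
                "FROM metrics_cache_index_v2 " +
                "WHERE bucket = ? AND day = ? AND schedule_id_offset = ? " +
                "AND collection_time_slice = ?");
        }

        static int offsetFor(int scheduleId) {
            return (scheduleId / OFFSET_WIDTH) * OFFSET_WIDTH;
        }

        /** More reads during aggregation, but each partition stays small. */
        public void readIndex(Session session, String bucket, Date day, Date timeSlice,
                              int maxScheduleId) {
            for (int offset = 0; offset <= maxScheduleId; offset += OFFSET_WIDTH) {
                ResultSet rows = session.execute(
                    findByOffset.bind(bucket, day, offset, timeSlice));
                for (Row row : rows) {
                    int scheduleId = row.getInt("start_schedule_id");
                    // schedule the corresponding raw data for aggregation here
                }
            }
        }
    }

Whether an offset width in the thousands or tens of thousands is appropriate would depend on how many schedules a deployment carries; the trade-off is exactly the one described above, more reads in exchange for bounded partition size.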

Comment 2 John Sanda 2014-09-11 16:00:07 UTC
I am closing this because the determineMostRecentRawDataSinceLastShutdown method has been removed as part of the work for bug 1114202.