Bug 1416850 - Cassandra gets OOM killed because of high tombstone cells [NEEDINFO]
Summary: Cassandra gets OOM killed because of high tombstone cells
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.3.1
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.3.1
Assignee: John Sanda
QA Contact: Peng Li
Duplicates: 1411427
Depends On:
Reported: 2017-01-26 15:27 UTC by Wesley Hearn
Modified: 2020-08-13 08:50 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Size-tiered compaction can be inefficient with time series data. SSTables are merged solely based on file size, which means expired data can get mixed with live data.
Consequence: Tombstones (i.e., deletion markers) can end up being scanned on read requests. This is not a problem in and of itself, but it slows down reads and increases the chance of an OutOfMemoryError, since the deletion markers are loaded into the Java heap.
Fix: Switch to time window compaction, which is tailored specifically to time series data. It merges SSTables in such a way that expired data is excluded.
Result: Reads, and Cassandra overall, are more stable.
Clone Of:
Last Closed: 2017-03-15 20:01:33 UTC
Target Upstream Version:
mwringe: needinfo? (mmahut)
jkaur: needinfo? (jsanda)

Attachments
cassandra log (25.00 KB, application/x-xz), 2017-01-26 15:27 UTC, Wesley Hearn
heapster log (19.89 KB, application/x-xz), 2017-01-26 15:28 UTC, Wesley Hearn
hawkular log (27.66 KB, application/x-xz), 2017-01-26 16:09 UTC, Wesley Hearn
cassandra2 log (32.91 KB, application/x-xz), 2017-01-26 16:11 UTC, Wesley Hearn
ls -l and sstablemetadata dump (129.60 KB, application/x-xz), 2017-02-06 17:30 UTC, Wesley Hearn

System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | HWKMETRICS-590 | 0 | Critical | Resolved | Cassandra is reading lots of tombstones on queries | 2020-05-27 16:59:53 UTC
Red Hat Product Errata | RHBA-2017:0512 | 0 | normal | SHIPPED_LIVE | OpenShift Container Platform,, and bug fix update | 2017-03-16 00:01:17 UTC

Description Wesley Hearn 2017-01-26 15:27:46 UTC
Created attachment 1244797 [details]
cassandra  log

Description of problem:
Cassandra gets OOM killed due to a high number of tombstone cells. As a result, Heapster loses its connection to Cassandra and gets stuck in a restart loop.

Version-Release number of selected component (if applicable):

How reproducible:
Always in our staging environment.

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 Wesley Hearn 2017-01-26 15:28:12 UTC
Created attachment 1244799 [details]
heapster log

Comment 2 Matt Wringe 2017-01-26 15:44:22 UTC
@wesley: can you please attach the Cassandra and Hawkular Metrics logs?

Comment 3 John Sanda 2017-01-26 15:48:02 UTC
Can you also please provide the following:

* Data retention used

* Date ranges for queries
I would like to know if it is for the past hour, past day, past week, etc.

* Output of `nodetool tablestats hawkular_metrics` when the OOME happens

Comment 4 Wesley Hearn 2017-01-26 16:09:39 UTC
Created attachment 1244804 [details]
hawkular log

* Data retention used

  - It is the default, which I believe is 7 days.

* Date ranges for queries

  - On the web console it is a max of 1 week.

Comment 5 Wesley Hearn 2017-01-26 16:11:01 UTC
Created attachment 1244805 [details]
cassandra2 log

Comment 7 John Sanda 2017-01-27 14:06:19 UTC
I should have a patch to test later today.

Comment 8 John Sanda 2017-01-31 03:26:59 UTC
Can you provide the output of the following:

for f in /cassandra_data/data/hawkular_metrics/data-*/*Data.db; do meta=$(/opt/apache-cassandra/tools/bin/sstablemetadata $f); echo -e "Max:" $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" "  -f3| cut -c 1-10) '+%m/%d/%Y') "Min:" $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" "  -f3| cut -c 1-10) '+%m/%d/%Y') $(echo "$meta" | grep droppable) ' \t ' $(ls -lh $f | awk '{print $5" "$6" "$7" "$8" "$9}'); done | sort

This will help in configuring the new compaction strategy.

Comment 9 John Sanda 2017-01-31 04:25:29 UTC
We are running Cassandra 2.2.7 in OpenShift 3.3.1. Time Window Compaction Strategy (TWCS) first shipped in later versions of Cassandra, but we can still use it in 2.2.7 by placing the TWCS jar file in the Cassandra lib directory. Once that is done, we can change the compaction strategy with CQL commands.
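
For illustration, the CQL change could look roughly like the sketch below. The class name is the one reported later in this bug, and the keyspace and table (hawkular_metrics.data) are inferred from the SSTable paths. The helper name twcs_cql and the window unit and size are assumptions made for the example, not values taken from this bug.

```shell
#!/bin/sh
# Sketch only: prints an ALTER TABLE statement that switches the
# hawkular_metrics.data table to the backported TWCS class.
# twcs_cql is a hypothetical helper; the default window (1 day) is
# illustrative and not a value recommended in this bug.
twcs_cql() {
  unit="${1:-DAYS}"   # compaction_window_unit
  size="${2:-1}"      # compaction_window_size
  printf "ALTER TABLE hawkular_metrics.data WITH compaction = {'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_unit': '%s', 'compaction_window_size': '%s'};\n" "$unit" "$size"
}

# To apply it on a Cassandra node (connection and SSL options omitted,
# since they depend on the deployment):
#   cqlsh -e "$(twcs_cql DAYS 1)"
```

The statement only succeeds if the TWCS jar is already on the Cassandra classpath, so the jar has to be in place before the table is altered.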

Matt, can you assist with creating a new image that includes the TWCS jar? To be clear, I am not asking to backport any changes. Right now I want to make the change only in Wesley's environment, in the hope that it makes things more stable.

Comment 10 Wesley Hearn 2017-01-31 14:43:34 UTC
Max: 01/26/2017 Min: 01/19/2017 Estimated droppable tombstones: 0.8471775259322556  	  494M Jan 26 13:00 /cassandra_data/data/hawkular_metrics/data-5d696540dd9611e6a550c1486de5a810/lb-295-big-Data.db
Max: 01/30/2017 Min: 01/26/2017 Estimated droppable tombstones: 0.0  	  801M Jan 30 00:08 /cassandra_data/data/hawkular_metrics/data-5d696540dd9611e6a550c1486de5a810/lb-708-big-Data.db
Max: 01/30/2017 Min: 01/30/2017 Estimated droppable tombstones: 0.0  	  198M Jan 30 23:48 /cassandra_data/data/hawkular_metrics/data-5d696540dd9611e6a550c1486de5a810/lb-809-big-Data.db
Max: 01/31/2017 Min: 01/30/2017 Estimated droppable tombstones: 0.0  	  57M Jan 31 09:26 /cassandra_data/data/hawkular_metrics/data-5d696540dd9611e6a550c1486de5a810/lb-838-big-Data.db

Comment 11 Wesley Hearn 2017-01-31 14:44:23 UTC
Not sure why it cleared his needinfo flag.

Comment 12 John Sanda 2017-02-04 03:47:54 UTC
I need to verify the data retention or TTL being used, because if it is less than the default of seven days, then console queries going back a week will be scanning tombstones. Wesley, for each of the Data.db files, can you run `/opt/apache-cassandra/tools/bin/sstablemetadata` on it and also provide the file creation time? From that I will be able to determine the data retention.

Comment 13 Wesley Hearn 2017-02-06 17:30:06 UTC
Created attachment 1248093 [details]
ls -l and sstablemetadata dump

Comment 23 Matt Wringe 2017-02-17 21:45:32 UTC
*** Bug 1411427 has been marked as a duplicate of this bug. ***

Comment 30 Troy Dawson 2017-02-21 15:29:21 UTC
This should be fixed with images
or newer.  These images should be in all regular testing areas.
Attaching this to errata.

Comment 37 Troy Dawson 2017-02-23 21:39:26 UTC
I'm very sorry.  The images were built but were not pushed to the testing areas (registry.ops).
They have been pushed, and I have verified that they are there now.

# docker pull registry.ops.openshift.com/openshift3/metrics-hawkular-metrics:3.3.1-4
Trying to pull repository registry.ops.openshift.com/openshift3/metrics-hawkular-metrics ... 
3.3.1-4: Pulling from registry.ops.openshift.com/openshift3/metrics-hawkular-metrics
239425a20f14: Already exists 
019908b75ec4: Already exists 
0deb2bff8875: Already exists 
b9187e9d6fd8: Already exists 
7cac29ec0f61: Already exists 
6cc45390c873: Already exists 
Digest: sha256:2f5c0f826f39cd3607ef726e901dd129df1d625743a7c52f89da40bf129be3b6
Status: Downloaded newer image for registry.ops.openshift.com/openshift3/metrics-hawkular-metrics:3.3.1-4
# docker pull registry.ops.openshift.com/openshift3/metrics-cassandra:3.3.1-3
Trying to pull repository registry.ops.openshift.com/openshift3/metrics-cassandra ... 
3.3.1-3: Pulling from registry.ops.openshift.com/openshift3/metrics-cassandra
7bd78273b666: Pull complete 
c196631bd9ac: Pull complete 
c18565cb9832: Pull complete 
759980c6d702: Pull complete 
3a8066aceffb: Pull complete 
Digest: sha256:2f6b6f05d8421949d64ebfd00cf0afc970985531ec50c696f1ed2103d9c89f1f
Status: Downloaded newer image for registry.ops.openshift.com/openshift3/metrics-cassandra:3.3.1-3

Comment 38 Peng Li 2017-02-27 02:37:07 UTC
Verified that the latest 3.3.1 Metrics images now use TWCS.

21:29:54,776 INFO  [org.hawkular.metrics.schema.SchemaService] (metricsservice-lifecycle-thread) The compaction strategy for the data table has been updated to com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy
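
A check like this can be scripted by searching the Hawkular Metrics log for the SchemaService message above. The following is a minimal sketch; the helper name log_reports_twcs and the pod name placeholder are assumptions for the example.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and succeeds only if it
# contains the SchemaService message reporting the switch to TWCS.
log_reports_twcs() {
  grep -q 'compaction strategy for the data table has been updated to .*TimeWindowCompactionStrategy'
}

# Example usage (the pod name is a placeholder, not taken from this bug):
#   oc logs <hawkular-metrics-pod> | log_reports_twcs && echo "TWCS in use"
```

Note that this only confirms that the schema update was logged. It does not inspect the table definition in Cassandra itself.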

registry.ops.openshift.com/openshift3/metrics-deployer                                3.3.1               d79a58d52ca8        3 days ago          759.2 MB
registry.ops.openshift.com/openshift3/metrics-cassandra                               3.3.1               6d3670affa15        9 days ago          533.1 MB
registry.ops.openshift.com/openshift3/metrics-hawkular-metrics                        3.3.1               306b85a45f53        9 days ago          1.772 GB
registry.ops.openshift.com/openshift3/metrics-heapster                                3.3.1               8234c1028f0f        11 days ago         277.8 MB

Comment 44 errata-xmlrpc 2017-03-15 20:01:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

