Bug 1015628 - Enable compression of storage node data
Summary: Enable compression of storage node data
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server, Performance, Storage Node
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: RHQ 4.10
Assignee: John Sanda
QA Contact: Mike Foley
Depends On:
Blocks: 1011084 1065652
Reported: 2013-10-04 16:07 UTC by John Sanda
Modified: 2014-04-23 12:31 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1065652 (view as bug list)
Last Closed: 2014-04-23 12:31:46 UTC


Description John Sanda 2013-10-04 16:07:19 UTC
Description of problem:
When RHQ 4.8 was released, snappy compression was used on storage node data files if snappy was supported on the platform. In RHQ 4.9, however, we removed all native components from the storage node, including snappy-java, and disabled compression as part of those changes. The type of compression used is pluggable, and Cassandra provides three compressors out of the box: snappy, lz4, and zlib (via java.util.zip). Re-enabling compression should significantly reduce the footprint of data on disk and can also improve read/write performance.

zlib makes sense because it is already provided by Cassandra and the JRE, so we know it will be supported across all platforms on which we run. It may still be worth looking at other pure Java solutions for comparison. I believe there is a 100% Java port of the snappy library, and lz4-java has a mode that is pure Java as well.
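
To illustrate why zlib is available everywhere a JRE is, here is a minimal sketch of a compress/decompress round trip using only java.util.zip. It is not how Cassandra wires up its compressor; the class name ZlibRoundTrip and the sample data in main are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, for illustration only; not part of RHQ or Cassandra.
class ZlibRoundTrip {

    // Compress a byte array with the JRE's built-in zlib (default level).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release the underlying zlib state
        }
    }

    // Decompress a zlib stream produced by compress().
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or invalid zlib stream");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt zlib stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Made-up, repetitive sample data; real metric data will compress differently.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("schedule_id=").append(i % 10).append(",value=42.0;");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        System.out.println("raw=" + raw.length + " compressed=" + packed.length
                + " roundTripOk=" + Arrays.equals(raw, decompress(packed)));
    }
}
```

The actual ratio depends entirely on the data, so the numbers printed by this sketch say nothing about what to expect for storage node data files.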

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:

Additional info:

Comment 1 John Sanda 2013-12-04 15:22:10 UTC
Starting in C* 2.0, lz4 is the default compression, implemented with the lz4-java (https://github.com/jpountz/lz4-java) library. lz4-java provides three implementations: one using native libraries via JNI, a Java port that uses the sun.misc.Unsafe API, and a pure Java one. I have started work to re-enable compression using the LZ4Compressor.

For the same reasons we ditched snappy-java, we want to avoid the JNI impl. We do not want to support platform-specific libraries. In a local branch, I have made build changes to strip out the native libraries from lz4-java. The Unsafe API is available in both Oracle and OpenJDK JREs. IBM JREs can fall back to using the pure Java impl.
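
As a rough sketch of the fallback idea described above, the following probes whether sun.misc.Unsafe is usable on the running JRE and picks a label accordingly. lz4-java performs its own selection internally (as far as I know via LZ4Factory.fastestJavaInstance()), so this is not its code; the class name JavaImplProbe and the "unsafe"/"safe" labels are assumptions made for illustration.

```java
import java.lang.reflect.Field;

// Hypothetical probe, for illustration only; lz4-java does its own detection.
class JavaImplProbe {

    // Returns true if sun.misc.Unsafe can be loaded and its singleton obtained.
    static boolean unsafeAvailable() {
        try {
            Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
            Field singleton = unsafeClass.getDeclaredField("theUnsafe");
            singleton.setAccessible(true);
            return singleton.get(null) != null;
        } catch (ReflectiveOperationException | RuntimeException | LinkageError e) {
            // Class missing, field missing, or access denied on this JRE.
            return false;
        }
    }

    // Prefer the Unsafe-based Java port, otherwise fall back to pure Java.
    static String chooseImpl() {
        return unsafeAvailable() ? "unsafe" : "safe";
    }

    public static void main(String[] args) {
        System.out.println("java.vendor=" + System.getProperty("java.vendor")
                + " impl=" + chooseImpl());
    }
}
```

Either way no JNI library is involved, which is the point of stripping the native components from the jar.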

I have already tested a heterogeneous JRE deployment where one node was running IBM Java and another was running OpenJDK. Everything worked as expected. It is worth noting that we have internode compression disabled; so on that basis alone, I would expect the heterogeneous JRE deployment to work. 

I also tested switching a node from OpenJDK to IBM. This resulted in errors such as:

ERROR [CompactionExecutor:8] 2013-11-27 10:09:24,855 CassandraDaemon.java (line 192) Exception in thread Thread[CompactionExecutor:8,1,main]
org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.io.compress.CorruptBlockException: (/home/hudson/rhq/rhq-server-4.10.0-SNAPSHOT-lz4/rhq-storage/bin/../../../rhq-data/data/rhq/metrics_index/rhq-metrics_index-ic-1297-Data.db): corruption detected, chunk at 0 of length 1467.
Caused by: org.apache.cassandra.io.compress.CorruptBlockException: (/home/hudson/rhq/rhq-server-4.10.0-SNAPSHOT-lz4/rhq-storage/bin/../../../rhq-data/data/rhq/metrics_index/rhq-metrics_index-ic-1297-Data.db): corruption detected, chunk at 0 of length 1467.
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:120)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:85)
        ... 25 more

In the event that a user decides to switch from Oracle/OpenJDK to IBM (or vice versa), there is a workaround: delete the corrupted sstable files and run repair on the table in question.
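
For the first half of the workaround, here is a minimal sketch of how the files belonging to one corrupted sstable could be identified from the Data.db name reported in the exception. It only computes a list and deletes nothing. The class name SSTableComponents and the directory listing in main are hypothetical; the assumption that all component files of an sstable share the name prefix up to the trailing "Data.db" is based on the file name in the log above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; it never touches the file system.
class SSTableComponents {

    // "rhq-metrics_index-ic-1297-Data.db" -> "rhq-metrics_index-ic-1297-"
    static String componentPrefix(String dataFileName) {
        String suffix = "Data.db";
        if (!dataFileName.endsWith(suffix)) {
            throw new IllegalArgumentException("not an sstable data file: " + dataFileName);
        }
        return dataFileName.substring(0, dataFileName.length() - suffix.length());
    }

    // Select the file names in a directory listing that belong to the same sstable.
    static List<String> componentsOf(String dataFileName, List<String> directoryListing) {
        String prefix = componentPrefix(dataFileName);
        List<String> matches = new ArrayList<>();
        for (String name : directoryListing) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up listing; only the first name appears in the log above.
        List<String> listing = Arrays.asList(
                "rhq-metrics_index-ic-1297-Data.db",
                "rhq-metrics_index-ic-1297-Index.db",
                "rhq-metrics_index-ic-1297-CompressionInfo.db",
                "rhq-metrics_index-ic-1298-Data.db");
        System.out.println(componentsOf("rhq-metrics_index-ic-1297-Data.db", listing));
    }
}
```

The repair step would then be something like `nodetool repair rhq metrics_index`, taking the keyspace and table names from the path in the exception.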

For reference, the LZ4Compressor was added under https://issues.apache.org/jira/browse/CASSANDRA-5038.

In terms of performance, lz4-java does very well. Some benchmark results of various compression libraries can be found at https://github.com/ning/jvm-compressor-benchmark/wiki. Enabling compression will reduce the data size on disk and should improve read performance.

I am in the process of running some tests to compare read performance with and without compression. I will report back the results here.

Comment 2 John Sanda 2014-02-10 16:43:08 UTC
Compression has been re-enabled using lz4-java, and it has been repackaged to strip out its native components. This means that only the pure Java impl(s) will be used.

Comment 3 John Sanda 2014-02-10 16:43:40 UTC
Compression has been re-enabled using lz4-java, and it has been repackaged to strip out its native components. This means that only the pure Java impl(s) will be used.

master commit hash: cb35dada1

Comment 4 Heiko W. Rupp 2014-02-14 14:20:00 UTC
I think (as master has shown) this may need more work, such as making sure that the lz4 lib is distributed with the storage node and that a special schema upgrade task is triggered.

Comment 5 John Sanda 2014-02-14 15:29:01 UTC
Heiko, the problems you encountered are specific to the dev-container. If you review the commit cited in comment 3, you will see that lz4-java is packaged with the Storage Node. In fact, lz4-java is included with a stock Cassandra distro; my commit only strips out the native components. The schema change is applied as well. I have tested upgrading a 4.10 snapshot build. If need be, I would prefer to open a separate BZ for handling the dev-container, since the issues you hit are specific to the dev environment. Right now we do not have a really good C* db upgrade process in place for development, partly because we have not had to deal with it much yet. I do not think that should block issues that do not affect regular installs.

Comment 8 John Sanda 2014-02-22 13:09:47 UTC
I retested upgrading from 4.9.0 and there were no issues. Moving this back to ON_QA.

Comment 9 Heiko W. Rupp 2014-04-23 12:31:46 UTC
Bulk closing of 4.10 issues.

If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.
