Bug 907485 - Cassandra does not work with OpenJDK 1.6
Summary: Cassandra does not work with OpenJDK 1.6
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Monitoring -- Other
Version: JON 3.1.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: JON 3.2.0
Assignee: John Sanda
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 951619
 
Reported: 2013-02-04 14:29 UTC by Mike Foley
Modified: 2014-01-02 20:38 UTC
CC List: 2 users

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
openjdk6.log (4.95 KB, text/x-log)
2013-04-29 15:10 UTC, Armine Hovsepyan

Description Mike Foley 2013-02-04 14:29:31 UTC
Description of problem: Cassandra does not work with OpenJDK 1.6. There will be upgrade issues.


Version-Release number of selected component (if applicable):  JON 3.2 



Expected results: The JON 3.2 PRD has a (deprecated) requirement to support OpenJDK 1.6.


Additional info:

Comment 1 Charles Crouch 2013-03-14 17:00:08 UTC
What's the issue here?

Comment 2 Armine Hovsepyan 2013-03-14 18:24:55 UTC
Cassandra cannot be started in an environment with OpenJDK 1.6 or 1.7.

Comment 3 John Sanda 2013-03-27 21:17:16 UTC
Please provide the Cassandra log file. This bug provides no information, so it is hard to offer any help.

I did run some of our integration tests that work with Cassandra on Jenkins, with both OpenJDK 1.6 and IBM JDK 1.6. Somewhat to my surprise, the tests ran without error. Those tests, though, do not exercise much of the functionality around file system I/O, where I think we may run into more issues. I will continue to investigate.

Comment 4 Armine Hovsepyan 2013-03-29 12:25:54 UTC
Hi John,

There is no log printed. The only output is:

xss =  -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms946M -Xmx946M -Xmn100M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
Segmentation fault (core dumped)


You can use my environment with OpenJDK 1.6 (please ping me privately for details).

Regards,
Armine H

Comment 5 John Sanda 2013-04-02 14:39:18 UTC
Here is the Java version info from Armine's machine where she hit the seg fault.

[hudson@dhcp131-93 bin]$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.5) (rhel-1.50.1.11.5.el6_3-i386)
OpenJDK Client VM (build 20.0-b12, mixed mode)


This appears to be an OpenJDK 6-specific issue. The problem and its resolution are described in https://issues.apache.org/jira/browse/CASSANDRA-2441. <CASSANDRA_HOME>/conf/cassandra-env.sh configures the Java thread stack size to be 180 KB. OpenJDK tries to allocate a large variable that exceeds that stack size, resulting in the segfault. I doubled the stack size, e.g., -Xss360k, and Cassandra started without error.

For deployments running on OpenJDK 6, we can configure the Cassandra JVM to use a larger stack size. We need to figure out how much larger the stack needs to be.
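
For reference, a minimal sketch of the workaround in <CASSANDRA_HOME>/conf/cassandra-env.sh (illustrative; the exact line varies by Cassandra version):

# cassandra-env.sh (excerpt)
# Stock setting, too small for 32-bit OpenJDK 6 (segfaults at startup):
# JVM_OPTS="$JVM_OPTS -Xss180k"
# Doubling the thread stack size lets Cassandra start without error:
JVM_OPTS="$JVM_OPTS -Xss360k"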

Comment 6 John Sanda 2013-04-05 01:45:33 UTC
Here is another error that Armine hit while testing with IBM JDK 1.6,

Exception in thread "main" org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] no native library is found for os.name=Linux and os.arch=x86
	at org.xerial.snappy.SnappyLoader.findNativeLibrary(SnappyLoader.java:449)
	at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:307)
	at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:217)
	at org.xerial.snappy.Snappy.<clinit>(Snappy.java:48)
	at java.lang.J9VMInternals.initializeImpl(Native Method)
	at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
	at org.apache.cassandra.transport.FrameCompressor$SnappyCompressor.<init>(FrameCompressor.java:55)
	at org.apache.cassandra.transport.FrameCompressor$SnappyCompressor.<clinit>(FrameCompressor.java:42)
	at java.lang.J9VMInternals.initializeImpl(Native Method)
	at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
	at com.datastax.driver.core.ProtocolOptions$Compression.<clinit>(ProtocolOptions.java:32)
	at java.lang.J9VMInternals.initializeImpl(Native Method)
	at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
	at com.datastax.driver.core.Cluster$Builder.<init>(Cluster.java:230)
	at com.datastax.driver.core.Cluster.builder(Cluster.java:104)
	at org.rhq.cassandra.schema.SchemaManager.initCluster(SchemaManager.java:96)
	at org.rhq.cassandra.schema.SchemaManager.<init>(SchemaManager.java:81)
	at org.rhq.metrics.simulator.Simulator.createSchema(Simulator.java:195)
	at org.rhq.metrics.simulator.Simulator.run(Simulator.java:74)
	at org.rhq.metrics.simulator.SimulatorCLI.runSimulator(SimulatorCLI.java:107)
	at org.rhq.metrics.simulator.SimulatorCLI.exec(SimulatorCLI.java:70)
	at org.rhq.metrics.simulator.SimulatorCLI.main(SimulatorCLI.java:112)


This is a client-side error. Note the "os.arch=x86" towards the top of the stack trace. The testing I did in comment 3 was with a 64-bit arch. I also came across https://code.google.com/p/snappy-java/issues/detail?id=49. A newer release of snappy-java is out, and from the docs it looks like 32-bit support may be getting dropped. We need to continue testing on both 32-bit and 64-bit arches. We will have to disable the client-side compression and re-run to see if we hit any other issues.
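
A minimal sketch of how disabling client-side compression would be attempted with the DataStax Java driver of that era (the contact point below is a placeholder):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ProtocolOptions;

public class DisableCompression {
    public static void main(String[] args) {
        // Request no frame compression so that the Snappy native library
        // should not be needed on the client side.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1") // placeholder contact point
                .withCompression(ProtocolOptions.Compression.NONE)
                .build();
        cluster.connect();
        cluster.shutdown(); // driver 1.x API; later versions use close()
    }
}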

Comment 7 John Sanda 2013-04-05 02:11:57 UTC
I reproduced the error in comment 6 on jon11, which is a 32-bit arch, with IBM Java 6. The Java version info is,

[jsanda@jon11 ~]$ java -version
java version "1.6.0"
Java(TM) SE Runtime Environment (build pxi3260sr13ifix-20130303_02(SR13+IV37419))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux x86-32 jvmxi3260sr13-20130114_134867 (JIT enabled, AOT enabled)
J9VM - 20130114_134867
JIT  - r9_20130108_31100
GC   - 20121212_AA)
JCL  - 20130303_02

Comment 8 John Sanda 2013-04-05 02:49:21 UTC
Even with compression disabled, I still get the exception in comment 6 when using IBM JDK 1.6 on jon11, which is a 32-bit arch. The problem is,

com.datastax.driver.core.ProtocolOptions$Compression.<clinit>(ProtocolOptions.java:32)

That line of code is,

SNAPPY("snappy", FrameCompressor.SnappyCompressor.instance)

FrameCompressor is a Cassandra class, used by the driver, that attempts to load the Snappy native library in a static initializer block. I may need to talk with the people who work on the DataStax driver and/or snappy-java projects to determine how we might work around this, because as of right now I do not see an immediate fix.
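
A simplified, self-contained sketch of the failure pattern (illustrative names, not the actual driver source):

public class StaticInitSketch {
    // Stand-in for the Cassandra class whose static initialization loads
    // the Snappy native library; here the load always fails, as on a
    // 32-bit arch with no matching native binary.
    static class SnappyCompressor {
        static final SnappyCompressor instance = create();
        private static SnappyCompressor create() {
            throw new UnsatisfiedLinkError("no native library for os.arch=x86");
        }
    }

    // Stand-in for the driver's ProtocolOptions.Compression enum.
    enum Compression {
        NONE(null),
        // Constructing this constant touches SnappyCompressor.instance,
        // so initializing the enum class at all triggers the native load.
        SNAPPY(SnappyCompressor.instance);

        final SnappyCompressor compressor;
        Compression(SnappyCompressor compressor) { this.compressor = compressor; }
    }

    public static void main(String[] args) {
        // Fails during class initialization even though we only ever ask
        // for Compression.NONE.
        System.out.println(Compression.NONE);
    }
}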

Comment 9 John Sanda 2013-04-05 12:48:16 UTC
In Cassandra 1.2.2 FrameCompressor catches the SnappyError, so this issue should be resolved in 1.2.2. I will work on upgrading to 1.2.2 and retest.
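
Presumably the fix guards that static initialization, along these lines (a sketch of the defensive pattern, not the actual Cassandra 1.2.2 source):

public class GuardedInit {
    static class SnappyCompressor { /* constructor would load the native library */ }

    // A failed native-library load leaves instance null instead of making
    // every class that references this one unusable.
    static final SnappyCompressor instance = tryCreate();

    private static SnappyCompressor tryCreate() {
        try {
            return new SnappyCompressor();
        } catch (Throwable t) { // e.g. SnappyError, UnsatisfiedLinkError
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(instance != null ? "snappy available" : "snappy unavailable");
    }
}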

Comment 10 John Sanda 2013-04-25 20:58:52 UTC
I have opened bug 956878 to track the issues with IBM's JRE since they are different than the ones that need to be addressed with OpenJDK. Please log any issues related to IBM's JRE under that bug.

Comment 11 John Sanda 2013-04-25 21:04:55 UTC
The default stack size of 180k that Cassandra uses works fine on 64-bit platforms with OpenJDK. For 32-bit platforms, I am able to avoid the segfault described in https://issues.apache.org/jira/browse/CASSANDRA-2441 by increasing the stack size to 240k.
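
A sketch of the default-selection logic this implies (illustrative only, not the actual RHQ installer code):

public class StackSizeDefaults {
    // Pick a default -Xss value for the Cassandra JVM. 180k is Cassandra's
    // stock value; 32-bit OpenJDK needs more headroom (CASSANDRA-2441).
    static String defaultStackSize() {
        String arch = System.getProperty("os.arch");        // e.g. "i386", "amd64"
        String vm = System.getProperty("java.vm.name", ""); // e.g. "OpenJDK Client VM"
        boolean is32Bit = "i386".equals(arch) || "x86".equals(arch);
        boolean isOpenJdk = vm.contains("OpenJDK");
        return (is32Bit && isOpenJdk) ? "240k" : "180k";
    }

    public static void main(String[] args) {
        System.out.println("-Xss" + defaultStackSize());
    }
}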

Comment 12 John Sanda 2013-04-27 02:00:27 UTC
Armine, I am moving this to ON_QA. I know that you have already tested, but I will let you close it out.

Comment 13 Armine Hovsepyan 2013-04-29 14:54:19 UTC
Verified for both the storage installer and the simulator, with OpenJDK 6 and 7.

thank you.

Comment 14 Armine Hovsepyan 2013-04-29 15:06:34 UTC
I am very sorry for reopening - I checked on the OpenJDK machine 10.16.23.191 and it is failing again.
Please see the attached logs.

Comment 15 Armine Hovsepyan 2013-04-29 15:10:35 UTC
Created attachment 741537
openjdk6.log

Comment 16 John Sanda 2013-04-29 16:27:08 UTC
The problem is the stackSize property in test.json. It has a value of 180k, which will cause a segfault. You can remove the stackSize property. With the changes I have made, a default stack size of 240k is used if you are running on a 32-bit arch with OpenJDK 6; otherwise, 180k is used as the default. When you specify the stackSize property in test.json, it overrides the default.
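
Concretely, a test.json along these lines (other properties elided) pins the bad value; deleting the stackSize property, rather than raising it, lets the arch-aware default (240k on 32-bit OpenJDK 6, 180k otherwise) apply:

{
    "stackSize": "180k"
}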

Comment 17 Armine Hovsepyan 2013-04-29 20:04:13 UTC
Reopened.
Log is here -> http://pastebin.test.redhat.com/139492

Comment 18 John Sanda 2013-04-30 18:11:17 UTC
The default stack size of 180k results in a segfault on amd64 arches as well. We now use a default of 240k on amd64 platforms. This fix is available as of build 176.

Comment 19 Armine Hovsepyan 2013-05-02 11:00:22 UTC
Verified. Thank you.

