Bug 1121525 - StorageNode can't be managed or discovered if another instance of Cassandra is running
Keywords:
Status: NEW
Alias: None
Product: RHQ Project
Classification: Other
Component: Storage Node
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-21 07:05 UTC by Michael Burman
Modified: 2014-07-21 07:05 UTC (History)
0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:



Description Michael Burman 2014-07-21 07:05:31 UTC
Description of problem: If two Cassandra instances are running and both match the same plugin process-scan query, the plugin container returns null for the matches and discovery fails.

Running another Cassandra instance on the same machine therefore breaks discovery of the storage node and makes it impossible to manage storage nodes from RHQ.

Example output from ps xa with two Cassandra instances running:

michael@grace-mint ~/projects/rhq/dev-container/rhq-agent/logs $ ps xa | grep Cassandra
 2169 ?        SLl    0:22 java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1994M -Xmx1994M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp /etc/cassandra:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-15.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/netty-3.6.6.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-internal-only-0.3.3.jar:/usr/share/cassandra/apache-cassandra-2.0.9.jar:/usr/share/cassandra/apache-cassandra-thrift-2.0.9.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar:/usr/share/java/jna.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1405923279.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1405923279.log org.apache.cassandra.service.CassandraDaemon
11876 pts/4    Sl     0:54 /usr/lib/jvm/java-7-oracle//bin/java -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms256M -Xmx256M -Xmn64M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -Djava.rmi.server.hostname=localhost -Dcom.sun.management.jmxremote.port=7299 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.net.preferIPv4Stack=true -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/home/michael/projects/rhq/dev-container/rhq-server/rhq-storage/bin/cassandra.pid -cp ./../conf:./../build/classes/main:./../build/classes/thrift:./../lib/antlr-3.2.jar:./../lib/apache-cassandra-1.2.18.jar:./../lib/apache-cassandra-clientutil-1.2.18.jar:./../lib/apache-cassandra-thrift-1.2.18.jar:./../lib/avro-1.4.0-fixes.jar:./../lib/avro-1.4.0-sources-fixes.jar:./../lib/commons-cli-1.1.jar:./../lib/commons-codec-1.2.jar:./../lib/commons-lang-2.6.jar:./../lib/compress-lzf-0.8.4.jar:./../lib/concurrentlinkedhashmap-lru-1.3.jar:./../lib/guava-13.0.1.jar:./../lib/high-scale-lib-1.1.2.jar:./../lib/jackson-core-asl-1.9.2.jar:./../lib/jackson-mapper-asl-1.9.2.jar:./../lib/jamm-0.2.5.jar:./../lib/jbcrypt-0.3m.jar:./../lib/jline-1.0.jar:./../lib/json-simple-1.1.jar:./../lib/libthrift-0.7.0.jar:./../lib/log4j-1.2.16.jar:./../lib/lz4-1.1.0.jar:./../lib/metrics-core-2.2.0.jar:./../lib/netty-3.6.6.Final.jar:./../lib/netty-3.7.0.Final.jar:./../lib/rhq-cassandra-auth-4.13.0-SNAPSHOT.jar:./../lib/servlet-api-2.5-20081211.jar:./../lib/slf4j-api-1.7.2.jar:./../lib/slf4j-log4j12-1.7.2.jar:./../lib/snakeyaml-1.6.jar:./../lib/snappy-java-1.0.4.1-rhq-p1.jar:./../lib/snaptree-0.1.jar org.apache.cassandra.service.CassandraDaemon
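
Both processes run org.apache.cassandra.service.CassandraDaemon, but their command lines differ in -Dcom.sun.management.jmxremote.port (7199 vs 7299) and in -Dcassandra-pidfile. As a rough illustration of how the two could be told apart, here is a minimal sketch that pulls the -D system properties out of a command line. The class and method names (CassandraCommandLine, systemProperties) are made up for this example and are not part of RHQ, and the simple whitespace split assumes no argument contains spaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not an RHQ class.
final class CassandraCommandLine {
    private CassandraCommandLine() {}

    // Collects -Dname=value tokens from a process command line.
    // Assumption: arguments are separated by whitespace and contain no spaces.
    static Map<String, String> systemProperties(String commandLine) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String token : commandLine.trim().split("\\s+")) {
            if (!token.startsWith("-D") || token.length() <= 2) {
                continue; // not a system property argument
            }
            int eq = token.indexOf('=');
            if (eq < 0) {
                props.put(token.substring(2), ""); // -Dname without a value
            } else if (eq > 2) {
                props.put(token.substring(2, eq), token.substring(eq + 1));
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Shortened versions of the two command lines from the ps output.
        String system = "java -ea -Dcom.sun.management.jmxremote.port=7199 "
                + "-Dcassandra-pidfile=/var/run/cassandra/cassandra.pid "
                + "org.apache.cassandra.service.CassandraDaemon";
        String rhq = "java -ea -Dcom.sun.management.jmxremote.port=7299 "
                + "-Dcassandra-pidfile=/home/michael/projects/rhq/dev-container/rhq-server/rhq-storage/bin/cassandra.pid "
                + "org.apache.cassandra.service.CassandraDaemon";
        for (String cmd : new String[] {system, rhq}) {
            Map<String, String> props = systemProperties(cmd);
            System.out.println(props.get("com.sun.management.jmxremote.port")
                    + " " + props.get("cassandra-pidfile"));
        }
    }
}
```

Whether the plugin's process-scan query can filter on these properties is not something this report confirms; the sketch only shows that the two command lines carry enough information to distinguish the instances.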

Error when trying to discover Cassandra:

2014-07-21 09:18:16,544 WARN  [InventoryManager.discovery-1] (rhq.core.pc.inventory.InventoryManager)- Failure during discovery for [Cassandra Node] Resources - failed after 206 ms.
java.lang.Exception: Discovery component invocation failed.
	at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ComponentInvocationThread.call(DiscoveryComponentProxyFactory.java:309)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.rhq.plugins.cassandra.CassandraNodeDiscoveryComponent.getDetails(CassandraNodeDiscoveryComponent.java:174)
	at org.rhq.plugins.cassandra.CassandraNodeDiscoveryComponent.scanForResources(CassandraNodeDiscoveryComponent.java:84)
	at org.rhq.plugins.cassandra.CassandraNodeDiscoveryComponent.discoverResources(CassandraNodeDiscoveryComponent.java:69)
	at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ComponentInvocationThread.call(DiscoveryComponentProxyFactory.java:305)
	... 4 more


Version-Release number of selected component (if applicable): 4.13-SNAPSHOT

How reproducible: Always.


Steps to Reproduce:
1. Start another Cassandra instance on the same machine
2. Install RHQ
3. Start RHQ - discovery will fail.

Actual results: NullPointerException in plugin.


Expected results: Both resources should be discovered.

Additional info: This may not be specific to Cassandra: processInfo returns null, and some plugins can't handle that. It also shouldn't return null here, since there were two matches.

Shutting down the other Cassandra instance makes the scan succeed for the storage node.
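
Until the underlying null is fixed, a plugin could at least avoid the NullPointerException by skipping matches that carry no process info. The following is a minimal sketch of that idea only. ScanMatch and discoverPids are hypothetical stand-ins written for this example, not the RHQ plugin API, and the report does not show what CassandraNodeDiscoveryComponent.getDetails actually dereferences at line 174.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only; these types are not part of RHQ.
final class NullSafeScan {
    private NullSafeScan() {}

    // Hypothetical stand-in for one process-scan match.
    static final class ScanMatch {
        final String queryName;
        final Long pid; // null models a match whose process info is missing

        ScanMatch(String queryName, Long pid) {
            this.queryName = queryName;
            this.pid = pid;
        }
    }

    // Returns the pids of all usable matches, skipping null entries
    // instead of failing the whole discovery run.
    static List<Long> discoverPids(List<ScanMatch> matches) {
        List<Long> pids = new ArrayList<>();
        if (matches == null) {
            return pids;
        }
        for (ScanMatch match : matches) {
            if (match == null || match.pid == null) {
                continue; // a real plugin would probably log a warning here
            }
            pids.add(match.pid);
        }
        return pids;
    }

    public static void main(String[] args) {
        // Pids taken from the ps output in this report, plus one broken match.
        List<ScanMatch> matches = Arrays.asList(
                new ScanMatch("CassandraDaemon", 2169L),
                new ScanMatch("CassandraDaemon", null),
                new ScanMatch("CassandraDaemon", 11876L));
        System.out.println(discoverPids(matches));
    }
}
```

This only hides the symptom in one plugin. The expected result is still that the process scan returns both matches with valid process info.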
