Bug 1121525

Summary: StorageNode can't be managed or discovered if another instance of Cassandra is running
Product: [Other] RHQ Project Reporter: Michael Burman <miburman>
Component: Storage NodeAssignee: Nobody <nobody>
Status: NEW --- QA Contact:
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.13   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Michael Burman 2014-07-21 07:05:31 UTC
Description of problem: If there are two Cassandra instances running which match the same plugin process-scan query, plugin container will return null for matches and the discovery will fail.

Running another Cassandra instance on the computer will fail discovery of the storage-node and make managing storage nodes impossible from RHQ.

Example output from ps xa with two running Cassandras:

michael@grace-mint ~/projects/rhq/dev-container/rhq-agent/logs $ ps xa | grep Cassandra
 2169 ?        SLl    0:22 java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1994M -Xmx1994M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid -cp /etc/cassandra:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang3-3.1.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/disruptor-3.0.1.jar:/usr/share/cassandra/lib/guava-15.0.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.9.1.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.2.0.jar:/usr/share/cassandra/lib/metrics-core-2.2.0.jar:/usr/share/cassandra/lib/netty-3.6.6.Final.jar:/usr/share/cassandra/lib/reporter-config-2.1.0.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.11.jar:/usr/share/cassandra/lib/snappy-java-1.0.5.jar:/usr/share/cassandra/lib/snaptree-0.1.jar:/usr/share/cassandra/lib/super-csv-2.1.0.jar:/usr/share/cassandra/lib/thrift-server-internal-only-0.3.3.jar:/usr/share/cassandra/apache-cassandra-2.0.9.jar:/usr/share/cassandra/apache-cassandra-thrift-2.0.9.jar:/usr/share/cassandra/apache-cassandra.jar:/usr/share/cassandra/stress.jar:/usr/share/java/jna.jar: -XX:HeapDumpPath=/var/lib/cassandra/java_1405923279.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1405923279.log org.apache.cassandra.service.CassandraDaemon
11876 pts/4    Sl     0:54 /usr/lib/jvm/java-7-oracle//bin/java -ea -javaagent:./../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms256M -Xmx256M -Xmn64M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -Djava.rmi.server.hostname=localhost -Dcom.sun.management.jmxremote.port=7299 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.net.preferIPv4Stack=true -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -Dcassandra-pidfile=/home/michael/projects/rhq/dev-container/rhq-server/rhq-storage/bin/cassandra.pid -cp ./../conf:./../build/classes/main:./../build/classes/thrift:./../lib/antlr-3.2.jar:./../lib/apache-cassandra-1.2.18.jar:./../lib/apache-cassandra-clientutil-1.2.18.jar:./../lib/apache-cassandra-thrift-1.2.18.jar:./../lib/avro-1.4.0-fixes.jar:./../lib/avro-1.4.0-sources-fixes.jar:./../lib/commons-cli-1.1.jar:./../lib/commons-codec-1.2.jar:./../lib/commons-lang-2.6.jar:./../lib/compress-lzf-0.8.4.jar:./../lib/concurrentlinkedhashmap-lru-1.3.jar:./../lib/guava-13.0.1.jar:./../lib/high-scale-lib-1.1.2.jar:./../lib/jackson-core-asl-1.9.2.jar:./../lib/jackson-mapper-asl-1.9.2.jar:./../lib/jamm-0.2.5.jar:./../lib/jbcrypt-0.3m.jar:./../lib/jline-1.0.jar:./../lib/json-simple-1.1.jar:./../lib/libthrift-0.7.0.jar:./../lib/log4j-1.2.16.jar:./../lib/lz4-1.1.0.jar:./../lib/metrics-core-2.2.0.jar:./../lib/netty-3.6.6.Final.jar:./../lib/netty-3.7.0.Final.jar:./../lib/rhq-cassandra-auth-4.13.0-SNAPSHOT.jar:./../lib/servlet-api-2.5-20081211.jar:./../lib/slf4j-api-1.7.2.jar:./../lib/slf4j-log4j12-1.7.2.jar:./../lib/snakeyaml-1.6.jar:./../lib/snappy-java-1.0.4.1-rhq-p1.jar:./../lib/snaptree-0.1.jar org.apache.cassandra.service.CassandraDaemon

Error when trying to discover Cassandra:

014-07-21 09:18:16,544 WARN  [InventoryManager.discovery-1] (rhq.core.pc.inventory.InventoryManager)- Failure during discovery for [Cassandra Node] Resources - failed after 206 ms.
java.lang.Exception: Discovery component invocation failed.
	at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ComponentInvocationThread.call(DiscoveryComponentProxyFactory.java:309)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.rhq.plugins.cassandra.CassandraNodeDiscoveryComponent.getDetails(CassandraNodeDiscoveryComponent.java:174)
	at org.rhq.plugins.cassandra.CassandraNodeDiscoveryComponent.scanForResources(CassandraNodeDiscoveryComponent.java:84)
	at org.rhq.plugins.cassandra.CassandraNodeDiscoveryComponent.discoverResources(CassandraNodeDiscoveryComponent.java:69)
	at sun.reflect.GeneratedMethodAccessor60.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.rhq.core.pc.util.DiscoveryComponentProxyFactory$ComponentInvocationThread.call(DiscoveryComponentProxyFactory.java:305)
	... 4 more


Version-Release number of selected component (if applicable): 4.13-SNAPSHOT

How reproducible: Always.


Steps to Reproduce:
1. Start another installation of Cassandra
2. Install RHQ
3. Start RHQ - discovery will fail.

Actual results: NullPointerException in plugin.


Expected results: Both resources should be discovered.

Additional info: This could be more generic than just Cassandra, as processInfo returns null and some plugins can't handle this (and it shouldn't return null, as there were two matches).

Shutting down another instance of Cassandra will make scan succeed for storagenode.