Created attachment 1010623
server and storage logs
Description of problem:
The storage node installs and starts correctly, but it cannot be stopped via rhqctl. The stop operation hangs and the storage node remains in a broken state.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. install IBM SDK 8
2. install JON (./rhqctl install)
3. start JON (./rhqctl start)
Actual results:
After step 2:
Stopping RHQ storage node...
RHQ storage node (pid=25564) is stopping...
08:43:47,861 ERROR [org.rhq.server.control.RHQControl] Process  did not finish yet. Terminate it manually and retry.
After step 3:
- All processes (agent, storage node, server) are started, but the storage node is not accessible. From server.log:
08:58:27,275 WARN [org.rhq.enterprise.server.storage.StorageClientManager] (pool-6-thread-1) Storage client subsystem wasn't initialized because it wasn't possible to connect to the storage cluster. The RHQ server is set to MAINTENANCE mode. Please start the storage cluster as soon as possible.: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: fbr-ibm8.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.134 ([fbr-ibm8.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.134] Cannot connect))
Expected results:
Everything starts correctly after step 2, and there are no errors after step 3.
Workaround:
kill -9 <storageNode PID>
The next invocation of ./rhqctl stop breaks the storage node again.
Logs are attached.
The issue is not visible on IBM 1.7.0
I am going to implement the workaround in rhqctl, i.e. if we attempt to stop the storage node and detect that it is still running after some time, we'll kill it by sending the SIGKILL signal (kill -9). I'll also increase the wait time for a proper Cassandra shutdown to 1 minute.
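For illustration only, the stop-then-kill logic described above can be sketched in POSIX shell. This is not the rhqctl change itself (the actual fix is not shown in this report); `stop_with_fallback` is a hypothetical helper, and the sketch assumes the graceful stop is requested with SIGTERM and that a 60 second default matches the 1 minute wait mentioned above.

```shell
# Hypothetical helper, NOT the actual rhqctl code.
# Usage: stop_with_fallback PID [TIMEOUT_SECONDS]
# Returns 0 if the process exited on its own, 1 if it had to be killed.
stop_with_fallback() {
    pid=$1
    timeout=${2:-60}   # assumed default: the 1 minute wait from the comment above

    # Ask for a clean shutdown first; if the signal cannot be
    # delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    # kill -0 sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Graceful shutdown did not finish in time: force it (kill -9).
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call such as `stop_with_fallback "$STORAGE_PID" 60` (where `STORAGE_PID` is a placeholder for the storage node PID) would wait up to a minute before falling back to SIGKILL. The non-zero return value lets the caller log that the shutdown was not clean.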
time: 2015-04-15 13:18:30 +0200
author: Libor Zoubek - email@example.com
message: Bug 1208854 - Unable to stop storage node when running on IBM SDK 8
Fix rhqctl to kill with SIGKILL when we do not succeed to stop
cassandra the safe way
QE: the payload verification process should include log files documenting a correct shutdown. Include the log files and attach them to this issue.
Available for test with 3.3.3 ER01 build:
*Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of
3.3.0.GA Update 03
Build Number :
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.