Bug 958087 - RHQ Control - rhqctl stop --agent removes agent.pid but doesn't stop the process
Summary: RHQ Control - rhqctl stop --agent removes agent.pid but doesn't stop the process
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Installer
Version: JON 3.2
Hardware: i686
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: JON 3.2.0
Assignee: RHQ Project Maintainer
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-04-30 11:12 UTC by Armine Hovsepyan
Modified: 2015-09-03 00:01 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-09-13 14:27:16 UTC
Type: Bug
Embargoed:


Attachments
rhqctl_start_stop_hangs.png (70.96 KB, text/x-log)
2013-07-02 11:37 UTC, Armine Hovsepyan
rhqctl_stop_agent.png (251.89 KB, image/png)
2013-07-16 09:13 UTC, Armine Hovsepyan
rhqctl_stop_agent_agent.log (6.00 KB, text/x-log)
2013-07-16 09:14 UTC, Armine Hovsepyan
rhqctl_stop_agent_server.log (119.83 KB, text/x-log)
2013-07-16 09:15 UTC, Armine Hovsepyan

Description Armine Hovsepyan 2013-04-30 11:12:12 UTC
Description of problem:
RHQ Control - rhqctl stop --agent removes agent.pid but doesn't stop the process

Version-Release number of selected component (if applicable):
jenkins build 177 

How reproducible:
always

Steps to Reproduce:
1. run ./rhqctl install --storage 
2. run ./rhqctl stop --agent
3. run ./rhqctl start --agent
  
Actual results:
After step 1 both agent and storage are installed and running
After step 2 agent process is running while the pid file is removed
After step 3 no new agent is started  - log is "INFO  [org.jboss.modules] JBoss Modules version 1.1.1.GA" 

Expected results:
After step 1 both agent and storage are installed and running
After step 2 agent process is stopped and the pid file is removed
After step 3 agent is started  -  "RHQ Agent (pid {number}) running" message.

Additional info:
http://jenkins.jonqe.lab.eng.bos.redhat.com:9080/job/RHQ_Control_Run/44/consoleFull  --  please search for rhqctl start --agent
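
The inconsistent state reported after step 2 (pid file removed, process still alive) can be checked mechanically. The following is a minimal Python sketch, assuming a POSIX system and assuming the agent's pid is read from the pid file before the stop command is run. The function names and state labels are hypothetical and are not part of rhqctl.

```python
import os
from pathlib import Path


def read_pid(pid_file):
    """Return the pid stored in pid_file, or None if it is missing or malformed."""
    try:
        return int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def is_alive(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence and permission
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def classify_stop(pid_file, pid_before_stop):
    """Classify the state after a stop command, given the pid recorded before it."""
    file_exists = Path(pid_file).exists()
    running = is_alive(pid_before_stop)
    if running and not file_exists:
        return "orphaned"  # the state described in this report
    if running:
        return "still running"
    if file_exists:
        return "stale pid file"
    return "stopped"
```

To use it, call read_pid on the agent's pid file before step 2 and pass the result to classify_stop after step 2. A result of "orphaned" corresponds to the actual results above, and "stopped" to the expected results.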

Comment 1 John Sanda 2013-05-17 14:49:37 UTC
I am seeing this issue consistently. I tested a master build and did not see the issue there. I think the problem is in the Cassandra plugin. It uses the Cassandra CQL driver, which uses its own internal thread pool. I think that the agent was hanging because the driver threads were still running. I updated the CassandraNodeComponent class to implement the ResourceComponent.shutdown method, where it shuts down the driver's thread pool. With this change, my agent exited gracefully. This change will be available in Jenkins build 225 and later.
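
The suspected mechanism, worker threads that keep a process alive until their owner explicitly shuts them down, is not specific to Java. The following is a minimal Python analogy, not the RHQ plugin API or the DataStax driver. The class PollingComponent and its methods are hypothetical.

```python
import threading


class PollingComponent:
    """Hypothetical component that owns a background worker thread."""

    def __init__(self):
        self._stop = threading.Event()
        # Non-daemon thread: the interpreter will not exit while it is running.
        self._worker = threading.Thread(target=self._run, daemon=False)

    def start(self):
        self._worker.start()

    def _run(self):
        # Wake up periodically until shutdown() sets the stop event.
        while not self._stop.wait(timeout=0.1):
            pass

    def shutdown(self):
        """Signal the worker to stop and wait for it to finish."""
        self._stop.set()
        self._worker.join()

    def is_running(self):
        return self._worker.is_alive()
```

If shutdown() is never called, the worker keeps looping and the process does not exit on its own, which mirrors the hang described above. Calling shutdown() from the owner's teardown path lets the process exit gracefully.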

Comment 2 Armine Hovsepyan 2013-07-02 11:36:54 UTC
Hi,

I am not sure whether I ran the steps too quickly, but rhqctl stop either throws an exception at the end of the process or hangs while stopping the agent.

Please find attached a log of all the actions taken.

Moving back to ON_Dev

Comment 3 Armine Hovsepyan 2013-07-02 11:37:37 UTC
Created attachment 767692
rhqctl_start_stop_hangs.png

Comment 4 John Sanda 2013-07-11 13:47:20 UTC
Armine, I am not able to reproduce this issue. It may be due to the plugins (or rather the lack of plugins) that I am running. I typically test with a minimal set of plugins. When I previously commented on this issue, the Cassandra plugin was not shutting down the DataStax driver. That was a plugin-specific issue. There could be another plugin that is causing problems with the shutdown. If you see the issue again, can you provide the list of plugins you are running?

Comment 5 Armine Hovsepyan 2013-07-16 09:13:12 UTC
Hi John,

I installed RHQ using rhqctl install, imported the agent into inventory through the server GUI, and called rhqctl stop --agent. It was unable to shut down the agent for ~15 minutes.

Please find attached a screenshot of rhqctl stop and fragments of the server and agent logs.

Comment 6 Armine Hovsepyan 2013-07-16 09:13:55 UTC
Created attachment 774124
rhqctl_stop_agent.png

Comment 7 Armine Hovsepyan 2013-07-16 09:14:26 UTC
Created attachment 774125
rhqctl_stop_agent_agent.log

Comment 8 Armine Hovsepyan 2013-07-16 09:15:04 UTC
Created attachment 774126
rhqctl_stop_agent_server.log

Comment 10 John Sanda 2013-08-06 19:31:50 UTC
I am removing this from the Cassandra tracker and removing myself as the assignee, since the issue is not specific to rhqctl or the Cassandra feature work.

Comment 11 Thomas Segismont 2013-09-13 14:27:16 UTC
Armine,

I'm closing the issue because we don't have enough information to analyze it.

If you manage to reproduce the problem, then please take a thread dump of the agent and reopen the BZ.

Thanks
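
The thread dump requested above can be captured with the jstack tool that ships with the JDK. The following is a minimal Python sketch, assuming jstack is on the PATH and is run as the same user as the agent process. The helper names and the output path are hypothetical.

```python
import shutil
import subprocess


def jstack_command(pid):
    """Build the jstack invocation; -l adds lock information to the dump."""
    return ["jstack", "-l", str(pid)]


def take_thread_dump(pid, out_path):
    """Write a thread dump of the JVM with the given pid to out_path."""
    if shutil.which("jstack") is None:
        raise RuntimeError("jstack not found on PATH; it ships with the JDK")
    result = subprocess.run(
        jstack_command(pid), capture_output=True, text=True, check=True
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)
    return out_path
```

For example, take_thread_dump(agent_pid, "agent-threads.txt") while rhqctl stop --agent is hanging would produce a file that can be attached to the BZ. Since the pid file may already be gone at that point, the pid would have to be found another way, for instance with ps.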

