Description of problem:
Old infrastructure: the server is installed on one machine and the agent on another. After agent installation, "service jon-agent start" doesn't connect to the server even though the configuration is set correctly, while starting via the wrapper script works correctly.

Version-Release number of selected component (if applicable):
org.jboss.on-jboss-on-parent-3.1.0.GA-8

How reproducible:
always

Steps to Reproduce:
1. Install the server on RHEL 6 via zip
2. Install the agent on another RHEL 6 machine via jar installation
3. Configure the agent so that it connects to the server
4. Stop the agent
5. Install the agent through rpm
6. Start the agent via the wrapper script
7. Stop the agent
8. Start the agent via service jon-agent

Actual results:
The agent never connects to the server; auto-detection is disabled.

Expected results:
The agent is connected to the "separated" server.

Additional info:
The agent configuration file contains the correct server host, and the wrapper script starts the agent correctly and connects to the server. I've set the severity to medium, but in my eyes it's really high.
This should at least be investigated for jon311.
Starting the agent first via the wrapper could be the reason the agent cannot then start as a service. This could be a duplicate of bug 835892.

Please repeat the test with the following steps:
1) Install the RPM
2) Update the configuration
3) Start the service without first using the wrapper directly

Please attach the agent log files in case of any failures.
I performed an experiment with Wireshark to capture the packets.

When running "service jon-agent start", no communication packets were captured between server and agent. Result: the agent could not connect to the server.

When running "rhq-agent.sh", communication packets were captured successfully. Result: the agent connected to the server.

Further analysis is required.
I have changed the server address in both agent-configuration.xml and rhq-agent-env.sh, and starting via the service still doesn't work: the agent cannot register with the server. The log contains:

Service start with non-root user:

[hudson@dhcp-31-221 logs]$ tail -f -n 200 agent.log
2012-07-09 11:42:33,490 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.4.0.JON310GA], Build Number=[a53e41e], Build Date=[Jun 8, 2012 9:48 AM]
2012-07-09 11:42:33,595 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.agent-name-auto-generated}The name of this agent was not predefined so it was auto-generated. The agent name is now [dhcp-31-221.brq.redhat.com]
2012-07-09 11:42:33,800 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.global-concurrency-limit-disabled}Global concurrency limit has been disabled - there is no limit to the number of incoming commands allowed
2012-07-09 11:42:33,933 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.started}Service container started - ready to accept incoming commands
2012-07-09 11:42:33,933 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.no-auto-detect}Server auto-detection is not enabled - starting the poller immediately
2012-07-09 11:43:33,956 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.waiting-to-be-registered-begin}The agent will now wait until it has registered with the server...

2012-07-09 11:46:42,693 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.4.0.JON310GA], Build Number=[a53e41e], Build Date=[Jun 8, 2012 9:48 AM]
2012-07-09 11:46:43,007 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.global-concurrency-limit-disabled}Global concurrency limit has been disabled - there is no limit to the number of incoming commands allowed
2012-07-09 11:46:43,122 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.started}Service container started - ready to accept incoming commands
2012-07-09 11:46:43,122 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.no-auto-detect}Server auto-detection is not enabled - starting the poller immediately

Service start with root user:

[root@dhcp-31-221 logs]# tail -f -n 200 agent.log
2012-07-09 11:50:58,117 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.4.0.JON310GA], Build Number=[a53e41e], Build Date=[Jun 8, 2012 9:48 AM]
2012-07-09 11:50:58,378 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.global-concurrency-limit-disabled}Global concurrency limit has been disabled - there is no limit to the number of incoming commands allowed
2012-07-09 11:50:58,514 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.started}Service container started - ready to accept incoming commands
2012-07-09 11:50:58,514 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.no-auto-detect}Server auto-detection is not enabled - starting the poller immediately
2012-07-09 11:51:58,542 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.waiting-to-be-registered-begin}The agent will now wait until it has registered with the server...

2012-07-09 13:51:35,923 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.4.0.JON310GA], Build Number=[a53e41e], Build Date=[Jun 8, 2012 9:48 AM]
2012-07-09 13:51:36,217 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.global-concurrency-limit-disabled}Global concurrency limit has been disabled - there is no limit to the number of incoming commands allowed
2012-07-09 13:51:36,334 INFO [main] (org.rhq.enterprise.communications.ServiceContainer)- {ServiceContainer.started}Service container started - ready to accept incoming commands
2012-07-09 13:51:36,335 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.no-auto-detect}Server auto-detection is not enabled - starting the poller immediately
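Each entry in the agent log excerpts above carries a message key in braces (for example {AgentMain.no-auto-detect}). When comparing several start attempts, it may help to reduce a log to those keys. The following is only a minimal sketch: the function names and the summary fields are made up for illustration, and it only knows about keys that actually appear in the excerpts in this report.

```python
import re

# Message keys appear in braces right after the logger name,
# e.g. "{ServiceContainer.started}". The pattern is inferred from the
# excerpts in this report, not from any RHQ specification.
KEY_PATTERN = re.compile(r"\{([A-Za-z]+\.[A-Za-z0-9-]+)\}")


def message_keys(log_text):
    """Return the message keys found in the log text, in order of appearance."""
    return KEY_PATTERN.findall(log_text)


def summarize_start(log_text):
    """Report which of the start-up stages seen in this report were logged.

    The field names are hypothetical; the keys are taken from the excerpts.
    """
    keys = message_keys(log_text)
    return {
        "container_started": "ServiceContainer.started" in keys,
        "auto_detect_disabled": "AgentMain.no-auto-detect" in keys,
        "waiting_for_registration": "AgentMain.waiting-to-be-registered-begin" in keys,
    }


if __name__ == "__main__":
    # Assumes agent.log is in the current directory.
    with open("agent.log", encoding="utf-8", errors="replace") as handle:
        print(summarize_start(handle.read()))
```

A start attempt that logs "waiting_for_registration" but never progresses matches the behaviour described in this bug; the sketch does not try to detect a successful registration because no such log entry is shown here.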
After enabling DEBUG in rhq-agent-env.sh, the console output was:

The agent will now wait until it has registered with the server...

and part of the log shows:

There is no security token yet - the server will not accept commands from this agent until the agent is registered.
Unable to retrieve response message
java.net.ConnectException: Connection refused
Failed to successfully poll the server. This is normally due to the server not being up yet. You can usually ignore this message since it will be tried again later, however, you should ensure this failure was not really caused by a misconfiguration. Cause: org.jboss.remoting.CannotConnectException:Can not connect http client invoker. Connection refused. -> java.net.ConnectException:Connection refused
The agent will now wait until it has registered with the server...
There is no security token yet - the server will not accept commands from this agent until the agent is registered.

For the complete reference, please see (available for one month):
http://pastebin.test.redhat.com/96430 (console output)
http://pastebin.test.redhat.com/96431 (log)

This debug output explains why the agent never registers with the server: its connection attempts are refused.
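Since the debug log reports "Connection refused", one quick check is whether the server endpoint the agent is actually using accepts TCP connections from the agent machine at all. The sketch below is only an illustration: the function name is made up, and the host and port must be taken from whatever the running agent is really configured with, which this report does not state.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    A refused connection, a timeout, or a name resolution failure all
    raise OSError subclasses and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running it from the agent machine once against the address the wrapper script uses and once against the address the service start uses would show whether the two start paths point at different endpoints.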
Updated the init scripts (ec2 and regular) to bypass the wrapper and call the rhq-agent script directly, which allows reconfiguration and user prompting when the service is invoked with 'service jon-agent config'.

The script passes the following agent options together for the config option:
--cleanconfig (clean the previous config)
--nostart (do not start the agent at the end of the configuration)
--daemon (combined with nostart, makes the agent quit at the end of the setup)
--setup (forces the agent to prompt for configuration)
--advanced (combined with setup, forces the agent to prompt for advanced configuration)
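For illustration, the combined option set described above can be written out as a single command line. This is only a sketch of the resulting invocation, not the init script itself: the function name and the script path are hypothetical, and only the five options come from this comment.

```python
import shlex

# Options the init script is described as passing for the config action.
CONFIG_OPTIONS = ["--cleanconfig", "--nostart", "--daemon", "--setup", "--advanced"]


def build_config_command(agent_script):
    """Return the argument list for a reconfiguration run of the agent script."""
    return [agent_script] + CONFIG_OPTIONS


if __name__ == "__main__":
    # Hypothetical install location; substitute the real path to rhq-agent.sh.
    command = build_config_command("/path/to/rhq-agent.sh")
    print(" ".join(shlex.quote(part) for part in command))
```

The sketch only prints the command; the prompts are interactive, so the real invocation is expected to go through 'service jon-agent config' on a terminal.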
Documentation updates needed for config usage and possible scenarios where this option is useful.
'service jon-agent config' fixed everything; the agent can now be started via service start and can connect to the remote/separated server. Verified!

A bug for the documentation has been created: https://bugzilla.redhat.com/show_bug.cgi?id=839547