Bug 782890

Summary:	-D option on rhq-agent startup is not working
Product:	[Other] RHQ Project	Reporter:	Mike Foley <mfoley>
Component:	Agent	Assignee:	RHQ Project Maintainer <rhq-maint>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Mike Foley <mfoley>
Severity:	high	Docs Contact:
Priority:	high
Version:	unspecified	CC:	akostadi, hrupp, mazz
Target Milestone:	---	Keywords:	Reopened
Target Release:	RHQ 4.3.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	783876 783877		Environment:
Last Closed:	2013-08-31 10:16:58 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	783876, 783877

Description Mike Foley 2012-01-18 19:35:38 UTC

Description of problem:  -D option on rhq-agent startup is not working


Version-Release number of selected component (if applicable):  JON 2.4.2 RC#4


How reproducible:
100%

Steps to Reproduce:
1.  ./rhq-agent.sh -Drhq.agent.server.bind-port=7090
2.  at the agent's ">" prompt, type 'getconfig'
3.  observe the port is still 7080
  
Actual results:   -D option does not change agent properties 


Expected results:  -D option changes agent properties 


Additional info:

./rhq-agent.sh -help   documents this as a way to change settings.  

options:
   -a, --advanced                If setup is needed at startup, the advanced setup is run, rather than the basic
   -c, --config=<filename>       Specifies an agent configuration preferences file (on filesystem or classpath)
   -d, --daemon                  Agent runs in daemon mode - will not read from stdin for commands
   -D<name>[=<value>]            Overrides an agent configuration preference and sets a system property
   -e, --console=<type>          Specifies the implementation to use when reading console input: jline, sigar, java
   -g, --purgeplugins            Deletes all plugins, forcing the agent to re-download all of them
   -h, --help                    Shows this help message (default)
   -i, --input=<filename>        Specifies a script file to be used for input
   -l, --cleanconfig             Clears out any existing configuration and data files so the agent starts with a totally clean slate
   -n, --nostart                 If specified, the agent will not be automatically started
   -o, --output=<filename>       Specifies a file to write all output (excluding log messages)
   -p, --pref=<preferences name> Specifies the agent preferences name used to identify what configuration to use
   -s, --setup                   Forces the agent to ask setup questions, even if it is fully configured
   -t, --nonative                Forces the agent to disable the native system, even if it is configured for it
   -u, --purgedata               Purges persistent inventory and other data files
   --                            Stop processing options

Comment 2 John Mazzitelli 2012-01-18 19:48:25 UTC

Did you already have a failover list? The failover list probably overrides any server addr/port you provided. To test, do what you did, but before you enter the rhq-agent.sh command, delete the data/failover-list.dat file.

Comment 3 John Mazzitelli 2012-01-20 17:41:20 UTC

Note, if I start the agent with -n (--nostart - this won't start any comm or plugin container), the port is changed. Will look more into this to see what happens at startup that might change this. I suspect it's the failover list.

$ ./rhq-agent.sh -n -Drhq.agent.server.bind-port=7090
RHQ 4.3.0-SNAPSHOT [b295126] (Fri Jan 20 11:17:50 EST 2012)
> getconfig rhq.agent.server.bind-port
rhq.agent.server.bind-port=7090

Comment 4 John Mazzitelli 2012-01-20 17:46:55 UTC

more tests - if you have a clean/new agent, the 7090 port is used:

rhq-agent.sh -l -Drhq.agent.server.bind-port=7090

The setup questions are asked, and the default server port will be 7090.

Comment 5 John Mazzitelli 2012-01-20 18:01:15 UTC

OK, this is working as expected.  I will close this issue as such. Here's the explanation.

I have an agent that has previously registered with the server and as such got assigned a failover-list (see data/failover-list.dat). I shut the agent down and restart it. Here's some cmdline shell output (my shell has a current working directory of my agent's bin directory):

$ cat ../data/failover-list.dat 
mazztower:7080/7443
$ ./rhq-agent.sh -Drhq.agent.server.bind-port=12345
RHQ 4.3.0-SNAPSHOT [b295126] (Fri Jan 20 11:17:50 EST 2012)
> getconfig rhq.agent.server.bind-port
rhq.agent.server.bind-port=7080

What is the magic you ask? Take a look at your agent log messages and you'll see:

2012-01-20 12:50:29,721 INFO  [RHQ Server Polling Thread] (enterprise.communications.command.client.JBossRemotingRemoteCommunicator)- {JBossRemotingRemoteCommunicator.changing-endpoint}Communicator is changing endpoint from [InvokerLocator [servlet://mazztower:12345/jboss-remoting-servlet-invoker/ServerInvokerServlet]] to [InvokerLocator [servlet://mazztower:7080/jboss-remoting-servlet-invoker/ServerInvokerServlet]]

So, here you can see it DID use that override port number specified by the -D (in my case, 12345). BUT! The agent also sees it's a bogus port - it can't talk to the server there, so it's smart enough to immediately begin its failover backup plan. It says, "OK, this server endpoint is down, I will look for a failover list, and if I have one, go to the next server in the list".

Well, as you see in my above cmdline shell output, I DO have a failover-list.dat and it has "mazztower:7080/7443" as a server that it should try next.

And the agent does. It immediately switches over as you see in the next agent log message:

2012-01-20 12:50:29,843 INFO  [RHQ Server Polling Thread] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.failed-over-to-server}The agent has triggered its failover mechanism and switched to server [servlet://mazztower:7080/jboss-remoting-servlet-invoker/ServerInvokerServlet]
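
As an aside, the failover-list.dat entry shown earlier can be pulled apart with plain shell parameter expansion, which helps when comparing it against the endpoints in the log messages. This is only a sketch: it assumes every entry has the form host:port/secondport, as in the cat output above, and that the second number is the secure port (that is my reading of the entry, it is not stated in this issue). The function name parse_failover_entry is made up for illustration and is not part of the agent.

```shell
#!/bin/sh
# Hypothetical helper, not part of rhq-agent: split one failover-list entry.
# Assumed entry format (inferred from the example above): host:port/secureport
parse_failover_entry() {
    entry=$1
    host=${entry%%:*}        # text before the first ':'
    ports=${entry#*:}        # text after the first ':'
    port=${ports%%/*}        # text before the '/'
    secure_port=${ports#*/}  # text after the '/' (assumed to be the secure port)
    printf 'host=%s port=%s secure_port=%s\n' "$host" "$port" "$secure_port"
}

parse_failover_entry "mazztower:7080/7443"
# prints: host=mazztower port=7080 secure_port=7443
```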

So, this is working as expected. You tried to start the agent with a server port that is down and the agent, quickly sensing the problem, immediately tries to go to another server in its failover list.

If you did NOT have a failover list, you WOULD get a connection problem, because, obviously, the agent wouldn't know where to go next, so it would just sit and wait for that server on port 7090 (in your case) to come back.
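
The two cases described above (failover list present, failover list absent) can be modeled in a few lines of shell. This is a toy model of the behavior, not the agent's actual code, which is Java. The names is_reachable and pick_endpoint are hypothetical, the reachability check is faked so that only port 7080 counts as "up", and trying the list entries in file order is an assumption.

```shell
#!/bin/sh
# Toy model only; the real logic lives in the agent's Java code.
# Stand-in reachability check: pretend only port 7080 answers.
is_reachable() {
    [ "$2" = "7080" ]
}

# Print the endpoint the agent would end up talking to.
# $1 = host, $2 = port given via -D, $3 = path to failover list (may not exist)
pick_endpoint() {
    if is_reachable "$1" "$2"; then
        echo "$1:$2"
        return 0
    fi
    if [ -f "$3" ]; then
        # Assumption: entries are tried in file order until one answers.
        while IFS= read -r entry; do
            hostport=${entry%%/*}    # drop the "/7443" part of the entry
            if is_reachable "${hostport%%:*}" "${hostport#*:}"; then
                echo "$hostport"
                return 0
            fi
        done < "$3"
    fi
    # No usable failover entry: stay on the override port and wait.
    echo "$1:$2 (unreachable, waiting)"
    return 1
}

list=$(mktemp)
echo "mazztower:7080/7443" > "$list"
pick_endpoint mazztower 12345 "$list"            # prints: mazztower:7080
rm -f "$list"
pick_endpoint mazztower 12345 "$list" || true    # prints: mazztower:12345 (unreachable, waiting)
```

With the list in place the bad override port is abandoned almost at once, which is why getconfig shows 7080 again; without it the override sticks.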

So, that is how you can do your tests. Just delete data/failover-list.dat before you run the agent with the bad port.
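
If you would rather not lose the list, a small wrapper can move it aside instead of deleting it. This is a sketch under a few assumptions: the helper name set_aside_failover_list and the ".bak" suffix are my own choices, and the data directory is passed in explicitly rather than discovered.

```shell
#!/bin/sh
# Sketch: move the failover list out of the way so a -D override can be tested.
# Renames instead of deleting so the list can be restored afterwards.
set_aside_failover_list() {
    data_dir=$1
    list="$data_dir/failover-list.dat"
    if [ -f "$list" ]; then
        mv "$list" "$list.bak"
        echo "moved $list to $list.bak"
    else
        echo "no failover list in $data_dir"
    fi
}

# Run from the agent's bin directory, as in the transcript above:
#   set_aside_failover_list ../data
#   ./rhq-agent.sh -Drhq.agent.server.bind-port=7090
```

Renaming the .bak file back to failover-list.dat restores the previous state once the test is done.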

Comment 6 John Mazzitelli 2012-01-20 18:09:14 UTC

BTW: for giggles, I tried to set my security token to an invalid one (since that was also mentioned in this issue):

$ ./rhq-agent.sh -Drhq.agent.security-token=ABC
RHQ 4.3.0-SNAPSHOT [b295126] (Fri Jan 20 11:17:50 EST 2012)
The server has rejected the agent registration request. Cause: [org.rhq.core.clientapi.server.core.AgentRegistrationException:The agent asking for registration under the name [mazztower] provided an invalid security token. This request will fail. Please consult an administrator to reconfigure this agent with its proper security token.]
Will retry the agent registration request soon...

So this is now working as expected as well.

Comment 7 Charles Crouch 2012-01-23 04:20:24 UTC

Assigning to ON_QA so that QE can verify they are seeing what Mazz describes when they are testing builds for RHQ 4.3, i.e. builds off of master.

Comment 8 Mike Foley 2012-02-02 20:38:52 UTC

I am seeing -D commands passed to the agent on JON 3.01 RC#1. The change was cherry-picked from master.

Comment 9 Heiko W. Rupp 2013-08-31 10:16:58 UTC

Bulk close of old bugs in VERIFIED state.