Bug 790467

Summary: jon-agent-ec2 init script
Product: [JBoss] JBoss Operations Network Reporter: Aleksandar Kostadinov <akostadi>
Component: Agent Assignee: Simeon Pinder <spinder>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: high Docs Contact:
Priority: high    
Version: JON 3.0.1 CC: asantos, hrupp, jsanda, loleary
Target Milestone: GA   
Target Release: JON 3.1.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 813536 (view as bug list) Environment:
Last Closed: 2013-09-11 11:03:46 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 707235, 813536    
Attachments:
Description Flags
patch for rhq-agent-wrapper-ec2.sh none

Description Aleksandar Kostadinov 2012-02-14 15:40:57 UTC
This init script seems to be targeted at launching a JON agent on Amazon EC2 machines. It reads a few variables and configures some start-up parameters, but I think that is currently insufficient.

Here's the list of params that I believe need to be available:
JON_AGENT_NAME # this currently is forced to INSTANCE_ID
JON_AGENT_ADDR # bind address if non-default
JON_SERVER_URL 
JON_AGENT_TOKEN
JON_AGENT_OPTS # any additional options accepted on first run
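A minimal sketch of how the init script might honor these parameters, assuming they reach the script as shell variables and that INSTANCE_ID has already been looked up. The function name resolve_agent_settings is hypothetical and is not taken from rhq-agent-wrapper-ec2.sh.

```shell
#!/bin/sh
# Hypothetical helper: resolve the effective agent settings.
# INSTANCE_ID is assumed to have been set earlier by the init script.
resolve_agent_settings() {
    # Keep the current behavior (agent name = instance id) unless overridden.
    JON_AGENT_NAME="${JON_AGENT_NAME:-$INSTANCE_ID}"
    # Empty values mean "use the agent's own default".
    JON_AGENT_ADDR="${JON_AGENT_ADDR:-}"
    JON_AGENT_TOKEN="${JON_AGENT_TOKEN:-}"
    JON_AGENT_OPTS="${JON_AGENT_OPTS:-}"
    # The server address has no sensible default, so fail loudly without it.
    if [ -z "${JON_SERVER_URL:-}" ]; then
        echo "JON_SERVER_URL is not set" >&2
        return 1
    fi
}
```

With this shape, a user who sets nothing but JON_SERVER_URL gets the existing behavior, and any of the other variables can be overridden individually.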

Regards

Comment 1 Charles Crouch 2012-02-16 18:29:11 UTC
John, can you comment on effort required here?

Comment 3 John Sanda 2012-02-16 19:07:39 UTC
I believe that the script in the master branch actually is not the most current version. I will double check and update this bug accordingly. The questions from Aleksandar are still relevant though. A little background is necessary. That script was written with the intent for distribution of AMIs built in brew. The agent was to be packaged as an RPM and installed into /usr/share/jboss-on-<version>/agent. Agent preferences would be stored in /var/lib/jon-agent/data instead of the default location of ~/.java/.userPrefs.


Those AMIs would be preconfigured with the JON agent. The value of JON_SERVER_URL  is expected to come from user-defined data passed to the machine instance at start up. The URL is expected to be either the private IP address or host name of the JON server. 

JON_AGENT_NAME is set to the INSTANCE_ID because the agent name needs to be unique and consistent across machine restarts as the agent name is used as the resource key for the host machine. The machine host name or IP address should not be used for the agent name.
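As a rough illustration of the user-data step, the sketch below extracts one setting from user data, assuming the user data consists of plain KEY=value lines. That format and the helper name user_data_value are assumptions for this example, not something the wrapper script is confirmed to use.

```shell
#!/bin/sh
# Hypothetical helper: read "KEY=value" user data on stdin and print the
# value of the first line whose key matches $1 (prints nothing if absent).
user_data_value() {
    key="$1"
    # Strip the "KEY=" prefix from matching lines, then keep the first match.
    sed -n "s/^${key}=//p" | head -n 1
}

# Possible use on an instance (not executed here), reading from the EC2
# instance metadata service:
#   JON_SERVER_URL=$(curl -s http://169.254.169.254/latest/user-data \
#       | user_data_value JON_SERVER_URL)
#   INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
```

Because the instance id comes from instance metadata rather than from the host name or IP address, it stays the same across stop/start cycles, which matches the uniqueness requirement described above.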

If you have not already you may want to review http://rhq-project.org/display/JOPR2/Running+RHQ+in+EC2 for additional info on running in EC2.

Comment 4 Aleksandar Kostadinov 2012-02-16 19:18:23 UTC
(In reply to comment #3)
> I believe that the script in the master branch actually is not the most current
> version. I will double check and update this bug accordingly. The questions
> from Aleksandar are still relevant though. A little background is necessary.
> That script was written with the intent for distribution of AMIs built in brew.
> The agent was to be packaged as an RPM and installed into
> /usr/share/jboss-on-<version>/agent. Agent preferences would be stored in
> /var/lib/jon-agent/data instead of the default location of ~/.java/.userPrefs. 

Very good, exactly our use case :)

> 
> Those AMIs would be preconfigured with the JON agent. The value of
> JON_SERVER_URL  is expected to come from user-defined data passed to the
> machine instance at start up. The URL is expected to be either the private IP
> address or host name of the JON server. 

Same here, users would configure the AMI through user data. One concern I have though is that some information might not be available before starting the instance, for example the IP address of the agent (e.g. when the machine has more than one network interface or has VPN connections).

It would be good if the user could put the configuration in a file on the local filesystem where the init script can read it. If you like, we can talk or chat about this.
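One way this could look, as a sketch only: the init script reads KEY=value lines from a local file and accepts just the JON_* variables discussed in this bug. The helper name load_agent_config and any file path passed to it (for instance something like /etc/sysconfig/jon-agent-ec2) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: load KEY=value settings from a local file, if present.
# Unknown keys are ignored so the file cannot set arbitrary variables.
load_agent_config() {
    config_file="$1"
    # No readable file: keep whatever values are already set.
    [ -r "$config_file" ] || return 0
    # The "|| [ -n "$key" ]" part also handles a last line without a newline.
    while IFS='=' read -r key value || [ -n "$key" ]; do
        case "$key" in
            JON_AGENT_NAME|JON_AGENT_ADDR|JON_SERVER_URL|JON_AGENT_TOKEN|JON_AGENT_OPTS)
                # key is one of the fixed names above, so eval is safe here.
                eval "$key=\$value"
                ;;
        esac
    done < "$config_file"
}
```

Calling this after the user data has been processed would let values in the local file take precedence, which covers settings that are only known once the instance is running.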

> JON_AGENT_NAME is set to the INSTANCE_ID because the agent name needs to be
> unique and consistent across machine restarts as the agent name is used as the
> resource key for the host machine. The machine host name or IP address should
> not be used for the agent name.

I believe INSTANCE_ID is a good default but I don't see any reason to limit user choice. It isn't hard to allow an override if provided by user.

> If you have not already you may want to review
> http://rhq-project.org/display/JOPR2/Running+RHQ+in+EC2 for additional info on
> running in EC2.

I know about this already, nice doc.

Thank you for looking into this issue!

Comment 5 John Sanda 2012-02-16 20:24:47 UTC
I just committed the most recent version of the wrapper script to master. The commit hash is 27abe2b95ec5cac4971d5580b04d11a0a0570291. I wouldn't be surprised if you do run into problems with the script as it has not been used in some time.

Can you please summarize what exactly it is you want to do? The script was developed with the intent to provide for as close to zero configuration as possible so that when a user starts up an EAP AMI for example, the agent automatically starts up, connects to a JON server, and starts managing the EAP server, all with minimal to no manual user intervention.

Having the IP address of the agent machine before start up should not be an issue. You only need to know the JON server machine address. When the agent starts, it initiates a registration process with the JON server. The agent sends its IP address, among other things, to the server as part of that registration process. This registration process occurs every time the agent starts; so, if the IP address does change, the agent will send its current IP address to the server. In my investigation, I did not cover scenarios involving VPN connections. I am not saying it is unimportant; rather, I just didn't have the time to cover some of these other situations.

As for putting the agent config file in a location that can be read by an init script and as for using something other than the INSTANCE_ID for the agent name, this is already possible. If you start the agent manually using the rhq-agent.sh script with the --cleanconfig option, you will be prompted to supply values for agent address, name, and server address. Those values are then stored in a Java preferences file and used when the agent is restarted in the future. Now it may be the case that the rhq-agent-wrapper-ec2 script is not checking to see if those preferences are already on the file system. If that is the case, the script could certainly be updated.
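A minimal sketch of such a check, assuming the preferences are kept as prefs.xml files somewhere under the preferences directory (java.util.prefs stores each node that way on Unix-like systems). The helper name agent_prefs_exist is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a Java preferences file exists under DIR.
agent_prefs_exist() {
    prefs_dir="$1"
    [ -d "$prefs_dir" ] || return 1
    # Succeed if at least one prefs.xml file is found below the directory.
    [ -n "$(find "$prefs_dir" -name prefs.xml -type f 2>/dev/null | head -n 1)" ]
}
```

The wrapper could then call agent_prefs_exist /var/lib/jon-agent/data and only fall back to first-run configuration when the check fails, so that values entered earlier through --cleanconfig are not overwritten.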

Comment 6 Charles Crouch 2012-02-16 23:28:47 UTC
Setting this issue back to Target Release of JON3.0.1, so it can be considered for work in the short term.

Comment 7 John Sanda 2012-02-17 15:09:59 UTC
I just realized that I did not push the commit. It has just now been pushed to the remote repo. See http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commit;h=3dc307d3649b404e7e4ee40781a51b91d8fde306.

Comment 8 Charles Crouch 2012-02-17 15:25:57 UTC
Alek: "Can you please summarize what exactly it is you want us to do?"

Comment 10 John Sanda 2012-02-17 19:48:54 UTC
Alek, take a look at http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commit;h=49c603c03da29e62db36cb719f30bc47bfb61ed9. It is the latest commit which is just a file renaming. The location from the project root is modules/enterprise/agent/src/etc/rhq-agent-wrapper-ec2.sh.

Comment 11 Charles Crouch 2012-02-17 19:57:04 UTC
Unable to assign to the jon3.0.1 release until the changes required for this issue are clarified

Comment 12 Mike Foley 2012-02-27 15:50:10 UTC
triage to jon 3.1

Comment 13 Aleksandar Kostadinov 2012-02-27 16:46:31 UTC
Created attachment 566088 [details]
patch for rhq-agent-wrapper-ec2.sh

Please find attached a patch to the init script. As far as my testing goes, it should be flexible, yet retain the original behavior if the user does not need the additional features.

Comment 14 Mike Foley 2012-02-27 16:55:23 UTC
triage to JON 3.0.1 per asantos, crouch, mfoley, loleary

Comment 15 John Sanda 2012-02-27 18:46:39 UTC
I think the patch in comment 13, and from my perspective, the important thing is to test to make sure everything works across agent machine restarts.

Comment 16 John Sanda 2012-02-27 19:25:13 UTC
Please disregard comment 15. I have looked over the patch submitted in comment 13. It looks fine to me. From my perspective, the most important thing is to ensure that the script works across machine restarts. I am referring to the machines on which the agent runs, not the RHQ server. And by working, I mean that the agent maintains connectivity with the JON server without manual intervention after the machine and agent restart.

Comment 17 Charles Crouch 2012-02-27 19:51:31 UTC
Aleksandar can you confirm you've tested the script as John describes in Comment 16, and that everything works as expected.

Comment 18 Aleksandar Kostadinov 2012-02-27 20:50:06 UTC
I have tested restarting the agent with this script. Connectivity is maintained as long as the settings in the file stay correct.

Comment 19 Charles Crouch 2012-02-27 22:07:57 UTC
Alek, but did you test across machine restarts?

Comment 20 Aleksandar Kostadinov 2012-02-28 17:38:25 UTC
Tested it works across restarts. Actually it handles changes of EC2 instance IP across stop/start cycles gracefully (when agent IP is NOT hardcoded of course).

Comment 21 Charles Crouch 2012-03-01 15:38:15 UTC
Great, let's get this script into the next 3.0.1 build.

Comment 22 Simeon Pinder 2012-03-06 15:20:45 UTC
This is in the latest RC 6 build available from here:

https://brewweb.devel.redhat.com//buildinfo?buildID=201462

Moving to ON_QA.

Comment 23 Mike Foley 2012-03-06 18:27:09 UTC
No action items for JON QE per comment #2.

Marking this VERIFIED per comment #20.

Comment 24 Aleksandar Kostadinov 2012-04-13 15:11:01 UTC
I don't see the patch applied to the RPM built into the brew buildID=201462. Can that be fixed?

Comment 26 Simeon Pinder 2012-04-18 20:15:12 UTC
This is fixed in build:
https://brewweb.devel.redhat.com//buildinfo?buildID=209751

Moving to ON_QA

Aleksandar, can you retest this build and report back on whether the fix works correctly in your environment?