Bug 1124614 - Automatic agent update at agent start no longer works
Summary: Automatic agent update at agent start no longer works
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Agent
Version: JON 3.2.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ER01
Target Release: JON 3.2.3
Assignee: Simeon Pinder
QA Contact: Garik Khachikyan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-29 22:54 UTC by Larry O'Leary
Modified: 2018-12-06 17:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-09-05 15:40:23 UTC
Type: Bug
Embargoed:


Attachments
agent log (238.53 KB, text/x-log), 2014-08-21 12:21 UTC, Garik Khachikyan
server log (534.27 KB, text/x-log), 2014-08-21 12:22 UTC, Garik Khachikyan
agent's enable update is true (186.19 KB, image/png), 2014-08-21 12:24 UTC, Garik Khachikyan


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1097921 1 None None None 2021-01-20 06:05:38 UTC
Red Hat Knowledge Base (Solution) 1147223 0 None None None Never

Internal Links: 1097921

Description Larry O'Leary 2014-07-29 22:54:59 UTC
Description of problem:
It appears that due to BZ-1097921 the agent is no longer performing an agent update check when its rhq.agent.agent-update.enabled property is set to true.

It isn't clear if this was on purpose or an oversight in the implementation. However, reviewing the requirements list, it appears that this may have been an accidental oversight.

Even with the relaxed version checking for agent-to-server communication, the expectation was that the agent would continue to perform its auto-update if it was enabled. This means that even though the agent is allowed to talk to a server -- because the server supports the agent's version -- the agent can still update itself if both of the following are true:

 - its rhq.agent.agent-update.enabled property is set to true
 - the server has a different version of the agent available

This is to ensure that in environments where auto-update is enabled, agents will auto-update themselves. In environments where agent auto-update is disabled, the agent will not auto-update itself but will still be allowed to connect and function as normal -- minus any changes/fixes that may have been introduced in newer agent versions.

Version-Release number of selected component (if applicable):
3.2.2

How reproducible:
Always

Steps to Reproduce:
1. Install JBoss ON 3.2 system.
2. Install at least one remote agent -- i.e. not running as part of server/storage installation.
3. Apply JBoss ON 3.2.2 to RHQ_SERVER_HOME.
4. Restart remote agent.

Actual results:
Remote agent will continue to report itself as GA:
    RHQ 4.9.0.JON320GA [734bd56] (Thu Dec 12 10:38:45 CST 2013)

Expected results:
Remote agent should be updated and report itself as 3.2 Update-02:
    RHQ 4.9.0.JON320GA Update 02 [cf4474c] (Thu Jul 10 17:55:56 CDT 2014)

Comment 1 John Mazzitelli 2014-07-30 19:21:09 UTC
Here's what I propose to fix this.

1) First off, for the very first application of any fix, we can't have changes to agent code because existing agents won't have the code to know to update themselves! Catch-22. So the first solution will need to be server-side only to force the agents to auto-update themselves.

2) To support this feature in the future and NOT require user-intervention (which the solution to #1 above will require) then I suggest making some agent-side fixes.

Here are my proposed solutions:

1) Add a new row to RHQ_SYSTEM_CONFIG which in turn requires a new GUI element in the Administration>SystemSettings UI page. Next to the existing radio button for "Enable Agent Auto-Update" I think we need to add a "Require All Older Agents To Update" or whatever we want to call it (don't know a good way to word it).

Before a user goes to update the agents for our very next release, they will need to flip that "Require all older agents to update" to true. Under the covers, this will disable the regex check and go back to our strict checking.

Now when an agent (version GA) connects, and its auto-update is enabled, it will be forced to upgrade to version "Update-01" or whatever the latest version is.

We do this here:

org.rhq.enterprise.server.core.AgentManagerBean.isAgentVersionSupported(AgentVersion)

In that method, we now have to ask the SystemSettings SLSB to give us the value of the flag "Require all older agents to update". If that value is true, we don't do the regex check, we do the strict equals check:

// this is pseudo-code - I forget the actual System Settings SLSB API - but you get the idea :)
boolean isUpdateAllOlderAgents = systemSettingsManagerBean.getSystemSetting(IS_UPDATE_ALL_OLDER_AGENTS);
// add the "isUpdateAllOlderAgents == false" check to the existing if-statement
if ((isUpdateAllOlderAgents == false) && (supportedAgentVersions == null || supportedAgentVersions.isEmpty())) {
....and the rest

This should fix the first time people need to upgrade old agents. The user can flip that "require all older agents to upgrade" back to false and that brings back the regex checking again.
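The behaviour described in prose above (flag on means strict equality, flag off means the relaxed regex check) can be sketched roughly as follows. This is only an illustration: the class name, method signature, and parameter names are hypothetical and are not the real AgentManagerBean code. It also assumes that when no regex is configured, the check falls back to strict equality.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in for the server-side version check; not the real RHQ code. */
final class AgentVersionCheckSketch {

    /**
     * @param agentVersion                  version reported by the connecting agent
     * @param latestAgentVersion            version of the agent distro held by the server
     * @param supportedVersionsRegex        relaxed pattern of accepted versions; may be null or empty
     * @param requireAllOlderAgentsToUpdate the proposed system setting (hypothetical name)
     */
    static boolean isAgentVersionSupported(String agentVersion, String latestAgentVersion,
            String supportedVersionsRegex, boolean requireAllOlderAgentsToUpdate) {
        boolean noRegex = supportedVersionsRegex == null || supportedVersionsRegex.isEmpty();
        if (requireAllOlderAgentsToUpdate || noRegex) {
            // Strict check: only the exact latest version is considered supported.
            // (Falling back to this when no regex is configured is an assumption.)
            return agentVersion.equals(latestAgentVersion);
        }
        // Relaxed check: any version matching the configured pattern is supported.
        return Pattern.matches(supportedVersionsRegex, agentVersion);
    }

    public static void main(String[] args) {
        String regex = "1\\.0\\..*"; // illustrative pattern only
        System.out.println(isAgentVersionSupported("1.0.GA", "1.0.UPDATE2", regex, false));
        System.out.println(isAgentVersionSupported("1.0.GA", "1.0.UPDATE2", regex, true));
    }
}
```

With the flag off, an older agent that matches the pattern is treated as supported and is left alone. With the flag on, it is reported as unsupported, which is what pushes an auto-update-enabled agent to upgrade.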

2) Notice this requires users to know to flip that switch in the admin page. We don't want to force users to remember to manually flip that switch every time they upgrade agents. So to fix this for future upgrades, put the following fix in. Because it involves agent code, it won't work the first time the agent update occurs: the old agents won't have this code yet to run!

In ConnectAgentResults, we need to add a field "newestAgentVersion" which the server will fill in with the latest agent version that it knows about (it's the version in the agent distro - see rhq-server-agent-version.properties).

In here: org.rhq.enterprise.server.core.CoreServerServiceImpl.connectAgent the server should return a results object with the newest agent version filled in the returned object.

In AgentMain there is this already:

  ConnectAgentResults results = (ConnectAgentResults) connectResponse.getResults();

So when the agent gets that new results object, it will ask itself two questions:

1) is my auto-update enabled flag turned on?
2) does my version match the newest agent version?

If the agent answers yes to question #1 and no to question #2, it will auto-update itself. No need for any user intervention (no need for the user to click any buttons). The agent will know "I am version 1.0.GA, but the latest agent is 1.0.UPDATE2, so I need to upgrade myself to go to 1.0.UPDATE2".
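The two questions above could be sketched on the agent side roughly as follows. The classes here are hypothetical, trimmed-down stand-ins and not the real ConnectAgentResults or AgentMain. Treating a missing newestAgentVersion (for example, from a server that does not send the field) as "do not update" is an assumption made for this sketch.

```java
/** Hypothetical stand-in for the results object the server returns on connect. */
final class ConnectResultsSketch {
    private final String newestAgentVersion; // null if the server did not fill it in

    ConnectResultsSketch(String newestAgentVersion) {
        this.newestAgentVersion = newestAgentVersion;
    }

    String getNewestAgentVersion() {
        return newestAgentVersion;
    }
}

/** Hypothetical agent-side decision; not the real AgentMain logic. */
final class AgentUpdateDecisionSketch {

    static boolean shouldAutoUpdate(boolean autoUpdateEnabled, String myVersion,
            ConnectResultsSketch results) {
        // Question 1: is my auto-update enabled flag turned on?
        if (!autoUpdateEnabled) {
            return false;
        }
        String newest = results.getNewestAgentVersion();
        // Assumption: without a version from the server there is nothing to compare against.
        if (newest == null) {
            return false;
        }
        // Question 2: does my version match the newest agent version?
        return !newest.equals(myVersion);
    }

    public static void main(String[] args) {
        ConnectResultsSketch results = new ConnectResultsSketch("1.0.UPDATE2");
        System.out.println(shouldAutoUpdate(true, "1.0.GA", results));
        System.out.println(shouldAutoUpdate(false, "1.0.GA", results));
    }
}
```

The comparison is a plain inequality rather than "is newer than", which follows the wording above: the agent updates whenever its version differs from the one the server offers.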

Comment 2 Larry O'Leary 2014-07-30 20:14:32 UTC
I think 1) (server side / temporary fix) will only introduce more confusion. 

Instead, we should just accept the fact that if you installed agent 3.2, 3.2.1, or 3.2.2, you will need to manually update the agent if you need to run the later version.

What would be ideal is a new operation on the agent topology page (server side) that can invoke update --update, or some other currently supported agent operation, on each selected agent. However, as captured in Bug 1124619, even this may require an agent-side fix, so it might not help users quickly get out of this situation, and again we fall back to the manual update.

As for 2), this seems like the appropriate fix.

... connect to server ...
... is my auto-update enabled ... if so, does my version match the server's agent version ... if not, perform auto-update ...


Users would then have to manually update to the agent version that includes this fix.

Comment 3 John Mazzitelli 2014-07-30 21:09:35 UTC
OK, as per Larry's last post, I will just concentrate on solution 2.

We should document somewhere that agent auto-update feature *today* will only update the agent *IF* the current agent is an unsupported version.

We will change that such that the agent will auto-upgrade itself if it doesn't match the exact version of the agent update distro in the server - even if the current agent version is still supported by the server.

Just keep in mind what this means. It means any agent with auto-update enabled will always be the latest-n-greatest and essentially that whole "regex" checking for compatible agent versions is not used.

The regex checking for "compatible" agents will only come into play for those agents whose "agent auto update" feature is *disabled*.
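To make the resulting behaviour concrete, here is a rough sketch of the outcomes once this change is in place. The enum, class, and parameter names are hypothetical. Rejecting an agent that has auto-update disabled and does not match the regex is an assumption of this sketch, not something stated in this bug.

```java
import java.util.regex.Pattern;

/** Hypothetical summary of what happens to a connecting agent after this change. */
final class AgentConnectOutcomeSketch {

    enum Outcome { CONNECT, AUTO_UPDATE, REJECT }

    static Outcome decide(boolean autoUpdateEnabled, String agentVersion,
            String latestAgentVersion, String supportedVersionsRegex) {
        // An agent already on the exact latest version just connects.
        if (agentVersion.equals(latestAgentVersion)) {
            return Outcome.CONNECT;
        }
        // Auto-update enabled: the regex is never consulted, the agent always updates.
        if (autoUpdateEnabled) {
            return Outcome.AUTO_UPDATE;
        }
        // Auto-update disabled: only here does the "compatible versions" regex matter.
        boolean compatible = supportedVersionsRegex != null
                && !supportedVersionsRegex.isEmpty()
                && Pattern.matches(supportedVersionsRegex, agentVersion);
        // Assumption: an incompatible agent that will not update itself is turned away.
        return compatible ? Outcome.CONNECT : Outcome.REJECT;
    }

    public static void main(String[] args) {
        String regex = "1\\.0\\..*"; // illustrative pattern only
        System.out.println(decide(true, "1.0.GA", "1.0.UPDATE2", regex));
        System.out.println(decide(false, "1.0.GA", "1.0.UPDATE2", regex));
    }
}
```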

Comment 5 John Mazzitelli 2014-07-31 18:43:11 UTC
Pull request created off branch bug/1124614: https://github.com/rhq-project/rhq/pull/102

Comment 6 Jay Shaughnessy 2014-08-01 13:50:39 UTC
Reviewed and Merged PR. Master Commits:

commit 8576ff2cf9d7e82e6a2ba6422763cb81d17839cf
Merge: 120bf6e d3d57a4
Author: jshaughn <jshaughn>
Date:   Thu Jul 31 17:14:52 2014 -0400

    Merge pull request #102 from rhq-project/bug/1124614

    BZ 1124614 - auto-update agent if its version doesn't match the latest agent version

commit d3d57a45377d007ccd10febddf1e093330e270f8
Author: John Mazzitelli <mazz>
Date:   Thu Jul 31 14:39:51 2014 -0400

    BZ 1124614 - if an agent's auto-update is enabled, then always update itself if its version is not the same as the latest agent version of the agent distro in the server

Comment 7 Simeon Pinder 2014-08-05 19:00:31 UTC
Taking this to merge to release/jon3.2.x.

Comment 8 Simeon Pinder 2014-08-10 20:49:44 UTC
This is fixed with commit dafb69168a to release/jon3.2.x. Moving to MODIFIED for testing in next build.

Comment 9 Simeon Pinder 2014-08-15 03:19:03 UTC
Moving to ON_QA as this is available for test in JON 3.2.3 ER01 build:

http://jon01.mw.lab.eng.bos.redhat.com:8042/dist/release/jon/3.2.3.GA/8-14-14/

Comment 10 Garik Khachikyan 2014-08-18 09:08:59 UTC
taking QA contact.

Comment 11 Garik Khachikyan 2014-08-21 12:17:50 UTC
# REOPEN

The following scenario does not work for me:
1. install JON 3.2.0 GA server
2. plug in a Linux agent of 3.2.0 GA (jar retrieved from the server's Downloads page) with "Enable Agent Update" = yes
3. stop that agent
4. update server to 3.2.3 ER01
5. start all services on JON server
6. start the agent (it logs the "old" version)
7. wait some time for discovery to happen, then stop the agent
8. start it again <--- HERE it should log the updated agent version but it does NOT

===
2014-08-21 08:09:05,015 INFO  [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.9.0.JON320GA], Build Number=[734bd56], Build Date=[Dec 12, 2013 11:38 AM]
===
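For anyone scripting this verification, the version and build number can be pulled out of the identify-version line shown above. The following is a minimal sketch that assumes the log line keeps exactly the format seen in this comment. The class and method names are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for checking which agent version a log line reports. */
final class AgentVersionLogParserSketch {

    // Captures the bracketed Version and Build Number values of the identify-version message.
    private static final Pattern VERSION_LINE = Pattern.compile(
            "\\{AgentMain\\.identify-version\\}Version=\\[([^\\]]+)\\], Build Number=\\[([^\\]]+)\\]");

    /** Returns {version, buildNumber} if the line is an identify-version message. */
    static Optional<String[]> parse(String logLine) {
        Matcher m = VERSION_LINE.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    public static void main(String[] args) {
        String line = "2014-08-21 08:09:05,015 INFO  [main] (org.rhq.enterprise.agent.AgentMain)- "
                + "{AgentMain.identify-version}Version=[RHQ 4.9.0.JON320GA], Build Number=[734bd56], "
                + "Build Date=[Dec 12, 2013 11:38 AM]";
        parse(line).ifPresent(v -> System.out.println(v[0] + " / " + v[1]));
    }
}
```

Comparing the extracted build number before and after the restart gives a quick signal of whether the agent actually replaced itself.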

Comment 12 Garik Khachikyan 2014-08-21 12:21:48 UTC
Created attachment 929184 [details]
agent log

Comment 13 Garik Khachikyan 2014-08-21 12:22:09 UTC
Created attachment 929185 [details]
server log

Comment 14 Garik Khachikyan 2014-08-21 12:24:27 UTC
Created attachment 929186 [details]
agent's enable update is true

Comment 15 Garik Khachikyan 2014-08-21 14:01:27 UTC
# COMMENT

BTW - got this while starting my agent (Windows) for the first time after the server upgrade.

===
2014-08-21 14:59:15,293 WARN  [RHQ Server Polling Thread] (org.rhq.enterprise.agent.PluginUpdate)- {PluginUpdate.plugin-not-on-server}The plugin [plugins\rhq-database-plugin-4.9.0.JON320GA.jar] does not exist on the Server - renaming it to [rhq-database-plugin-4.9.0.JON320GA.jar.REJECTED] so it will not get deployed by the Plugin Container.
===

Comment 16 Simeon Pinder 2014-08-21 15:47:32 UTC
Re: https://bugzilla.redhat.com/show_bug.cgi?id=1124614#c11, step 8, this will not test this update. The problem is that all agents before Update 03 were not smart enough to phone home and auto-update themselves if a newer agent was available on the server.  See https://bugzilla.redhat.com/show_bug.cgi?id=1124614#c2 for more details.  Let me work with Larry and Mazz to see if we can come up with a simpler test process here.

Comment 17 Larry O'Leary 2014-08-21 15:54:07 UTC
I thought we already discussed the test case here. Perhaps that was on a different BZ.

We cannot expect the auto-update feature to work when going from a pre-3.2.3 agent to 3.2.3. This is because the bug that prevents the update from happening is in 3.2.0, 3.2.1, and 3.2.2, and the fix is in 3.2.3. In comment 11, a pre-3.2.3 agent was installed and being updated, so this bug still applies to it. However, beginning with the 3.2.3 agent, installing a hotfix, agent update, or agent upgrade should now work.

The only way to test is:

 - install 3.2.3 agent
 - modify 3.2.3 agent on the server as if it was a "new version"
 - verify that the 3.2.3 agent is updated to the "new version"

