Bug 1124614
| Summary: | Automatic agent update at agent start no longer works | | |
|---|---|---|---|
| Product: | [JBoss] JBoss Operations Network | Reporter: | Larry O'Leary <loleary> |
| Component: | Agent | Assignee: | Simeon Pinder <spinder> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Garik Khachikyan <gkhachik> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | JON 3.2.2 | CC: | gkhachik, jshaughn, ksuzumur, mazz, mkoci, myarboro, spinder |
| Target Milestone: | ER01 | Keywords: | Regression, Triaged |
| Target Release: | JON 3.2.3 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-09-05 15:40:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Larry O'Leary
2014-07-29 22:54:59 UTC
Here's what I propose to fix this.

1) First off, for the very first application of any fix, we can't have changes to agent code because existing agents won't have the code to know to update themselves! Catch-22. So the first solution will need to be server-side only to force the agents to auto-update themselves.

2) To support this feature in the future and NOT require user intervention (which the solution to #1 above will require), I suggest making some agent-side fixes.

Here are my proposed solutions:

1) Add a new row to RHQ_SYSTEM_CONFIG, which in turn requires a new GUI element in the Administration>SystemSettings UI page. Next to the existing radio button for "Enable Agent Auto-Update" I think we need to add a "Require All Older Agents To Update" or whatever we want to call it (don't know a good way to word it). Before a user goes to update the agents for our very next release, they will need to flip that "Require all older agents to update" to true. Under the covers, this will disable the regex check and go back to our strict checking. Now when an agent (version GA) connects, and its auto-update is enabled, it will be forced to upgrade to version "Update-01" or whatever the latest version is.

We do this here:

org.rhq.enterprise.server.core.AgentManagerBean.isAgentVersionSupported(AgentVersion)

In that method, we now have to ask the SystemSettings SLSB to give us the value of the flag "Require all older agents to update".
If that value is true, we don't do the regex check, we do the strict equals check:

    // this is pseudo-code - I forget the actual System Settings SLSB API - but you get the idea :)
    boolean isUpdateAllOlderAgents = systemSettingsManagerBean.getSystemSetting(IS_UPDATE_ALL_OLDER_AGENTS);
    // add the "isUpdateAllOlderAgents == false" check to the existing if-statement
    if ((isUpdateAllOlderAgents == false) && (supportedAgentVersions == null || supportedAgentVersions.isEmpty())) {
        ....and the rest

This should fix the first time people need to upgrade old agents. The user can flip that "require all older agents to upgrade" back to false and that brings back the regex checking again.

2) Notice this requires users to know to flip that switch in the admin page. We don't want to force users to remember to flip that switch manually every time they upgrade agents. So to fix this for future upgrades, put this fix in. Since this has agent code, it won't work the first time the agent update occurs, because the old agents won't have this code yet to run!

In ConnectAgentResults, we need to add a field "newestAgentVersion" which the server will fill in with the latest agent version that it knows about (it's the version in the agent distro - see rhq-server-agent-version.properties). In here:

org.rhq.enterprise.server.core.CoreServerServiceImpl.connectAgent

the server should return a results object with the newest agent version filled in.

In AgentMain there is this already:

    ConnectAgentResults results = (ConnectAgentResults) connectResponse.getResults();

So when the agent gets that new results object, it will ask itself two questions:

1) is my auto-update enabled flag turned on?
2) does my version match the newest agent version?

If the agent answers yes to question #1 and no to question #2, it will auto-update itself. No need for any user intervention (no need for the user to click any buttons).
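The two questions above can be sketched as a single predicate. This is only an illustration of the proposed rule, not the code that was merged: the class and method names are hypothetical, versions are treated as plain strings, and treating a missing newest version (for example, from a server that predates the new field) as "do not update" is an assumption.

```java
/** Hypothetical sketch of the agent-side check described above. */
final class AutoUpdateDecision {
    private AutoUpdateDecision() {
    }

    /**
     * Returns true when the agent should update itself: its auto-update flag
     * is on (question 1) and its version differs from the newest agent
     * version reported by the server (question 2).
     */
    static boolean shouldAutoUpdate(boolean autoUpdateEnabled, String myVersion, String newestAgentVersion) {
        if (!autoUpdateEnabled) {
            return false; // question 1 answered "no"
        }
        if (newestAgentVersion == null) {
            return false; // assumption: nothing to compare against, so stay as-is
        }
        return !newestAgentVersion.equals(myVersion); // question 2 answered "no" => update
    }
}
```

With these assumptions, `AutoUpdateDecision.shouldAutoUpdate(true, "1.0.GA", "1.0.UPDATE2")` is true, while the same call with the flag off, or with matching versions, is false.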
The agent will know "I am version 1.0.GA, but the latest agent is 1.0.UPDATE2, so I need to upgrade myself to go to 1.0.UPDATE2".

I think 1) (server side / temporary fix) will only introduce more confusion. Instead, we should just accept the fact that if you installed agent 3.2, 3.2.1, or 3.2.2, you will need to manually update the agent if you need to run the later version. What would be ideal is a new operation on the agent topology page (server side) that can invoke update --update, or some other currently supported agent operation, on each agent that is selected. However, as captured in Bug 1124619, even this may require an agent-side fix, so it might not help users get out of this situation quickly, and again we fall back to the manual update.

As for 2), this seems like the appropriate fix.

... connect to server ...
... is my auto-update enabled ...
    if so, does my version match the server's agent version ...
        if not, perform auto-update ...

Users would then have to manually update to the agent version that includes this fix.

ok, as per larry's last post, I will just concentrate on solution 2.

We should document somewhere that the agent auto-update feature *today* will only update the agent *IF* the current agent is an unsupported version. We will change that such that the agent will auto-upgrade itself if it doesn't match the exact version of the agent update distro in the server - even if the current agent version is still supported by the server.

Just keep in mind what this means. It means any agent with auto-update enabled will always be the latest-n-greatest, and essentially that whole "regex" checking for compatible agent versions is not used. The regex checking for "compatible" agents will only come into play for those agents whose "agent auto update" feature is *disabled*.

Pull request created off branch bug/1124614: https://github.com/rhq-project/rhq/pull/102

Reviewed and Merged PR.
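The behaviour agreed on in this thread (exact-match check for agents with auto-update enabled, regex check only for agents with it disabled) can be sketched as follows. All names here are hypothetical and the regex is passed in as a plain string; rejecting a disabled agent that fails the regex is an assumption about what "unsupported" means, not something taken from the merged code.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of the connect-time outcome for an agent. */
final class AgentConnectPolicy {
    enum Outcome { CONNECT, AUTO_UPDATE, REJECT }

    private AgentConnectPolicy() {
    }

    static Outcome classify(boolean autoUpdateEnabled, String agentVersion,
                            String latestAgentVersion, String supportedVersionsRegex) {
        if (latestAgentVersion.equals(agentVersion)) {
            return Outcome.CONNECT; // already the exact version of the server's agent distro
        }
        if (autoUpdateEnabled) {
            return Outcome.AUTO_UPDATE; // any mismatch triggers an update; the regex is never consulted
        }
        // Auto-update disabled: only now does the "compatible versions" regex matter.
        return Pattern.matches(supportedVersionsRegex, agentVersion)
                ? Outcome.CONNECT
                : Outcome.REJECT; // assumption: unsupported agents are turned away
    }
}
```

The point of the sketch is the ordering: the regex branch is reachable only when auto-update is off, which is why an agent with auto-update enabled always ends up on the latest version.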
Master Commits:

commit 8576ff2cf9d7e82e6a2ba6422763cb81d17839cf
Merge: 120bf6e d3d57a4
Author: jshaughn <jshaughn>
Date: Thu Jul 31 17:14:52 2014 -0400

    Merge pull request #102 from rhq-project/bug/1124614

    BZ 1124614 - auto-update agent if its version doesn't match the latest agent version

commit d3d57a45377d007ccd10febddf1e093330e270f8
Author: John Mazzitelli <mazz>
Date: Thu Jul 31 14:39:51 2014 -0400

    BZ 1124614 - if an agent's auto-update is enabled, then always update itself if its version is not the same as the latest agent version of the agent distro in the server

Taking this to merge to release/jon3.2.x.

This is fixed with commit dafb69168a to release/jon3.2.x. Moving to MODIFIED for testing in next build.

Moving to ON_QA as this is available for test in JON 3.2.3 ER01 build: http://jon01.mw.lab.eng.bos.redhat.com:8042/dist/release/jon/3.2.3.GA/8-14-14/

taking QA contact.

# REOPEN
The following is the scenario that I performed, and it is not working for me:
1. install JON 3.2.0 GA server
2. plug in a Linux agent of 3.2.0 GA (jar retrieved from the server's Downloads page) with "Enable Agent Update" = yes
3. stop that agent
4. update server to 3.2.3 ER01
5. start all services on JON server
6. start the agent (it logs the "old" version)
7. wait some time for discovery to happen, then stop the agent
8. start it again <--- HERE it should be logging the updated agent version but it is NOT

===
2014-08-21 08:09:05,015 INFO [main] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.identify-version}Version=[RHQ 4.9.0.JON320GA], Build Number=[734bd56], Build Date=[Dec 12, 2013 11:38 AM]
===

Created attachment 929184 [details]
agent log
Created attachment 929185 [details]
server log
Created attachment 929186 [details]
agent's enable update is true
# COMMENT
BTW - I got this while starting my agent (Windows) for the first time right after the server upgrade.

===
2014-08-21 14:59:15,293 WARN [RHQ Server Polling Thread] (org.rhq.enterprise.agent.PluginUpdate)- {PluginUpdate.plugin-not-on-server}The plugin [plugins\rhq-database-plugin-4.9.0.JON320GA.jar] does not exist on the Server - renaming it to [rhq-database-plugin-4.9.0.JON320GA.jar.REJECTED] so it will not get deployed by the Plugin Container.
===

Re: https://bugzilla.redhat.com/show_bug.cgi?id=1124614#c11, step 8: this will not test this update. The problem is that all agents before Update 03 were not smart enough to phone home and auto-update themselves if a newer agent was available on the server. See https://bugzilla.redhat.com/show_bug.cgi?id=1124614#c2 for more details. Let me work with Larry and Mazz to see if we can come up with a simpler test process here.

I thought we already discussed the test case here. Perhaps that was on a different BZ. We cannot expect the auto-update feature to work from <3.2.3 to 3.2.3. This is because the bug that prevents the update from happening is in 3.2.0, 3.2.1, and 3.2.2. The fix is in 3.2.3. However, in comment 11 a pre-3.2.3 agent was installed and being updated. Therefore, this bug still applies. However, beginning with the 3.2.3 agent, installing a hotfix or agent update or agent upgrade should now work.

The only way to test is:
- install 3.2.3 agent
- modify 3.2.3 agent on the server as if it was a "new version"
- verify that the 3.2.3 agent is updated to the "new version"
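For the verification step, one option is to read the version the agent reports at startup from its log, using the {AgentMain.identify-version} line format quoted earlier in this bug. The helper below is a hypothetical sketch, not part of RHQ; it assumes the log line keeps the `Version=[...]` shape shown in comment 11.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: pulls the agent version out of an identify-version log line. */
final class AgentVersionLogLine {
    // Matches the "Version=[...]" part of the identify-version message.
    private static final Pattern VERSION = Pattern.compile("Version=\\[([^\\]]+)\\]");

    private AgentVersionLogLine() {
    }

    /** Returns the text inside Version=[...], or empty if the line has no such field. */
    static Optional<String> versionOf(String logLine) {
        Matcher matcher = VERSION.matcher(logLine);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

Comparing the value extracted before and after the agent restart shows whether the agent actually moved to the "new version".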