Bug 1013674

Summary: Upgraded agent was writing to rhq-agent-OLD/logs/agent.log after upgrade from JON3.1.0.GA to JON3.2.ER1 (only on local agent)
Product: [JBoss] JBoss Operations Network
Reporter: Filip Brychta <fbrychta>
Component: Agent, Upgrade
Assignee: John Mazzitelli <mazz>
Status: CLOSED NOTABUG
QA Contact: Mike Foley <mfoley>
Severity: high
Priority: unspecified
Version: JON 3.2
CC: mazz
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2013-10-14 15:39:38 UTC
Type: Bug
Bug Blocks: 1010354    

Description Filip Brychta 2013-09-30 14:58:51 UTC
Description of problem:
I upgraded JON3.1.0.GA to JON3.2.ER1. The local agent (the agent on the same machine as the RHQ server) was upgraded, but rhq-agent-OLD/logs/agent.log was rewritten: the first message in this log is from a point after the upgrade was started. The old logs of the remote agent were untouched, and their last message was 'Now executing agent update - if all goes well, this is the last you will hear of this agent:'

There were two agent processes running after the upgrade. One of them disappeared after ~5 minutes. The upgraded agent process was still running but was writing to rhq-agent-OLD/logs/agent.log. I know it was the upgraded agent process because ps shows it using 'lib/rhq-core-client-api-4.9.0.JON320ER1.jar'.
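The check described above (which agent process holds which agent.log open) can be sketched in shell. This is a minimal, Linux-only sketch that reads /proc/<pid>/fd; the function name open_agent_logs is hypothetical and not part of RHQ, and it assumes the agent processes belong to the current user.

```shell
#!/bin/sh
# Hypothetical helper: print every file named agent.log that the
# process with the given PID currently holds open.
# Linux-only: relies on the /proc/<pid>/fd symlinks.
open_agent_logs() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip unreadable ones.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            */agent.log) printf '%s\n' "$target" ;;
        esac
    done
}

# Example (not run here): inspect every process whose command line
# mentions the agent jar seen in the ps output.
#   for pid in $(pgrep -f rhq-core-client-api); do
#       echo "PID $pid:"; open_agent_logs "$pid"
#   done
```

A path under rhq-agent-OLD/logs/ in the output for the upgraded process would confirm the behaviour reported here.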

This is most likely caused by bz 1012289

Version-Release number of selected component (if applicable):
Version: 3.2.0.ER1
Build Number: 54dd29c:464a643

How reproducible:
1/1

Steps to Reproduce:
1. JON3.1.0.GA server and agent are running 
2. unzip jon-server-3.2.0.ER1.zip
3. cd jon-server-3.2.0.ER1/bin/
4. ./rhqctl upgrade --from-server-dir /home/hudson/jon-server-3.1.0.GA/ --run-data-migrator do-it --storage-data-root-dir /home/hudson/

Comment 1 John Mazzitelli 2013-10-11 20:05:41 UTC
When you are upgrading from an older JON *and* you have an agent co-located on the same machine as your JON Server, you need to tell the installer where your agent is installed.

Looking at your command line from this issue's description:

   ./rhqctl upgrade --from-server-dir /home/hudson/jon-server-3.1.0.GA/ --run-data-migrator do-it --storage-data-root-dir /home/hudson/

I do not see this. You are missing --from-agent-dir. From the --help documentation:

Upgrades RHQ services from an earlier installed version
    --from-agent-dir <arg>            Full path to install directory of
                                      the RHQ Agent to be upgraded.
                                      Required only if an existing agent
                                      exists and is not installed in the
                                      default location:
                                      <from-server-dir>/../rhq-agent

I suspect this is part of the problem.
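As a rough illustration of the rule in the help text, the sketch below prints the --from-agent-dir option only when the agent is not in the default location <from-server-dir>/../rhq-agent. The helper name agent_dir_flag and the example path /opt/rhq-agent are hypothetical; only the option itself and the default-location rule come from the help output quoted above.

```shell
#!/bin/sh
# Hypothetical helper: decide whether --from-agent-dir is needed.
#   $1 = old server install directory (the --from-server-dir value)
#   $2 = actual install directory of the co-located agent
# Prints "--from-agent-dir <dir>" when the agent is NOT in the default
# location <from-server-dir>/../rhq-agent, and prints nothing otherwise.
agent_dir_flag() {
    server_dir="${1%/}"     # drop one trailing slash, if present
    agent_dir="${2%/}"
    default_dir="$(dirname "$server_dir")/rhq-agent"
    if [ "$agent_dir" != "$default_dir" ]; then
        printf '%s %s\n' '--from-agent-dir' "$agent_dir"
    fi
}

# Example (not run here); /opt/rhq-agent is an assumed agent location.
# The substitution is left unquoted on purpose so that it expands to two
# words, which means paths containing spaces are not supported.
#   ./rhqctl upgrade --from-server-dir /home/hudson/jon-server-3.1.0.GA/ \
#       $(agent_dir_flag /home/hudson/jon-server-3.1.0.GA/ /opt/rhq-agent) \
#       --run-data-migrator do-it --storage-data-root-dir /home/hudson/
```

The comparison is a plain string match, so both paths should be given as absolute paths without symlinks.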

Comment 2 John Mazzitelli 2013-10-11 20:37:20 UTC
After looking more closely at the reproduction steps, I see they were missing a very important step. See:

https://docs.jboss.org/author/display/RHQ/Upgrading+the+Server

where the first step says:

"Stop agents installed with rhqctl and wait for them to fully shutdown"

So, you must stop the agent that is co-located with the server prior to upgrading. Wait for it to shut down. The reason why? Because things like this BZ might happen if you don't :)

Comment 3 Filip Brychta 2013-10-14 09:01:15 UTC
(In reply to John Mazzitelli from comment #2)

Please see bug 1012289, comment 2. I was confused by "Stop agents installed with rhqctl". Previous versions of JON are not installed via rhqctl, so I thought this step didn't apply to older co-located agents. I guess both BZs could be fixed just by updating the documentation.

Comment 4 John Mazzitelli 2013-10-14 15:39:38 UTC
See bug #1018887, which will make sure this is documented.