Red Hat Bugzilla – Bug 1013674
Upgraded agent was writing to rhq-agent-OLD/logs/agent.log after upgrade from JON3.1.0.GA to JON3.2.ER1 (only on local agent)
Last modified: 2013-10-14 11:39:38 EDT
Description of problem:
I upgraded JON3.1.0.GA to JON3.2.ER1. The local agent (the agent on the same machine as the RHQ server) was upgraded, but rhq-agent-OLD/logs/agent.log was rewritten: the first message in this log is from a point after the upgrade was started. The old logs of the remote agent were untouched, and their last message was 'Now executing agent update - if all goes well, this is the last you will hear of this agent:'
There were two agent processes running after the upgrade. One of them disappeared after about 5 minutes. The upgraded agent process was still running but was writing to rhq-agent-OLD/logs/agent.log. I know it was the upgraded agent process because ps shows it using 'lib/rhq-core-client-api-4.9.0.JON320ER1.jar'.
This is most likely caused by bz 1012289
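One way to confirm which log files a given agent process holds open on Linux is to inspect its file descriptors under /proc. The following is a minimal sketch and not part of the original report: the function name open_log_files is made up for illustration, and it assumes a Linux /proc filesystem and permission to inspect the process.

```shell
# Print every open file of process $1 whose path ends in ".log".
# Assumes Linux (/proc/<pid>/fd) and sufficient permissions.
open_log_files() {
    for fd in /proc/"$1"/fd/*; do
        # Each entry is a symlink to the open file; skip unreadable ones.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *.log) echo "$target" ;;
        esac
    done
}
```

Calling open_log_files with the PID reported by ps would be expected to list rhq-agent-OLD/logs/agent.log if the upgraded agent is still writing there.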
Version-Release number of selected component (if applicable):
Build Number: 54dd29c:464a643
Steps to Reproduce:
1. JON3.1.0.GA server and agent are running
2. unzip jon-server-3.2.0.ER1.zip
3. cd jon-server-3.2.0.ER1/bin/
4. ./rhqctl upgrade --from-server-dir /home/hudson/jon-server-3.1.0.GA/ --run-data-migrator do-it --storage-data-root-dir /home/hudson/
When you are upgrading from an older JON *and* you have an agent co-located on the same machine as your JON Server, you need to tell the installer where your agent is installed.
Looking at your command line from this issue's description:
./rhqctl upgrade --from-server-dir /home/hudson/jon-server-3.1.0.GA/ --run-data-migrator do-it --storage-data-root-dir /home/hudson/
I do not see this. You are missing --from-agent-dir. From the --help documentation:
Upgrades RHQ services from an earlier installed version
--from-agent-dir <arg>    Full path to install directory of
                          the RHQ Agent to be upgraded.
                          Required only if an existing agent
                          exists and is not installed in the
I suspect this is part of the problem.
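For the command line from the description, adding the missing option might look like the sketch below. The agent path /home/hudson/rhq-agent is an assumption for illustration only; substitute the real install directory of the co-located agent. All other options are copied unchanged from the description. The snippet only prints the command so it can be reviewed before being run from jon-server-3.2.0.ER1/bin/.

```shell
# Hypothetical location of the old co-located agent; adjust to your install.
OLD_AGENT_DIR=/home/hudson/rhq-agent

# Same options as in the description, plus --from-agent-dir.
upgrade_cmd="./rhqctl upgrade --from-server-dir /home/hudson/jon-server-3.1.0.GA/ --from-agent-dir $OLD_AGENT_DIR --run-data-migrator do-it --storage-data-root-dir /home/hudson/"

# Dry run: print the command instead of executing it.
echo "$upgrade_cmd"
```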
After looking more closely at the reproduction steps, I see they are missing very important steps. See:
where the first step says:
"Stop agents installed with rhqctl and wait for them to fully shutdown"
So, you must stop the agent that is co-located with the server prior to upgrading. Wait for it to shut down. The reason why? Because things like this BZ might happen if you don't :)
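The "wait for it to shut down" part can be scripted. The following is a minimal sketch under stated assumptions: wait_for_exit is a hypothetical helper (not part of rhqctl or the agent), the agent's PID is already known (for example from ps), and the agent has already been told to stop by whatever means you normally use.

```shell
# Wait until process $1 is gone, polling once per second.
# $2 is the timeout in seconds (default 60).
# Returns 0 once the process has exited, 1 on timeout.
# Note: kill -0 also fails for a process owned by another user,
# so run this as the user that owns the agent.
wait_for_exit() {
    pid=$1
    timeout=${2:-60}
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A cautious upgrade would then start rhqctl upgrade only after wait_for_exit has returned 0 for the old agent's PID.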
(In reply to John Mazzitelli from comment #2)
Please see bug 1012289, comment 2. I was confused by "Stop agents installed with rhqctl". Previous versions of JON are not installed via rhqctl, so I thought this step didn't apply to the older co-located agents. I guess both BZs could be fixed just by updating the documentation.
See bug #1018887, which will make sure this is documented.