Bug 1023019

Summary: Windows 2008 - Upgrade to JON3.2.ER3 freezes on 'Updating RHQ Agent Service'
Product: [JBoss] JBoss Operations Network
Reporter: Filip Brychta <fbrychta>
Component: Upgrade
Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact:
Priority: unspecified
Version: JON 3.2
CC: dlackey, hrupp, jshaughn
Target Milestone: CR01   
Target Release: JON 3.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1037824
Environment:
Last Closed:
Type: Bug
Bug Depends On:    
Bug Blocks: 1010354, 1012435, 1037824    
Attachments:
Description    Flags
all logs       none
screen shot    none

Description Filip Brychta 2013-10-24 13:09:08 UTC
Created attachment 815778: all logs

Description of problem:
During an upgrade from jon3.1.2.GA to jon3.2.ER3 on a Windows machine, the installer freezes after removing the old agent service. Killing the installer triggers a rollback, and the system is left un-upgraded.

Using services.msc I can see that the old agent service (the one from jon3.1.2.GA) is removed, but the new agent service (which should be installed via rhqctl) is not there.

I have the following setup:
OS - Windows 2008 Server, 64-bit
JDK - Oracle jre1.7.0_45, 32-bit
 

Version-Release number of selected component (if applicable):
Version: 3.2.0.ER3
Build Number: c0742ed:90dd474

How reproducible:
3/3

Steps to Reproduce:
1. jon3.1.2.GA is installed under the 'Administrator' user, and the agent service runs as 'Administrator'
2. Stop the jon server and the jon agent
3. Uninstall the jon server service: rhq-server.bat remove
4. rhqctl upgrade --from-server-dir c:\jon-server-3.1.2.GA --from-agent-dir c:\rhq-agent --run-data-migrator do-it

Actual results:
The upgrade freezes at the line 'RHQ Agent [rhqagent-WIN-2008] removed.'
The old agent service is removed, but the new agent service is not created.

Expected results:
The new agent service is created and the upgrade finishes successfully.

Additional info:
I tried to remove the old agent service manually (rhq-agent-wrapper.bat remove) before the upgrade, but it didn't solve the problem. The upgrade froze, and the last lines were:
00:55:33,753 INFO  [org.rhq.server.control.command.Upgrade] Updating RHQ Agent Service...
The rhqagent-WIN-2008 service is not installed - The specified service does not exist as an installed service. (0x424)
The rhqagent-WIN-2008 service is not installed - The specified service does not exist as an installed service. (0x424)


When I kill the installer, I can see in Process Explorer that the wrapper.exe which handles rhq-agent is still running. So the problem is probably in the wrapper.


Logs attached

Comment 1 Filip Brychta 2013-10-25 13:16:46 UTC
I retested this issue to be sure it is not caused by a corrupted JDK installation, as discussed in bug 1022620.
I reproduced this issue with a correct JDK installation.

Comment 2 Jay Shaughnessy 2013-10-28 14:06:53 UTC
I'm fairly sure this was due to Bug 1022989, and possibly assisted by Bug 1022620.  The fixes for those should resolve this issue.  Setting MODIFIED for ER5 retest.

Comment 3 Simeon Pinder 2013-11-07 02:18:02 UTC
Moving to ON_QA for test with new brew build.

Comment 4 Filip Brychta 2013-11-25 16:53:11 UTC
Still the same behaviour on ER7. Attaching a screenshot that shows the "frozen" state and the processes that handle rhq-agent.

Comment 5 Filip Brychta 2013-11-25 16:54:34 UTC
Created attachment 828758: screen shot

Comment 6 Jay Shaughnessy 2013-11-26 16:48:43 UTC
After spending some time with Filip we discovered the issue here. The agent service install was stuck waiting for an interactive response to the prompt for the logon account password.
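
To illustrate the hang described above, here is a minimal Python sketch (not the actual rhqctl or wrapper code). It assumes the prompting process reads the password from standard input; a prompt that reads the Windows console directly would behave differently. `PROMPTING_CHILD` and `run_without_console` are hypothetical names, and the child is only a stand-in for the service installer. With stdin closed and a timeout applied, the prompt fails at once instead of blocking forever:

```python
import subprocess
import sys

# Stand-in for a service installer that prompts for a password on stdin.
# Hypothetical: the real prompt comes from the Windows agent scripts.
PROMPTING_CHILD = [sys.executable, "-c", "input('Password: ')"]


def run_without_console(command, timeout_seconds=10):
    """Run a command with no stdin so that an interactive prompt cannot block.

    Returns the exit code, or None if the command had to be killed after
    timeout_seconds.
    """
    try:
        completed = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,   # a read on stdin hits end-of-file at once
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode
```

Calling `run_without_console(PROMPTING_CHILD)` returns a non-zero exit code because the prompt sees end-of-file; a caller that inherits the console instead would wait for input that never arrives, which matches the frozen upgrade seen here.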

The problem arises when the 3.1.2 agent install was done using RUN_AS or RUN_AS_ME, with the password being supplied interactively via prompt.

The 3.2 upgrade (or agent install, for that matter) does not allow an interactive password prompt for the agent service's logon account.  Therefore, the following env vars must be set:
 RHQ_AGENT_PASSWORD=<password>
 RHQ_AGENT_PASSWORD_PROMPT=false

In fact, RHQ_AGENT_PASSWORD_PROMPT should probably be made obsolete.

This can be resolved with documentation in the near term, but the best fix is likely a code change such that we fail installs that set RHQ_RUN_AS or RHQ_RUN_AS_ME and don't set RHQ_AGENT_PASSWORD.

Note the same is true for SERVER and STORAGE Windows service installs.
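
A rough Python sketch of the fail-fast rule suggested above; this is an illustration only, not the rhqctl implementation. It assumes the env vars follow the RHQ_<COMPONENT>_RUN_AS / RHQ_<COMPONENT>_RUN_AS_ME / RHQ_<COMPONENT>_PASSWORD pattern for AGENT, SERVER and STORAGE, and that any non-empty value counts as "set". The helper names `missing_password_error` and `check_all` are hypothetical:

```python
from typing import Mapping, Optional

# Component prefixes named in this bug: agent, server and storage node.
COMPONENTS = ("AGENT", "SERVER", "STORAGE")


def missing_password_error(component: str, env: Mapping[str, str]) -> Optional[str]:
    """Return an error message if a logon account is requested without a password.

    Hypothetical helper: it only illustrates the proposed fail-fast rule.
    """
    prefix = f"RHQ_{component}"
    run_as = env.get(f"{prefix}_RUN_AS", "").strip()
    run_as_me = env.get(f"{prefix}_RUN_AS_ME", "").strip()
    password = env.get(f"{prefix}_PASSWORD", "")

    # Assumption: any non-empty value counts as "set" in this sketch.
    wants_account = bool(run_as) or bool(run_as_me)
    if wants_account and not password:
        return (
            f"{prefix}_RUN_AS or {prefix}_RUN_AS_ME is set but "
            f"{prefix}_PASSWORD is not; refusing to install the service."
        )
    return None


def check_all(env: Mapping[str, str]) -> list:
    """Collect the errors for every component so they can be reported together."""
    errors = []
    for component in COMPONENTS:
        error = missing_password_error(component, env)
        if error is not None:
            errors.append(error)
    return errors
```

An installer could call `check_all(os.environ)` before touching any service and exit with the returned messages, rather than reaching a password prompt that nobody can answer.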

*** Advise as to whether code change or doc change for 3.2...

Comment 7 Jay Shaughnessy 2013-11-27 22:35:15 UTC
master commit ecb4d5877a6c585f0900389bb5c0f6bf587fe6b7
Author: Jay Shaughnessy <jshaughn>
Date:   Wed Nov 27 17:33:24 2013 -0500

 rhqctl-based  installs will now exit if the password is not set when the
 option is specified. The RHQ_XXX_PASSWORD_PROMPT env vars are no
 longer relevant or documented for rhqctl-based installs. Note that
 the agent is typically installed standalone, outside of rhqctl, and in
 that case the RHQ_AGENT_PASSWORD_PROMPT env var is still relevant.

 Additionally, rhq-agent.bat had some newly discovered problems handling
 certain options (32-bit java install, I think), and has had a decent
 amount of reworking to deal with the issues.

Comment 8 Jay Shaughnessy 2013-12-02 15:46:49 UTC
release/jon3.2.x commit d769ac141cc0ea09dbb5ffaeaec596a9633d7732
Author: Jay Shaughnessy <jshaughn>
Date:   Wed Nov 27 17:33:24 2013 -0500

 Cherry-Pick of master commit ecb4d5877a6c585f0900389bb5c0f6bf587fe6b7



Test-Notes:

Given the number of script changes, full Windows install testing should be performed.  Particular to this fix, attempt rhqctl installs where only RHQ_XXX_RUN_AS or RHQ_XXX_RUN_AS_ME is defined, but not RHQ_XXX_PASSWORD.  The installs should exit with a useful message.  With RHQ_XXX_PASSWORD defined the install should proceed and the service should run as the defined user account (not the system default account).

Also perform non-rhqctl-based Windows service installs of the agent. Ensure that RHQ_AGENT_PASSWORD_PROMPT is used properly in those situations (it is ignored by rhqctl).

Comment 9 Filip Brychta 2013-12-03 13:54:15 UTC
I tried the following scenario on ER7:
1-  unzip 
2-  set RHQ_SERVER_RUN_AS_ME=true
3-  set RHQ_STORAGE_PASSWORD=*****
(note: I didn't set RHQ_AGENT_PASSWORD_PROMPT=false)
4-  rhqctl install

Result:
The installation got stuck while creating the agent service, without any useful message.

Is this resolved as part of the commit from comment 8 as well?

Comment 10 Filip Brychta 2013-12-03 14:14:43 UTC
Correction to comment 9: I pasted the wrong env variables there. All env variables (steps 2 and 3) should apply to the agent, so RHQ_SERVER... and RHQ_STORAGE... should both read RHQ_AGENT...

Comment 11 Jay Shaughnessy 2013-12-03 20:22:18 UTC
There were too many changes to worry about ER7.  Please test with CR1, which has (or will have) the changes when it is released.

Comment 14 Simeon Pinder 2013-12-03 23:19:34 UTC
Moving to ON_QA for testing in the latest (CR1) brew build.

Comment 15 Filip Brychta 2013-12-05 10:57:46 UTC
Verified on:
Version: 3.2.0.CR1
Build Number: 6ecd678:d0dc0b6

Verified the following for both clean installation and upgrade:
 - installation/upgrade was successful for the default user (Local System)
 - installation/upgrade was successful for the Administrator user (set RHQ_AGENT_RUN_AS_ME=true, set RHQ_AGENT_PASSWORD=***)
 - installation/upgrade was stopped with a useful message when only RHQ_SERVER_RUN_AS_ME was set
 - non-rhqctl-based Windows service installs of the agent work as expected