Bug 1035357

| Field | Value |
|---|---|
| Summary | When first attempt to upgrade JON fails, second attempt fails as well even though the original problem is resolved |
| Product | [JBoss] JBoss Operations Network |
| Component | Installer |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | unspecified |
| Version | JON 3.2 |
| Target Milestone | ER04 |
| Target Release | JON 3.3.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Filip Brychta <fbrychta> |
| Assignee | John Mazzitelli <mazz> |
| QA Contact | Filip Brychta <fbrychta> |
| CC | ahovsepy, jshaughn, loleary, mazz, mfoley |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2014-12-11 13:59:30 UTC |
| Bug Blocks | 1010354 |

This issue is not Windows specific. Windows only demonstrates how easy it is to cause the initial install to fail due to files still being in use. The result is a corrupted JBoss ON installation due to inadequate revert/recovery.

There has been a decent amount of work done on the installer and its recovery handling in the 3.3 timeframe. Please re-test this against ER03 and we'll go from there. Thanks.

Rollback works on Linux but it still fails on Windows. A simple scenario which fails on Windows:

1. install jon3.2.0.GA
2. try to upgrade to jon3.3.er2

During step 2 you will hit bz1128151. The second attempt to upgrade ends with:

    c:\jon-server-3.3.0.ER02\bin>rhqctl upgrade --from-server-dir c:\jon-server-3.2.0.GA
    11:32:36,974 INFO [org.jboss.modules] JBoss Modules version 1.3.3.Final-redhat-1
    11:32:37,317 INFO [org.rhq.server.control.command.Upgrade] Stopping any running RHQ components...
    11:32:37,317 WARN [org.rhq.server.control.command.Upgrade] RHQ is already installed so upgrade can not be performed.
    The RHQ Server [rhqserver-WIN-2008] service was not running.
    The RHQ Storage [rhqstorage-WIN-2008] service was not running.
    RHQ storage node has stopped

I will try to replicate this.

I replicated this on Windows 8 using the 3.3 ER03 build. I think the problem might be that we try to delete the rhq-storage directory BEFORE we stop the storage node, so Windows file locking prevents the directory from being removed. The next upgrade attempt then sees that the rhq-storage directory still exists and thinks it has already been installed.

Looks like this was already addressed here:

Commit e53a218269a501f22ec491927a15362fe31159b2
[BZ 1139780] UndoTasks are done in reverse order, so add stop command after the delete command to the undoTask list

The rhq-storage node should now be stopped before the attempt to delete the directory. This went in on Sept 12, which I think is after the ER03 build.

WORKAROUND: Manually delete the "rhq-storage" directory. Then re-run the upgrade. I tried the workaround and it worked.

So I would say, wait for the next ER build since that fix should be in it. I think that will address the problem because once the storage node is stopped, Windows won't lock the rhq-storage files and the installer should be able to remove them all. I think this is why it works on Linux: it doesn't have Windows file locking getting in the way.

Cherry-picked 3.3 commit: e53a218269a501f22ec491927a15362fe31159b2

Setting to MODIFIED. It looks like the earlier fix that was cherry-picked might also correct this issue. QE will need to retest.

Moving to ON_QA as available for test with build: https://brewweb.devel.redhat.com/buildinfo?buildID=388959

Verified on Version: 3.3.0.ER04, Build Number: 99d2107:d7c537e
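The ordering issue discussed in the comments above can be illustrated with a minimal sketch. This is not the RHQ installer code: the `UndoStack` class, the `simulate_rollback` helper, and the task labels are hypothetical. It only shows why, when undo tasks are executed in reverse registration order, the stop task has to be registered after the delete task in order to run first.

```python
from typing import Callable, List


class UndoStack:
    """Hypothetical undo list: tasks run in reverse registration order."""

    def __init__(self) -> None:
        self._tasks: List[Callable[[], None]] = []

    def add(self, task: Callable[[], None]) -> None:
        self._tasks.append(task)

    def rollback(self) -> None:
        # The most recently registered task is executed first.
        while self._tasks:
            self._tasks.pop()()


def simulate_rollback(register_stop_last: bool) -> List[str]:
    """Return the order in which the (illustrative) undo tasks execute."""
    executed: List[str] = []

    def stop() -> None:
        executed.append("stop storage node")

    def delete() -> None:
        executed.append("delete rhq-storage directory")

    undo = UndoStack()
    if register_stop_last:
        # Ordering described by the commit message: stop added after delete.
        undo.add(delete)
        undo.add(stop)
    else:
        # Problematic ordering: delete is attempted before the stop.
        undo.add(stop)
        undo.add(delete)
    undo.rollback()
    return executed


print(simulate_rollback(register_stop_last=True))
print(simulate_rollback(register_stop_last=False))
```

With `register_stop_last=True` the stop runs before the delete, which matches the order the commit message describes. With `False` the delete is attempted while the storage node may still hold file locks, which is the situation suspected on Windows.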
Created attachment 829783: vimdiff screen shot

Description of problem:
I tried to upgrade JON3.1.2.GA to JON3.2.ER7 and the upgrade process correctly failed when the agent was being upgraded. The failure was expected and the installation process rolled back the installation. Then I resolved the original problem (it was not possible to remove the rhq-agent directory) and ran the upgrade again. This second run failed while running the data migration.

Version-Release number of selected component (if applicable):
Upgrade to JON3.2.ER7

How reproducible:
2/2

Steps to Reproduce:
1. install JON3.1.2.GA
   a) unzip JON3.1.2.GA
   b) rhq-server.bat install
   c) rhq-server.bat start
   d) finish the server installation in the web installer
   e) install the agent (java -jar rhq-agent.jar --install)
   f) edit rhq-agent\bin>rhq-agent-env.bat (set RHQ_AGENT_RUN_AS_ME=true, set RHQ_AGENT_PASSWORD_PROMPT=false, set RHQ_AGENT_PASSWORD=<your_password>)
   g) run rhq-agent.bat to set up the agent
   h) exit interactive agent mode
   i) run rhq-agent-wrapper.bat install
   j) run rhq-agent-wrapper.bat start
2. wait until the agent is registered with the server
3. stop the agent
4. stop the server and then remove the service (rhq-server.bat remove)
5. open cmd.exe and cd rhq-agent/bin (this will cause the upgrade process to fail)
6. run the upgrade (rhqctl upgrade --from-server-dir c:\jon-server-3.1.2.GA --from-agent-dir c:\rhq-agent --run-data-migrator do-it)
7. the upgrade correctly fails because the rhq-agent directory can't be removed
8. resolve the problem (rm -rf rhq-agent; mv rhq-agent-OLD rhq-agent) and close the cmd window opened in step 5
9. run the upgrade again (rhqctl upgrade --from-server-dir c:\jon-server-3.1.2.GA --from-agent-dir c:\rhq-agent --run-data-migrator do-it)

Actual results:
The upgrade finishes but the data migration fails with the following exception:

    1000 [main] DEBUG org.rhq.server.metrics.migrator.DataMigratorRunner - Server configuration file system property detected. Loading the file: c:\jon-server-3.2.0.ER7\bin\rhq-server.properties
    java.lang.RuntimeException: de-obfuscating db password failed:
        at org.rhq.core.util.obfuscation.PicketBoxObfuscator.decode(PicketBoxObfuscator.java:75)
        at org.rhq.server.metrics.migrator.DataMigratorRunner.loadConfigurationFromServerPropertiesFile(DataMigratorRunner.java:362)
        at org.rhq.server.metrics.migrator.DataMigratorRunner.configure(DataMigratorRunner.java:287)
        at org.rhq.server.metrics.migrator.DataMigratorRunner.main(DataMigratorRunner.java:170)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.jboss.modules.Module.run(Module.java:270)
        at org.jboss.modules.Main.main(Main.java:411)
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.rhq.core.util.obfuscation.PicketBoxObfuscator.decode(PicketBoxObfuscator.java:72)
        ... 9 more
    Caused by: java.lang.NumberFormatException: Zero length BigInteger
        at java.math.BigInteger.<init>(BigInteger.java:296)
        at org.picketbox.datasource.security.SecureIdentityLoginModule.decode(SecureIdentityLoginModule.java:170)
        ... 14 more
    21:46:24,106 INFO [org.rhq.server.control.command.Upgrade] The data migrator finished with exit value 0

This exception is caused by unset properties in rhq-server.properties. See the attached vimdiff screen shot for the difference between rhq-server.properties after step 9 and the correct rhq-server.properties. The snapshot of the correct rhq-server.properties was taken after step 6 but before step 7 (before the upgrade was rolled back). So for some reason the second upgrade (step 9) didn't update rhq-server.properties correctly.

Expected results:
Data migration works
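
As a hedged illustration of the failure described above: the trace ends in `Zero length BigInteger`, which is consistent with an empty value reaching the password decoder. The sketch below scans properties-style text for keys whose value is empty, so that a file in that state could be spotted before the data migrator is run. The helper name `find_unset_properties` and the sample keys are assumptions made for this example; they are not taken from the bug report or from RHQ, and the parser is deliberately simplified.

```python
from typing import List


def find_unset_properties(text: str) -> List[str]:
    """Return keys that have an empty value in simple key=value text.

    Simplified parser (an assumption for this sketch): blank lines and
    comment lines starting with '#' or '!' are skipped, and each remaining
    line is split on the first '='. Line continuations and ':' separators
    are not handled.
    """
    unset: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep and not value.strip():
            unset.append(key.strip())
    return unset


# Hypothetical sample content; the keys are placeholders, not RHQ settings.
SAMPLE = """\
# example server properties
example.db.user=someuser
example.db.password=
example.other.setting=value
"""

print(find_unset_properties(SAMPLE))
```

In practice the text would be read from the upgraded server's rhq-server.properties. A non-empty result would suggest the file was not fully populated by the upgrade, as the vimdiff attachment shows for step 9.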