Description of problem:
This is another issue found while testing bz1185375. I'm not able to reproduce it, so any hint about what could be causing it would be helpful.

Version-Release number of selected component (if applicable):
JON3.3.2.ER1

How reproducible:
Only once

Steps to Reproduce:
Set up (all using JON3.2.0.GA):
- server1: JON server (master - postgres db is here) + SN + agent
- server2: JON server (slave) + SN + agent
- server3: JON server (slave) + agent
- server4: SN + agent
- two remote agents with EAP6

Scenario:
1 - stop everything except remote agents
2 - for each server unzip JON 3.3.0.GA and 3.3.2.DR01 patch
3 - upgrade server4 (note that rhq-server.properties must be manually updated - bz1157480)
4 - upgrade server1,2,3
5 - apply the patch on all servers
6 - start all storage nodes
7 - rhqctl upgrade --storage-schema

Actual results:
./rhqctl upgrade --storage-schema
06:22:00,025 INFO [org.jboss.modules] JBoss Modules version 1.3.5.Final-redhat-1
06:22:00,179 INFO [org.rhq.server.control.command.Upgrade] Updating RHQ Storage Cluster schema
06:22:00,190 INFO [org.rhq.server.control.command.Upgrade] The RHQ Storage Cluster schema update is running
06:22:00,739 INFO [org.jboss.modules] JBoss Modules version 1.3.5.Final-redhat-1
06:22:00,993 INFO [org.rhq.cassandra.schema.VersionManager] Preparing to check storage schema compatibility.
06:22:04,497 WARN [org.rhq.cassandra.schema.VersionManager] Storage cluster schema version:1. Required schema version: 7. Please update storage cluster schema version.
06:22:04,498 INFO [org.rhq.cassandra.schema.VersionManager] Completed storage schema compatibility check.
06:22:04,499 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] Storage cluster Schema out of date. Applying Storage Cluster schema updates.
06:22:04,503 INFO [org.rhq.cassandra.schema.VersionManager] Preparing to install storage schema
06:22:04,503 INFO [org.rhq.cassandra.schema.AbstractManager] Shutting down existing cluster connections
06:22:06,770 INFO [org.rhq.cassandra.schema.VersionManager] Installed storage schema version is 1
06:22:06,771 INFO [org.rhq.cassandra.schema.VersionManager] Required storage schema version is 7
06:22:06,772 INFO [org.rhq.cassandra.schema.VersionManager] Storage schema requires udpates. Updating from version 1 to version 7.
06:22:06,773 INFO [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0002.xml
06:22:16,501 INFO [org.rhq.cassandra.schema.AbstractManager] Applied update file: schema/update/0002.xml
06:22:16,513 INFO [org.rhq.cassandra.schema.VersionManager] Storage schema update schema/update/0002.xml applied.
06:22:16,514 INFO [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0003.xml
06:22:16,521 INFO [org.rhq.cassandra.schema.AbstractManager] Applied update file: schema/update/0003.xml
06:22:16,546 INFO [org.rhq.cassandra.schema.VersionManager] Storage schema update schema/update/0003.xml applied.
06:22:16,546 INFO [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0004.xml
06:22:18,470 INFO [org.rhq.cassandra.schema.AbstractManager] Applied update file: schema/update/0004.xml
06:22:18,481 INFO [org.rhq.cassandra.schema.VersionManager] Storage schema update schema/update/0004.xml applied.
06:22:18,482 INFO [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0005.xml
06:22:18,511 INFO [org.rhq.cassandra.schema.MigrateAggregateMetrics] Starting data migration
06:22:18,611 INFO [org.rhq.cassandra.schema.KeyScanner] Loading tokens for fbr-ha.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.90
06:22:18,658 INFO [org.rhq.cassandra.schema.KeyScanner] Loading tokens for fbr-ha-3.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.106
06:22:18,693 INFO [org.rhq.cassandra.schema.KeyScanner] Loading tokens for fbr-ha-st.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.126
06:22:18,698 INFO [org.rhq.cassandra.schema.MigrateAggregateMetrics] Finished migrating 0 one_hour, 0 six_hour, and 0 twenty_four_hour metrics in 0 sec
06:22:18,699 INFO [org.rhq.cassandra.schema.MigrateAggregateMetrics] Shutting down migration thread pools...
06:22:18,703 ERROR [org.rhq.enterprise.server.installer.InstallerServiceImpl] Could not complete storage cluster schema installation: java.lang.NullPointerException: java.lang.NullPointerException
    at org.rhq.cassandra.schema.MigrateAggregateMetrics.shutdown(MigrateAggregateMetrics.java:257) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.cassandra.schema.MigrateAggregateMetrics.execute(MigrateAggregateMetrics.java:220) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.cassandra.schema.AbstractManager.execute(AbstractManager.java:283) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.cassandra.schema.VersionManager.update(VersionManager.java:181) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.cassandra.schema.VersionManager.install(VersionManager.java:91) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.cassandra.schema.SchemaManager.install(SchemaManager.java:123) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.enterprise.server.installer.InstallerServiceImpl.prepareStorageSchema(InstallerServiceImpl.java:755) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.enterprise.server.installer.InstallerServiceImpl.updateStorageSchema(InstallerServiceImpl.java:710) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.enterprise.server.installer.Installer.doInstall(Installer.java:122) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at org.rhq.enterprise.server.installer.Installer.main(Installer.java:59) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_65]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_65]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_65]
    at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_65]
    at org.jboss.modules.Module.run(Module.java:312) [jboss-modules.jar:1.3.5.Final-redhat-1]
    at org.jboss.modules.Main.main(Main.java:460) [jboss-modules.jar:1.3.5.Final-redhat-1]
06:22:18,721 ERROR [org.rhq.enterprise.server.installer.Installer] java.lang.Exception:Could not complete storage cluster schema installation: java.lang.NullPointerException -> java.lang.NullPointerException:null

Expected results:
No errors

Additional info:
Interestingly, no schedules were found (Finished migrating 0 one_hour, 0 six_hour, and 0 twenty_four_hour metrics in 0 sec), but there certainly were some. I'm not sure if this is relevant, and I don't have exact steps, but the issue went away when I updated rhq-server.properties (old vs. new storage properties introduced by bz1207393) and restarted the storage node several times. After that, the migration finished successfully and all schedules were found.
This is more than strange. I don't understand that NPE, especially when there are log messages from org.rhq.cassandra.schema.KeyScanner (which seems to be null according to the stack trace). The only fix I can think of now is to detect the NPE, exit the migration safely, and print a warning suggesting that the migration be repeated.
I found a potential issue in the KeyScanner constructor, which may occur if we fail to load tokens from any node in the cluster. If that happens (which does seem to be the case in the stack trace above), the thrown exception was ignored and the finally block was still executed.

branch: master
link: https://github.com/rhq-project/rhq/commit/2d173cd61
time: 2015-06-15 12:19:33 +0200
commit: 2d173cd61f7d0e9eb3eaf8e86f9ed72eb3f6272a
author: Libor Zoubek - lzoubek
message: Bug 1213782 - Could not complete storage cluster schema installation: java.lang.NullPointerException
Correctly handle errors which might occur in KeyScanner constructor
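To illustrate the failure mode described here, a minimal sketch follows. It is not the RHQ code: the class KeyScannerSketch, the nested Scanner class, the "down:" node prefix, and the methods migrateUnsafe, migrateSafe, and outcome are all hypothetical names made up for this example. It only shows how an exception thrown by a constructor can be replaced by an NPE raised in a finally block, and how a null check lets the original error surface instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyScannerSketch {

    /** Hypothetical stand-in for a scanner whose constructor contacts every node. */
    static class Scanner {
        private final List<String> loaded = new ArrayList<String>();

        Scanner(List<String> nodes) {
            for (String node : nodes) {
                // Assumption for the sketch: a "down:" prefix simulates an unreachable node.
                if (node.startsWith("down:")) {
                    throw new IllegalStateException("Failed to load tokens for " + node);
                }
                loaded.add(node);
            }
        }

        void shutdown() {
            loaded.clear();
        }
    }

    /** The finally block dereferences a variable the failed constructor never assigned. */
    public static void migrateUnsafe(List<String> nodes) {
        Scanner scanner = null;
        try {
            scanner = new Scanner(nodes);
        } finally {
            // If the constructor threw, scanner is still null here and the
            // resulting NPE replaces the original exception.
            scanner.shutdown();
        }
    }

    /** Same flow, but cleanup is skipped when construction failed. */
    public static void migrateSafe(List<String> nodes) {
        Scanner scanner = null;
        try {
            scanner = new Scanner(nodes);
        } finally {
            if (scanner != null) {
                scanner.shutdown();
            }
        }
    }

    /** Runs a task and reports which exception type, if any, reached the caller. */
    public static String outcome(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        final List<String> nodes = Arrays.asList("node-a", "down:node-b");
        System.out.println("unsafe: " + outcome(() -> migrateUnsafe(nodes)));
        System.out.println("safe:   " + outcome(() -> migrateSafe(nodes)));
    }
}
```

In the unsafe variant the caller only sees the NullPointerException, which matches how confusing the original stack trace was. In the safe variant the caller sees the token-loading failure itself and can decide to abort and ask for the migration to be repeated.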
Cherry-picked to release/jon3.3.x:

commit b5b3684f4d65cd86a20e14c0cde847c9b3d1fd41
Author: Libor Zoubek <lzoubek>
Date: Mon Jun 15 12:19:33 2015 +0200

    Bug 1213782 - Could not complete storage cluster schema installation: java.lang.NullPointerException

    Correctly handle errors which might occur in KeyScanner constructor

    (cherry picked from commit 2d173cd61f7d0e9eb3eaf8e86f9ed72eb3f6272a)
Available for test with 3.3.3 ER01 build: https://brewweb.devel.redhat.com/buildinfo?buildID=446732

*Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of jon-server-3.3.0.GA-update-03.zip.
Verified on
Version: 3.3.0.GA Update 03
Build Number: e4b348a:2f80c8c

The upgrade was canceled and the correct message was shown ("Failed to load tokens...."). The second invocation worked.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1525.html