Bug 1213782 - Could not complete storage cluster schema installation: java.lang.NullPointerException
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Installer, Storage Node
Version: JON 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ER01
Target Release: JON 3.3.3
Assignee: Libor Zoubek
QA Contact: Filip Brychta
URL:
Whiteboard:
Depends On:
Blocks: 1213366
Reported: 2015-04-21 09:38 UTC by Filip Brychta
Modified: 2015-11-02 00:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-30 16:42:08 UTC
Type: Bug


Links
System ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata RHSA-2015:1525 | normal | SHIPPED_LIVE | Moderate: Red Hat JBoss Operations Network 3.3.3 update | 2015-07-30 20:41:08 UTC
Red Hat Bugzilla 1185375 | None | None | None | Never

Internal Links: 1185375

Description Filip Brychta 2015-04-21 09:38:41 UTC
Description of problem:
This is another issue found while testing bz1185375. I am not able to reproduce it, so any hint about what could be causing it would be helpful.

Version-Release number of selected component (if applicable):
JON3.3.2.ER1

How reproducible:
Only once

Steps to Reproduce:
Set up:
(all using JON3.2.0.GA):
- server1: JON server (master - postgres db is here) + SN + agent
- server2: JON server (slave) + SN + agent
- server3: JON server (slave) + agent
- server4: SN + agent
- two remote agents with EAP6

Scenario:
1 - stop everything except remote agents
2 - for each server unzip JON 3.3.0.GA and 3.3.2.DR01 patch
3 - upgrade server4 (note that rhq-server.properties must be manually updated - bz1157480)
4 - upgrade server1,2,3
5 - apply the patch on all servers
6 - start all storage nodes
7 - rhqctl upgrade --storage-schema

Actual results:
./rhqctl upgrade --storage-schema
06:22:00,025 INFO  [org.jboss.modules] JBoss Modules version 1.3.5.Final-redhat-1
06:22:00,179 INFO  [org.rhq.server.control.command.Upgrade] Updating RHQ Storage Cluster schema
06:22:00,190 INFO  [org.rhq.server.control.command.Upgrade] The RHQ Storage Cluster schema update is running
06:22:00,739 INFO  [org.jboss.modules] JBoss Modules version 1.3.5.Final-redhat-1
06:22:00,993 INFO  [org.rhq.cassandra.schema.VersionManager] Preparing to check storage schema compatibility.
06:22:04,497 WARN  [org.rhq.cassandra.schema.VersionManager] Storage cluster schema version:1. Required schema version: 7. Please update storage cluster schema version.
06:22:04,498 INFO  [org.rhq.cassandra.schema.VersionManager] Completed storage schema compatibility check.
06:22:04,499 INFO  [org.rhq.enterprise.server.installer.InstallerServiceImpl] Storage cluster Schema out of date. Applying Storage Cluster schema updates.
06:22:04,503 INFO  [org.rhq.cassandra.schema.VersionManager] Preparing to install storage schema
06:22:04,503 INFO  [org.rhq.cassandra.schema.AbstractManager] Shutting down existing cluster connections
06:22:06,770 INFO  [org.rhq.cassandra.schema.VersionManager] Installed storage schema version is 1
06:22:06,771 INFO  [org.rhq.cassandra.schema.VersionManager] Required storage schema version is 7
06:22:06,772 INFO  [org.rhq.cassandra.schema.VersionManager] Storage schema requires udpates. Updating from version 1 to version 7.
06:22:06,773 INFO  [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0002.xml
06:22:16,501 INFO  [org.rhq.cassandra.schema.AbstractManager] Applied update file: schema/update/0002.xml
06:22:16,513 INFO  [org.rhq.cassandra.schema.VersionManager] Storage schema update schema/update/0002.xml applied.
06:22:16,514 INFO  [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0003.xml
06:22:16,521 INFO  [org.rhq.cassandra.schema.AbstractManager] Applied update file: schema/update/0003.xml
06:22:16,546 INFO  [org.rhq.cassandra.schema.VersionManager] Storage schema update schema/update/0003.xml applied.
06:22:16,546 INFO  [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0004.xml
06:22:18,470 INFO  [org.rhq.cassandra.schema.AbstractManager] Applied update file: schema/update/0004.xml
06:22:18,481 INFO  [org.rhq.cassandra.schema.VersionManager] Storage schema update schema/update/0004.xml applied.
06:22:18,482 INFO  [org.rhq.cassandra.schema.AbstractManager] Applying update file: schema/update/0005.xml
06:22:18,511 INFO  [org.rhq.cassandra.schema.MigrateAggregateMetrics] Starting data migration
06:22:18,611 INFO  [org.rhq.cassandra.schema.KeyScanner] Loading tokens for fbr-ha.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.90
06:22:18,658 INFO  [org.rhq.cassandra.schema.KeyScanner] Loading tokens for fbr-ha-3.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.106
06:22:18,693 INFO  [org.rhq.cassandra.schema.KeyScanner] Loading tokens for fbr-ha-st.bc.jonqe.lab.eng.bos.redhat.com/10.16.23.126
06:22:18,698 INFO  [org.rhq.cassandra.schema.MigrateAggregateMetrics] Finished migrating 0 one_hour, 0 six_hour, and 0 twenty_four_hour metrics in 0 sec
06:22:18,699 INFO  [org.rhq.cassandra.schema.MigrateAggregateMetrics] Shutting down migration thread pools...
06:22:18,703 ERROR [org.rhq.enterprise.server.installer.InstallerServiceImpl] Could not complete storage cluster schema installation: java.lang.NullPointerException: java.lang.NullPointerException
	at org.rhq.cassandra.schema.MigrateAggregateMetrics.shutdown(MigrateAggregateMetrics.java:257) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.cassandra.schema.MigrateAggregateMetrics.execute(MigrateAggregateMetrics.java:220) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.cassandra.schema.AbstractManager.execute(AbstractManager.java:283) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.cassandra.schema.VersionManager.update(VersionManager.java:181) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.cassandra.schema.VersionManager.install(VersionManager.java:91) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.cassandra.schema.SchemaManager.install(SchemaManager.java:123) [rhq-cassandra-schema-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.enterprise.server.installer.InstallerServiceImpl.prepareStorageSchema(InstallerServiceImpl.java:755) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.enterprise.server.installer.InstallerServiceImpl.updateStorageSchema(InstallerServiceImpl.java:710) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.enterprise.server.installer.Installer.doInstall(Installer.java:122) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at org.rhq.enterprise.server.installer.Installer.main(Installer.java:59) [rhq-installer-util-4.12.0.JON330GA-redhat-1.jar:4.12.0.JON330GA-redhat-1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.7.0_65]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.7.0_65]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_65]
	at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_65]
	at org.jboss.modules.Module.run(Module.java:312) [jboss-modules.jar:1.3.5.Final-redhat-1]
	at org.jboss.modules.Main.main(Main.java:460) [jboss-modules.jar:1.3.5.Final-redhat-1]

06:22:18,721 ERROR [org.rhq.enterprise.server.installer.Installer] java.lang.Exception:Could not complete storage cluster schema installation: java.lang.NullPointerException -> java.lang.NullPointerException:null

Expected results:
No errors

Additional info:
Interestingly, no schedules were found (Finished migrating 0 one_hour, 0 six_hour, and 0 twenty_four_hour metrics in 0 sec), although some definitely existed.
I am not sure whether this is relevant, and I do not have exact steps, but the issue went away after I updated rhq-server.properties (old vs. new storage properties introduced by bz1207393) and restarted the storage node several times. The migration then finished successfully and all schedules were found.

Comment 1 Libor Zoubek 2015-06-12 14:37:29 UTC
This is more than strange. I don't understand that NPE, especially when there are log messages from org.rhq.cassandra.schema.KeyScanner (which seems to be null according to the stack trace).

The only fix I can think of now is to detect the NPE, exit the migration safely, and print a warning message suggesting that the migration be repeated.
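
The stopgap proposed here could look roughly like the following Java sketch. It is not RHQ code: `MigrationGuard`, `Outcome`, and the warning text are hypothetical names chosen for illustration, and the migration step is modelled as a plain `Runnable`.

```java
/** Hypothetical wrapper illustrating the stopgap; not part of RHQ. */
class MigrationGuard {

    /** Result of one guarded run: whether it completed and what to tell the user. */
    static final class Outcome {
        final boolean completed;
        final String message;

        Outcome(boolean completed, String message) {
            this.completed = completed;
            this.message = message;
        }
    }

    /** Runs the migration step and converts an NPE into a retry hint instead of a crash. */
    static Outcome runGuarded(Runnable migration) {
        try {
            migration.run();
            return new Outcome(true, "Migration finished");
        } catch (NullPointerException e) {
            // Assumed wording; the real installer message may differ.
            return new Outcome(false, "Migration aborted, please run the schema upgrade again");
        }
    }

    public static void main(String[] args) {
        Outcome ok = runGuarded(() -> { });
        Outcome failed = runGuarded(() -> { throw new NullPointerException(); });
        System.out.println(ok.completed + " / " + ok.message);
        System.out.println(failed.completed + " / " + failed.message);
    }
}
```

Catching the NPE this way only turns the crash into a clean abort; it does not explain why the reference was null in the first place.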

Comment 2 Libor Zoubek 2015-06-15 10:24:10 UTC
I found a potential issue in the KeyScanner constructor, which may occur if we fail to load tokens from any node in the cluster. If it fails (which seems to be the case in the stack trace above), the thrown exception was ignored and the finally block was executed.

branch:  master
link:    https://github.com/rhq-project/rhq/commit/2d173cd61
time:    2015-06-15 12:19:33 +0200
commit:  2d173cd61f7d0e9eb3eaf8e86f9ed72eb3f6272a
author:  Libor Zoubek - lzoubek@redhat.com
message: Bug 1213782 - Could not complete storage cluster schema installation: 
         java.lang.NullPointerException

         Correctly handle errors which might occur in KeyScanner
         constructor
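
The failure mode described in this comment can be reproduced in isolation. The Java sketch below is an illustration under assumptions, not the RHQ source: `TokenScanner` and `FinallySketch` are hypothetical stand-ins, and a node name prefixed with `down:` simulates a node whose tokens cannot be loaded. When the constructor throws, the variable is never assigned, so an unguarded call in the finally block raises an NPE that replaces the original exception.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a scanner whose constructor contacts every node. */
class TokenScanner {
    TokenScanner(List<String> nodes) {
        for (String node : nodes) {
            if (node.startsWith("down:")) {
                // Simulates a node from which tokens cannot be loaded.
                throw new IllegalStateException("Failed to load tokens from " + node);
            }
        }
    }

    void shutdown() {
        // A real scanner would release thread pools and connections here.
    }
}

class FinallySketch {

    /** Buggy shape: finally dereferences a variable that was never assigned. */
    static String runUnguarded(List<String> nodes) {
        TokenScanner scanner = null;
        try {
            try {
                scanner = new TokenScanner(nodes);
                return "migrated";
            } finally {
                scanner.shutdown(); // NPE here replaces the constructor's exception
            }
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** Guarded shape: the null check lets the original exception propagate. */
    static String runGuarded(List<String> nodes) {
        TokenScanner scanner = null;
        try {
            try {
                scanner = new TokenScanner(nodes);
                return "migrated";
            } finally {
                if (scanner != null) {
                    scanner.shutdown();
                }
            }
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "down:node2");
        System.out.println(runUnguarded(nodes)); // prints NullPointerException
        System.out.println(runGuarded(nodes));   // prints IllegalStateException
    }
}
```

With the guard in place, the caller sees the token-loading failure itself rather than a secondary NPE, which makes a meaningful error message and a retry possible.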

Comment 3 Michael Burman 2015-06-24 17:45:35 UTC
Cherry-picked to release/jon3.3.x:

commit b5b3684f4d65cd86a20e14c0cde847c9b3d1fd41
Author: Libor Zoubek <lzoubek@redhat.com>
Date:   Mon Jun 15 12:19:33 2015 +0200

    Bug 1213782 - Could not complete storage cluster schema installation:
    java.lang.NullPointerException
    
    Correctly handle errors which might occur in KeyScanner constructor
    
    (cherry picked from commit 2d173cd61f7d0e9eb3eaf8e86f9ed72eb3f6272a)

Comment 4 Simeon Pinder 2015-07-10 18:55:23 UTC
Available for test with 3.3.3 ER01 build: 
https://brewweb.devel.redhat.com/buildinfo?buildID=446732
 *Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of
 jon-server-3.3.0.GA-update-03.zip.

Comment 5 Filip Brychta 2015-07-20 11:19:21 UTC
Verified on:
Version: 3.3.0.GA Update 03
Build Number: e4b348a:2f80c8c

The upgrade was canceled and the correct message was shown ("Failed to load tokens...."). The second invocation worked.

Comment 7 errata-xmlrpc 2015-07-30 16:42:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1525.html

