Bug 1082805 - rhqctl upgrade is unable to perform data migration if estimate is executed first
Summary: rhqctl upgrade is unable to perform data migration if estimate is executed first
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Storage Node
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: CR01
Target Release: JON 3.3.0
Assignee: Stefan Negrea
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 1158247
 
Reported: 2014-03-31 20:48 UTC by Larry O'Leary
Modified: 2018-12-05 17:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
If a user executed rhqctl upgrade and passed the --run-data-migrator estimate parameter, the data could no longer be upgraded using the expected tool. Attempting to execute the --run-data-migrator do-it command failed, indicating that the installation/upgrade was already finished. The fix removes all options for the --run-data-migrator argument. Users can no longer use none, estimate, or do-it following the argument. If the argument is used when invoking the upgrade command, rhqctl migrates the data right away as part of the upgrade process.
Clone Of:
: 1158247 (view as bug list)
Environment:
Last Closed: 2014-12-11 14:00:31 UTC
Type: Bug
Embargoed:


Attachments
rhqctl_upgrade_312_330er05 (18.33 KB, text/plain), 2014-10-22 13:09 UTC, Armine Hovsepyan
rhqctl_upgrade_312_330er05_patch (25.40 KB, text/plain), 2014-10-23 11:43 UTC, Armine Hovsepyan
upgrade.log (20.99 KB, text/plain), 2014-10-28 21:44 UTC, Armine Hovsepyan
upgrade-with-migration.log (75.54 KB, text/plain), 2014-10-29 19:47 UTC, Armine Hovsepyan


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 772973 0 None None None Never

Description Larry O'Leary 2014-03-31 20:48:16 UTC
Description of problem:
If a user executes rhqctl upgrade and passes the --run-data-migrator estimate parameter, the data can no longer be upgraded using the expected tool. Attempting to execute the --run-data-migrator do-it command fails, indicating that the installation/upgrade is already finished.

Version-Release number of selected component (if applicable):
3.2.0.GA

How reproducible:
Always

Steps to Reproduce:
1.  Install and start JBoss ON 3.1.2 system.
2.  Import RHQ Agent and RHQ Server resources into inventory.
3.  Set metric collection schedules to 30 seconds for multiple resources.
    This is to ensure that we have some metric data generated.
4.  Let system run for 5 minutes.
5.  Shutdown JBoss ON 3.1.2 server.
6.  Extract 3.2 server to a new location.
7.  Execute upgrade with `--run-data-migrator estimate` parameter:

        ./rhqctl upgrade --from-server-dir /opt/jboss/jboss-on/jon-server.old --run-data-migrator estimate

8.  Wait for upgrade to finish and print migration time estimate.
9.  Attempt to run the data migration:

        ./rhqctl upgrade --run-data-migrator do-it
        ./rhqctl upgrade --from-server-dir /opt/jboss/jboss-on/jon-server.old --run-data-migrator do-it


Actual results:
All variations of command fail:

    15:42:20,949 ERROR [org.rhq.server.control.command.Upgrade] Missing required option: from-server-dir
    15:42:20,949 ERROR [org.rhq.server.control.command.Upgrade] The option --run-data-migrator is valid only for upgrades from older systems that did not have storage nodes.
    15:42:20,949 ERROR [org.rhq.server.control.command.Upgrade] Exiting due to the previous errors

    15:43:00,776 WARN  [org.rhq.server.control.command.Upgrade] RHQ is already installed so upgrade can not be performed.

Expected results:
"./rhqctl upgrade --run-data-migrator do-it" should perform the data migration with no errors or warnings.

Additional info:
The fact that you can request an estimate clearly implies that the same (or a similar) command can be executed a second time to perform the migration.

Additionally, our product documentation also leads the user to believe this:

[Upgrading the JBoss ON Server and Components](https://access.redhat.com/site/documentation/en-US/Red_Hat_JBoss_Operations_Network/3.2/html/Installation_Guide/managing-servers.html#upgrade-proc)
    It is possible to migrate the historical monitoring data by running the upgrade with the --run-data-migrator later. However, any new monitoring data collected between the server upgrade and the data migration will be lost. 
    
[The rhqctl Control Script](https://access.redhat.com/site/documentation/en-US/Red_Hat_JBoss_Operations_Network/3.2/html/Admin_and_Config/control-scripts.html)
    Upgrades all JBoss ON services. This can be rerun to migrate historical metric data after a server migration.
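
The failure sequence from the steps to reproduce can be modelled roughly as follows. This is a toy sketch, not the actual rhqctl implementation: the `ToyInstallation` class and its single `installed` flag are hypothetical, and only the two quoted log messages come from this report. It illustrates why neither retry can succeed once the estimate run has completed the upgrade.

```python
# Toy model of the reported behaviour; NOT the real rhqctl code.
# The class name and its internal state are hypothetical.
class ToyInstallation:
    def __init__(self):
        # Becomes True as soon as any upgrade run completes.
        self.installed = False

    def upgrade(self, from_server_dir=None, run_data_migrator=None):
        """Return the messages a run would produce in this simplified model."""
        if from_server_dir is None:
            return ["Missing required option: from-server-dir"]
        if self.installed:
            return ["RHQ is already installed so upgrade can not be performed."]
        self.installed = True
        if run_data_migrator == "estimate":
            return ["upgrade finished", "migration estimate printed"]
        if run_data_migrator == "do-it":
            return ["upgrade finished", "data migrated"]
        return ["upgrade finished"]


toy = ToyInstallation()
old_dir = "/opt/jboss/jboss-on/jon-server.old"

# Step 7: upgrade with the estimate option.
first = toy.upgrade(from_server_dir=old_dir, run_data_migrator="estimate")
# Step 9, first variant: no --from-server-dir.
second = toy.upgrade(run_data_migrator="do-it")
# Step 9, second variant: with --from-server-dir.
third = toy.upgrade(from_server_dir=old_dir, run_data_migrator="do-it")
```

In this model the estimate run flips the installation into its finished state, so the data migration can never be reached afterwards.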

Comment 2 Simeon Pinder 2014-09-29 08:12:54 UTC
Moving into ER05 as this didn't make the ER04 cut.

Comment 3 Stefan Negrea 2014-10-02 14:58:28 UTC
After a discussion with Larry, the initial feature will not be fixed as is. I tried a couple of permutations to fix the original code but all of them were clunky for the user and made the code brittle.


Here is the plan that was agreed with Larry:

1) Remove all the options from the --run-data-migrator argument. The only possible way to use that argument is to do the migration right away. For fine control over the migration process, users should use the actual data-migration tool.

2) Update the documentation with the new behaviour of the upgrade arguments

3) Clarify the documentation about the effects of the data migration. Currently it is wrong about losing all the data from C* if run at a later time.
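
Point 1 of the plan amounts to turning an option that takes a value into a plain on/off switch. A minimal sketch of that idea in Python's argparse is shown below. The real rhqctl is Java code and its parsing is not shown in this bug, so the parser name and the `accepts` helper are assumptions made purely for illustration.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the upgrade command's option parsing.
    parser = argparse.ArgumentParser(prog="rhqctl-upgrade-sketch")
    parser.add_argument("--from-server-dir", required=True)
    # A bare switch: present means "migrate now", absent means "do not migrate".
    parser.add_argument("--run-data-migrator", action="store_true")
    return parser


def accepts(argv):
    """Return True if the sketch parser accepts argv, False if it rejects it."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        # argparse reports the problem on stderr and exits on invalid input.
        return False


old_dir = "/opt/jboss/jboss-on/jon-server.old"
args = build_parser().parse_args(["--from-server-dir", old_dir, "--run-data-migrator"])
```

With this shape a trailing `estimate` or `do-it` is rejected as an unrecognized argument, so there is no state in which the user has asked for an estimate and still expects a later migration run.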

Comment 4 Stefan Negrea 2014-10-02 15:00:21 UTC
release/jon3.3.x commits:

commit 146ef5f608f592981cbd779e92f1c965349ce338
Author: Stefan Negrea <snegrea>
Date:   Thu Oct 2 09:21:12 2014 -0500

    [BZ 1082805] Simplify the upgrade options for the data migration. Remove more undeeded 
    (cherry picked from commit 61bf3de37978e90ad796ac6ab5ce71e6c13c190d)
    
    Signed-off-by: John Mazzitelli <mazz>

commit d513d61c39e3046fa307db43b09519de2eb2d80b
Author: Stefan Negrea <snegrea>
Date:   Wed Oct 1 18:04:19 2014 -0500

    [BZ 1082805] Simplify the upgrade options for the data migration. Only allow to run the
    (cherry picked from commit ba48a62061609115f1a6c254b01da5922ea787dc)
    
    Signed-off-by: John Mazzitelli <mazz>

Comment 5 Stefan Negrea 2014-10-02 15:02:36 UTC
The commits from comment #4 remove all the options for the --run-data-migrator argument. Users can no longer use none, estimate, or do-it following the argument. If the argument is used when invoking the upgrade command, rhqctl will just migrate the data right away as part of the upgrade process.
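
For anyone with wrapper scripts written against the old syntax, the change can be sketched as an argument rewrite. The `modernize` helper below is hypothetical and not part of rhqctl. Mapping `none` to "omit the flag" is an assumption based on the description above, and `estimate` has no equivalent in the upgrade command after this fix.

```python
LEGACY_MODES = {"none", "estimate", "do-it"}


def modernize(argv):
    """Rewrite a pre-fix argument list into the post-fix form.

    Returns (new_argv, notes). Hypothetical helper, not part of rhqctl.
    """
    result = []
    notes = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        nxt = argv[i + 1] if i + 1 < len(argv) else None
        if arg == "--run-data-migrator" and nxt in LEGACY_MODES:
            if nxt == "do-it":
                # The bare flag now means "migrate as part of the upgrade".
                result.append(arg)
            elif nxt == "estimate":
                notes.append(
                    "estimate is no longer available through rhqctl upgrade; "
                    "use the data migration tool"
                )
            # "none": assumed to be equivalent to leaving the flag out.
            i += 2
            continue
        result.append(arg)
        i += 1
    return result, notes


old_dir = "/opt/jboss/jboss-on/jon-server.old"
do_it = modernize(["upgrade", "--from-server-dir", old_dir, "--run-data-migrator", "do-it"])
estimate = modernize(["upgrade", "--from-server-dir", old_dir, "--run-data-migrator", "estimate"])
```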

Comment 6 Simeon Pinder 2014-10-21 20:24:33 UTC
Moving to ON_QA as available to test with the latest brew build:
https://brewweb.devel.redhat.com//buildinfo?buildID=394734

Comment 8 Armine Hovsepyan 2014-10-22 13:08:28 UTC
1. Upgrade from 3.1.2 to 3.3 ER05 does not work.
2. According to the logs, the migration starts even when --run-data-migrator is not given.

Log attached.

Comment 9 Armine Hovsepyan 2014-10-22 13:09:04 UTC
Created attachment 949388 [details]
rhqctl_upgrade_312_330er05

Comment 10 Armine Hovsepyan 2014-10-23 11:42:45 UTC
Upgrade issue (1 in comment #8) is fixed in ER05 + patched jar.
Issue 2 is still valid.

Log attached.

Comment 11 Armine Hovsepyan 2014-10-23 11:43:02 UTC
Created attachment 949820 [details]
rhqctl_upgrade_312_330er05_patch

Comment 12 Stefan Negrea 2014-10-28 15:56:58 UTC
Issue 2 from comment 8 is not valid. The migration noted in the logs is the Cassandra schema migration and has nothing to do with the data migration from SQL to Cassandra. 

Also, if that kind of migration is being performed, then the environment was NOT a clean 3.1.2 environment (hence the later SQL db upgrade errors). The storage node was not introduced until 3.2.0; the fact that a Cassandra schema migration was performed means that the environment had a 3.2.x installed at some point in time and/or JON 3.1.x was not properly installed.

Comment 13 Armine Hovsepyan 2014-10-28 21:43:01 UTC
Double-checked on a clean 3.1.2 to 3.3 ER05 upgrade: migration runs.
New log attached. (Please note, there was 0 data in Postgres, which is why 0 data was migrated to Cassandra.)

Comment 14 Armine Hovsepyan 2014-10-28 21:44:01 UTC
Created attachment 951552 [details]
upgrade.log

Comment 15 Jared MORGAN 2014-10-28 23:33:39 UTC
(In reply to Stefan Negrea from comment #3)
> After a discussion with Larry, the initial feature will not be fixed as is.
> I tried a couple of permutations to fix the original code but all of them
> were clunky for the user and made the code brittle.
> 
> 
> Here is the plan that was agreed with Larry:
> 
> 1) Remove all the options from the --run-data-migrator argument. The only
> possible way to use that argument is to do the migration right away. For a
> fine control over the migration process, users should use the actual
> data-migration tool.
> 
> 2) Update the documentation with the new behaviour of the upgrade arguments
> 
> 3) Clarify the documentation about the effects of the data migration.
> Currently it is wrong about losing all the data from C* if run at a later
> time.

So I need to remove the suggestions that the migration can be re-run later from both spots in the Installation Guide below. And specifically delete the info about monitoring data being lost.

[Upgrading the JBoss ON Server and Components](https://access.redhat.com/site/documentation/en-US/Red_Hat_JBoss_Operations_Network/3.2/html/Installation_Guide/managing-servers.html#upgrade-proc)
    It is possible to migrate the historical monitoring data by running the upgrade with the --run-data-migrator later. However, any new monitoring data collected between the server upgrade and the data migration will be lost. 
    
[The rhqctl Control Script](https://access.redhat.com/site/documentation/en-US/Red_Hat_JBoss_Operations_Network/3.2/html/Admin_and_Config/control-scripts.html)
    Upgrades all JBoss ON services. This can be rerun to migrate historical metric data after a server migration.

These are small changes. However, due to the work remaining for BZ release notes, I'm going to have to push this to JON Async and do this along with the other tasks.

Comment 16 Jared MORGAN 2014-10-28 23:39:42 UTC
(In reply to Jared MORGAN from comment #15)

I've cloned this out into a docs bug so it can be separated from the Development work. Please discuss any docs related feedback in the cloned issue https://bugzilla.redhat.com/show_bug.cgi?id=1158247

Comment 18 Stefan Negrea 2014-10-29 15:28:29 UTC
I want to further clarify comment 12, while reiterating that comment 13 and comment 14 are NOT VALID issues.

There are two migration processes for metrics data:

1) Migration of data stored in SQL database before JON 3.2. The classes for that migration are in the org.rhq.server.metrics.migrator package.

2) Schema upgrade of data already stored in Cassandra. This process is invoked every time JON 3.3 is installed. If it is an upgrade from JON 3.1, the latest schema is installed; if it is an upgrade from JON 3.2, the Cassandra schema is upgraded to the latest. If it is a new install, the schema is installed in the same manner as for the upgrade from JON 3.1 to JON 3.3. The classes for this migration are in the org.rhq.cassandra.schema package.


Based on the classes mentioned in the logs from comment 13 and comment 14, the second migration process is invoked. It is a migration or upgrade of data already stored in Cassandra. Even if there is no data, this migration is done as part of an incremental storage node schema deployment. The baseline was established in JON 3.2, and JON 3.3 is just an upgrade over that. So every single install of JON gets the JON 3.2 schema first and is then upgraded incrementally to the latest available (in this case JON 3.3). This follows the same pattern as the SQL database.
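
The incremental deployment described here can be sketched as an ordered list of schema steps. The step labels and the `steps_to_apply` function are hypothetical. The actual logic lives in the org.rhq.cassandra.schema package and is not reproduced in this bug.

```python
# Ordered schema steps, oldest first. Labels are illustrative only.
SCHEMA_STEPS = ["3.2", "3.3"]


def steps_to_apply(installed_schema):
    """Return the schema steps still to run.

    installed_schema is None when no Cassandra schema exists yet, which in
    this sketch covers both a new install and an upgrade from JON 3.1.
    """
    if installed_schema is None:
        # Baseline first, then every incremental upgrade.
        return list(SCHEMA_STEPS)
    position = SCHEMA_STEPS.index(installed_schema)
    return SCHEMA_STEPS[position + 1:]


fresh_or_from_31 = steps_to_apply(None)
from_32 = steps_to_apply("3.2")
```

Under this model an upgrade from 3.1.2 always shows schema upgrade activity in the log, even with no metric data at all, which matches what the attached logs recorded.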

Please note the packages for the classes in the logs are not related to org.rhq.server.metrics.migrator. So the SQL migration process was never invoked and never ran.
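
A quick way to check which of the two processes appears in an upgrade log is to look for the package names mentioned above. The sketch below is an illustration only: the function name and the sample log lines, including their class names, are invented, and only the two package names come from this bug.

```python
SQL_MIGRATOR_PKG = "org.rhq.server.metrics.migrator"
SCHEMA_PKG = "org.rhq.cassandra.schema"


def migrations_seen(log_lines):
    """Return which migration processes are mentioned in the given log lines."""
    seen = set()
    for line in log_lines:
        if SQL_MIGRATOR_PKG in line:
            seen.add("sql-to-cassandra data migration")
        if SCHEMA_PKG in line:
            seen.add("cassandra schema upgrade")
    return seen


# Invented sample lines; the class names are hypothetical.
sample = [
    "21:40:01,100 INFO  [org.rhq.cassandra.schema.ExampleStep] applying step",
    "21:40:02,200 INFO  [org.rhq.server.control.command.Upgrade] done",
]
found = migrations_seen(sample)
```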

Based on comment 10 I am moving this BZ to MODIFIED to fully test the changes done in the context of comment 10 in CR1 release. No further changes are needed for comments 13 and comment 14 since they are NOT VALID issues.

Comment 19 Armine Hovsepyan 2014-10-29 19:47:08 UTC
Created attachment 951905 [details]
upgrade-with-migration.log

Comment 20 Armine Hovsepyan 2014-10-29 19:47:53 UTC
Thank you for the explanation, Stefan.
BZ is verified in JON 3.3 ER05. Both upgrade and upgrade-with-migration logs are attached.

