Bug 1411856 - Upgrade 9->10->11 Fails: Migration cannot continue until all these have been migrated to the api database.
Summary: Upgrade 9->10->11 Fails: Migration cannot continue until all these have been ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: instack-undercloud
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 11.0 (Ocata)
Assignee: James Slagle
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks: 1413972
Reported: 2017-01-10 16:08 UTC by Yurii Prokulevych
Modified: 2017-05-17 19:56 UTC
CC List: 13 users

Fixed In Version: instack-undercloud-6.0.0-2.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1413972
Environment:
Last Closed: 2017-05-17 19:56:05 UTC


Attachments
undercloud upgrade log (101.78 KB, text/plain)
2017-01-10 16:08 UTC, Yurii Prokulevych


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2017:1245 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory | 2017-05-17 23:01:50 UTC
OpenStack gerrit | 420060 | None | None | None | 2017-01-13 16:33:57 UTC
OpenStack gerrit | 425323 | None | None | None | 2017-01-31 15:16:46 UTC

Description Yurii Prokulevych 2017-01-10 16:08:51 UTC
Created attachment 1239150
undercloud upgrade log

Description of problem:
-----------------------
Attempting to upgrade an undercloud at RHOS-10 (itself upgraded from RHOS-9) to RHOS-11 fails:

Notice: /Stage[main]/Glance::Deps/Anchor[glance::dbsync::end]: Triggered 'refresh' from 1 events
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns: An error has occurred:
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns: Traceback (most recent call last):
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1584, in main
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     ret = fn(*fn_args, **fn_kwargs)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 783, in sync
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     return migration.db_sync(version)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/nova/db/migration.py", line 26, in db_sync
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     return IMPL.db_sync(version=version, database=database, context=context)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/migration.py", line 57, in db_sync
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     repository, version)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/migrate/versioning/api.py", line 186, in upgrade
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     return _migrate(url, repository, version, upgrade=True, err=err, **opts)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "<string>", line 2, in _migrate
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/migrate/versioning/util/__init__.py", line 160, in with_engine
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     return f(*a, **kw)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/migrate/versioning/api.py", line 366, in _migrate
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     schema.runchange(ver, change, changeset.step)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/migrate/versioning/schema.py", line 93, in runchange
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     change.run(self.engine, step)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/migrate/versioning/script/py.py", line 148, in run
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     script_func(engine)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/migrate_repo/versions/345_require_online_migration_completion.py", line 44, in upgrade
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns:     raise exception.ValidationError(detail=msg)
Notice: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns: ValidationError: Migration cannot continue until all these have been migrated to the api database. Please run `nova-manage db online_migrations' on Newton code before continuing.There are still 8 unmigrated flavors.
Error: /usr/bin/nova-manage  db sync returned 1 instead of one of [0]
Error: /Stage[main]/Nova::Db::Sync/Exec[nova-db-sync]/returns: change from notrun to 0 failed: /usr/bin/nova-manage  db sync returned 1 instead of one of [0]
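
The traceback above ends in a "blocker" schema migration: a migration step that changes nothing itself but refuses to proceed while old-style rows remain. A minimal sketch of that pattern follows, using an in-memory SQLite database. The table name `flavors_legacy`, the `deleted` column, and all function names are hypothetical and are not taken from nova's source.

```python
import sqlite3


class ValidationError(Exception):
    """Stand-in for the exception type seen in the traceback."""


def count_unmigrated(conn):
    # Rows still live (not soft-deleted) in the legacy table count as unmigrated.
    row = conn.execute(
        "SELECT COUNT(*) FROM flavors_legacy WHERE deleted = 0"
    ).fetchone()
    return row[0]


def blocker_upgrade(conn):
    # The step performs no schema change; it only validates the data state.
    count = count_unmigrated(conn)
    if count:
        raise ValidationError(
            "Migration cannot continue: %d unmigrated flavors remain." % count
        )


def make_demo_db(unmigrated):
    # Hypothetical demo data: `unmigrated` live rows in the legacy table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flavors_legacy (id INTEGER PRIMARY KEY, deleted INTEGER)"
    )
    conn.executemany(
        "INSERT INTO flavors_legacy (deleted) VALUES (?)",
        [(0,)] * unmigrated,
    )
    return conn


try:
    blocker_upgrade(make_demo_db(8))
except ValidationError as exc:
    print(exc)
```

With zero live legacy rows, `blocker_upgrade` returns silently and the remaining schema migrations can run.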

***
Attempting to run the command suggested by the error message, `nova-manage db online_migrations`, failed:

nova-manage db online_migrations
usage: nova-manage db [-h]
                      
                      {archive_deleted_rows,null_instance_uuid_scan,online_data_migrations,sync,version}
                      ...
nova-manage db: error: argument action: invalid choice: 'online_migrations' (choose from 'archive_deleted_rows', 'null_instance_uuid_scan', 'online_data_migrations', 'sync', 'version')

***
Running the following command instead got past the error and allowed the undercloud upgrade to proceed:

nova-manage db online_data_migrations
Running batches of 50 until complete
8 rows matched query migrate_flavors, 8 migrated
8 rows matched query migrate_instance_keypairs, 8 migrated
8 rows matched query migrate_instances_add_request_spec, 0 migrated
1 rows matched query migrate_keypairs_to_api_db, 1 migrated
+---------------------------------------+--------------+-----------+
|               Migration               | Total Needed | Completed |
+---------------------------------------+--------------+-----------+
| aggregate_uuids_online_data_migration |      0       |     0     |
| migrate_aggregate_reset_autoincrement |      0       |     0     |
|           migrate_aggregates          |      0       |     0     |
|   migrate_flavor_reset_autoincrement  |      0       |     0     |
|            migrate_flavors            |      0       |     0     |
|   migrate_instance_groups_to_api_db   |      0       |     0     |
|       migrate_instance_keypairs       |      0       |     0     |
|   migrate_instances_add_request_spec  |      0       |     0     |
|       migrate_keypairs_to_api_db      |      0       |     0     |
+---------------------------------------+--------------+-----------+
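
The summary table printed at the end of the run can be checked mechanically before retrying the upgrade. The sketch below parses text in the layout shown above and reports any migration whose "Total Needed" exceeds "Completed". The function name `pending_migrations` is hypothetical, the sample rows are made up for illustration, and the table layout is assumed to match this report rather than any documented, stable format.

```python
def pending_migrations(output):
    """Return {name: (needed, completed)} for rows where needed > completed."""
    pending = {}
    for line in output.splitlines():
        line = line.strip()
        # Data rows look like "| name | 0 | 0 |"; skip borders and other text.
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 3 or not cells[1].isdigit() or not cells[2].isdigit():
            continue  # header row or malformed line
        needed, completed = int(cells[1]), int(cells[2])
        if needed > completed:
            pending[cells[0]] = (needed, completed)
    return pending


# Made-up sample in the same layout as the table above.
sample = """\
+----------------------------+--------------+-----------+
|         Migration          | Total Needed | Completed |
+----------------------------+--------------+-----------+
|      migrate_flavors       |      8       |     8     |
| migrate_keypairs_to_api_db |      1       |     0     |
+----------------------------+--------------+-----------+
"""
print(pending_migrations(sample))
```

An empty result for the real output suggests nothing was left behind by the last batch.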

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-nova-placement-api-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-console-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-scheduler-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
python-novaclient-6.0.0-1.el7ost.noarch
openstack-nova-common-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-novncproxy-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
python-nova-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-cells-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-cert-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
puppet-nova-10.1.0-0.20161216185556.4801dd0.el7ost.noarch
openstack-nova-conductor-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-network-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-api-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
python-nova-tests-15.0.0-0.20161220035916.5eb3144.el7ost.noarch
openstack-nova-compute-15.0.0-0.20161220035916.5eb3144.el7ost.noarch


openstack-tripleo-heat-templates-compat-5.1.1-0.20161219183418.6ef0417.el7ost.noarch
openstack-heat-templates-0.0.1-0.20161213042949.d10069a.el7ost.noarch
openstack-tripleo-heat-templates-6.0.0-0.20161220000655.58d711e.el7ost.noarch


Steps to Reproduce:
-------------------
1. Upgrade a RHOS-9 setup (3 controllers + 2 computes + 3 ceph) to RHOS-10
2*. During the upgrade, a VM on the overcloud was migrated between computes
3. Try to upgrade the undercloud to RHOS-11

    openstack undercloud upgrade 

Actual results:
---------------
Undercloud upgrade fails

Expected results:
-----------------
Undercloud upgrade succeeds 


Additional info:
----------------
Virtual setup - 3 controllers + 2 computes + 3 ceph

Comment 1 Sven Anderson 2017-01-13 18:54:00 UTC
Undercloud upgrades are done with the same puppet files as the installs, so no special processing is applied. Therefore, after upgrading to OSP 10, no "nova-manage db online_data_migrations" is executed after the "nova-manage db sync", which is mandatory if an upgrade happened. This makes a manual "nova-manage db online_data_migrations" necessary.
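
The ordering described here, a schema sync followed by the online data migrations while still on the Newton (OSP 10) code, could be scripted roughly as follows. This is a sketch only: the two `nova-manage` subcommands are the ones that appear in this report, while the `run_newton_db_steps` helper and its `dry_run` flag are hypothetical and are not part of instack-undercloud or python-tripleoclient.

```python
import subprocess

# Subcommands as they appear in this report; intended to run on Newton code.
NEWTON_DB_STEPS = [
    ["nova-manage", "db", "sync"],
    ["nova-manage", "db", "online_data_migrations"],
]


def run_newton_db_steps(dry_run=True):
    """Run the steps in order, or with dry_run=True only return them."""
    executed = []
    for cmd in NEWTON_DB_STEPS:
        if not dry_run:
            # check=True stops the sequence if a step exits non-zero.
            subprocess.run(cmd, check=True)
        executed.append(" ".join(cmd))
    return executed


for step in run_newton_db_steps():
    print(step)
```

The dry-run default keeps the sketch safe to execute on a machine without nova installed.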

Comment 2 Marius Cornea 2017-01-16 09:18:25 UTC
(In reply to Sven Anderson from comment #1)
> Undercloud upgrades are done with the same puppet files as the installs, so
> no special processing is applied. Therefore, after upgrading to OSP 10, no
> "nova-manage db online_data_migrations" is executed after the "nova-manage
> db sync", which is mandatory if an upgrade happened. This makes a manual
> "nova-manage db online_data_migrations" necessary.

I think this should be executed automatically during the undercloud upgrade flow and the operator shouldn't be required to run the "nova-manage db" command manually.

Comment 3 Marios Andreou 2017-01-16 10:58:42 UTC
hi o/ just poked at this a bit (someone must have added me to cc as it appeared in my inbox even though it isn't tagged DFG:Upgrades).

So to be clear, because we've had other nova-related migration issues (cells API) that were specific to 10->11 at BZ 1409857, this bz is specific to environments starting at OSP9. I'm not clear on the details of why (i.e. if you start at 10 then you don't need to run "nova-manage db online_data_migrations"). I am also not clear if the 'migrate vms as part of the upgrade' in comment #0 is necessary to reproduce this.

All that being said, I think we can easily add this. The db syncs usually happen at https://github.com/openstack/instack-undercloud/blob/554977801aad85f3ac08e16762fcdc8550c95754/elements/puppet-stack-config/puppet-stack-config.pp#L26 but we want this to happen only on upgrade, and then only on Newton. In that case, probably the easiest place to add it is at https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/undercloud.py#L54, after the undercloud has been upgraded to stable/newton.

I filed an LP bug at https://bugs.launchpad.net/tripleo/+bug/1656791 so that we aren't pointing at the BZ in the upstream repos; it points here. I then put up a one-liner at https://review.openstack.org/#/c/420637/ to add the migration to the client for now. Let's see what folks think of that.

One last note: this BZ is assigned to puppet-nova and is being used to track the nova fix-up for s/online_migrations/online_data_migrations/ at https://review.openstack.org/#/c/420060/. Should we be tracking the fix-up during the undercloud upgrade in a different BZ (which could also be tagged DFG:Upgrades)? That is, should we clone this bug for the fix I put out, as per @ansiwen's comment #1 and mcornea's comment #2?


thanks, marios

Comment 4 Marios Andreou 2017-01-16 14:51:02 UTC
After discussion on the Upgrade scrum just now, I'll clone this BZ and in fact move the db-sync to the instack puppet-stack-config.pp I pointed to in comment #3. I'm going to remove the upgrades-related changes from this BZ now.

Comment 5 Marios Andreou 2017-01-17 13:23:33 UTC
Cloned this to BZ 1413972 so it can be assigned DFG:Upgrades and we can track the things we need in instack-undercloud, puppet-nova, etc. This BZ can then stay DFG:Compute and track the change in nova for the wrong error text.

Comment 6 Stephen Gordon 2017-01-19 13:38:41 UTC
(In reply to marios from comment #3)
> So to be clear, because we've had other nova-related migration issues (cells
> API) that were specific to 10->11 at BZ 1409857, this bz is specific to
> environments starting at OSP9. I'm not clear on the details of why (i.e. if
> you start at 10 then you don't need to run "nova-manage db
> online_data_migrations"). I am also not clear if the 'migrate vms as part of
> the upgrade' in comment #0 is necessary to reproduce this.

We would have to dig into exactly which online data migration we are talking about but the reason is likely that 10 no longer creates "old" records that need to be migrated, but a 10 upgraded from 9 where no online data migration has been run still has "old" records that need to be migrated, while 11 only understands the "new" records as it's assumed data migration has been completed.

Data migrations used to occur as part of the db sync offline migrations, but for large installations this inherently meant large outages while every row in an impacted table was modified. The idea behind online data migrations is to do the schema migration offline (which is generally very quick, especially given the rules that are enforced in terms of what schema migrations are allowed) and then allow the system to migrate the data online as it touches each record. The online_data_migrations command exists for the operator to "force" migrate any records that remain untouched as part of this process - they would do this as a pre-req step during the 11 upgrade process (or of course we can do it for them).

Dan explains some of the thought process here:

    http://www.danplanet.com/blog/2015/10/07/upgrades-in-nova-database-migrations/
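
The scheme described in this comment can be illustrated with a toy model: records are converted lazily when they are read, and a batch routine force-migrates whatever has not been touched. All names below (`Store`, `get`, `migrate_batch`, `migrate_all`) are hypothetical and only mimic the idea; this is not nova's implementation.

```python
class Store:
    """Toy store whose 'old' records are converted to 'new' on access."""

    def __init__(self, old_records):
        self.old = dict(old_records)   # legacy location
        self.new = {}                  # target location

    def _migrate(self, key):
        self.new[key] = self.old.pop(key)

    def get(self, key):
        # Online migration: touching an old record moves it as a side effect.
        if key in self.old:
            self._migrate(key)
        return self.new[key]

    def migrate_batch(self, max_count):
        """Force-migrate up to max_count untouched records.

        Returns (matched, migrated) for this batch.
        """
        keys = list(self.old)[:max_count]
        for key in keys:
            self._migrate(key)
        return len(keys), len(keys)


def migrate_all(store, batch_size=50):
    # Run batches until one finds nothing left to do.
    total = 0
    while True:
        matched, migrated = store.migrate_batch(batch_size)
        if matched == 0:
            return total
        total += migrated


store = Store({"flavor-%d" % i: {"vcpus": 1} for i in range(8)})
store.get("flavor-0")        # migrated lazily by normal use
print(migrate_all(store))    # count of records the forced run had to migrate
```

A release that only reads the new location is safe to deploy once the old location is empty, which is the condition the failing `db sync` step enforces.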

Comment 10 errata-xmlrpc 2017-05-17 19:56:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

