Bug 1505069 - heat-manage db_sync returns table and column collisions during upgrade
Summary: heat-manage db_sync returns table and column collisions during upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: instack-undercloud
Version: 9.0 (Mitaka)
Hardware: Unspecified
OS: Unspecified
urgent
urgent
Target Milestone: Upstream M2
: 13.0 (Queens)
Assignee: Emilien Macchi
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-10-21 18:58 UTC by Andrew Ludwar
Modified: 2018-06-27 13:39 UTC (History)
14 users (show)

Fixed In Version: instack-undercloud-8.1.1-0.20171206155308.5565e25.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:38:59 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 514777 0 None MERGED heat-dbsync: increase timeout to 15 minutes 2020-07-15 12:11:17 UTC
Red Hat Knowledge Base (Solution) 3221491 0 None None None 2017-10-21 19:28:42 UTC
Red Hat Product Errata RHEA-2018:2086 0 None None None 2018-06-27 13:39:58 UTC

Description Andrew Ludwar 2017-10-21 18:58:06 UTC
Description of problem:

Customer is doing a major upgrade from OSP9-->OSP10. They just completed a major upgrade from OSP8-->OSP9 and immediately moved into an upgrade from OSP9-->OSP10. They are currently trying to perform the "openstack undercloud upgrade" step [1] and are encountering errors in the heat db_sync.


[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/9/html-single/upgrading_red_hat_openstack_platform/#sect-Major-Updating_Director_Packages


Version-Release number of selected component (if applicable):

OSP9

$ grep openstack-heat installed-rpms 
openstack-heat-api-7.0.5-1.el7ost.noarch                    Fri Oct 20 23:34:07 2017
openstack-heat-api-cfn-7.0.5-1.el7ost.noarch                Fri Oct 20 23:34:07 2017
openstack-heat-api-cloudwatch-7.0.5-1.el7ost.noarch         Fri Oct 20 23:34:07 2017
openstack-heat-common-7.0.5-1.el7ost.noarch                 Fri Oct 20 23:31:07 2017
openstack-heat-engine-7.0.5-1.el7ost.noarch                 Fri Oct 20 23:34:07 2017
openstack-heat-templates-0-0.12.1e6015dgit.el7ost.noarch    Fri Oct 20 23:28:46 2017


How reproducible:

Everytime

Steps to Reproduce:
1. Run "opsenstack undercloud upgrade"
2. Use workaround described in https://bugzilla.redhat.com/show_bug.cgi?id=1420351 to drop the raw_templates_files table;
3. Manually run "heat-manage --config-file /etc/heat/heat.conf db_sync"

Actual results:

Original error, which was found in above BZ:
$ sudo heat-manage --config-file /etc/heat/heat.conf db_sync
2017-10-21 13:48:10.334 29119 INFO migrate.versioning.api [-] 71 -> 72...
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters [-] DBAPIError exception wrapped from (pymysql.err.InternalError) (1050, u"Table 'raw_template_files' already exists") [SQL: u'\nCREATE TABLE raw_template_files (\n\tid INTEGER NOT NULL AUTO_INCREMENT, \n\tfiles LONGTEXT, \n\tcreated_at DATETIME, \n\tupdated_at DATETIME, \n\tPRIMARY KEY (id)\n)ENGINE=InnoDB CHARSET=utf8\n\n']
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     context)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     cursor.execute(statement, parameters)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     result = self._query(query)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     conn.query(q)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 841, in query
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1029, in _read_query_result
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     result.read()
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1312, in read
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     first_packet = self.connection._read_packet()
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 991, in _read_packet
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     packet.check_error()
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     err.raise_mysql_exception(self._data)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters     raise errorclass(errno, errval)
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1050, u"Table 'raw_template_files' already exists")
2017-10-21 13:48:10.338 29119 ERROR oslo_db.sqlalchemy.exc_filters
ERROR: (pymysql.err.InternalError) (1050, u"Table 'raw_template_files' already exists") [SQL: u'\nCREATE TABLE raw_template_files (\n\tid INTEGER NOT NULL AUTO_INCREMENT, \n\tfiles LONGTEXT, \n\tcreated_at DATETIME, \n\tupdated_at DATETIME, \n\tPRIMARY KEY (id)\n)ENGINE=InnoDB CHARSET=utf8\n\n']

After workaround steps of dropping the raw_template_files table:

sudo heat-manage --config-file /etc/heat/heat.conf db_sync
2017-10-21 14:00:58.417 31627 INFO migrate.versioning.api [-] 71 -> 72...
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters [-] DBAPIError exception wrapped from (pymysql.err.InternalError) (1060, u"Duplicate column name 'files_id'") [SQL: u'\nALTER TABLE raw_template ADD files_id INTEGER']
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     context)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     cursor.execute(statement, parameters)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     result = self._query(query)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     conn.query(q)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 841, in query
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1029, in _read_query_result
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     result.read()
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1312, in read
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     first_packet = self.connection._read_packet()
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 991, in _read_packet
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     packet.check_error()
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     err.raise_mysql_exception(self._data)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters     raise errorclass(errno, errval)
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1060, u"Duplicate column name 'files_id'")
2017-10-21 14:00:58.450 31627 ERROR oslo_db.sqlalchemy.exc_filters
ERROR: (pymysql.err.InternalError) (1060, u"Duplicate column name 'files_id'") [SQL: u'\nALTER TABLE raw_template ADD files_id INTEGER']


Expected results:

"openstack undercloud upgrade" completes successfully.

Additional info:

Prior to db_sync we also ran purge_deleted:

# heat-manage purge_deleted -g days 2

Heat DB is larger than usual, disk IO on the Director seems OK though.

# du -hs /var/lib/mysql/*
16K     /var/lib/mysql/aria_log.00000001
4.0K    /var/lib/mysql/aria_log_control
21G     /var/lib/mysql/ceilometer
3.1M    /var/lib/mysql/glance
3.1G    /var/lib/mysql/heat
26M     /var/lib/mysql/ibdata1
5.0M    /var/lib/mysql/ib_logfile0
5.0M    /var/lib/mysql/ib_logfile1
1.3M    /var/lib/mysql/ironic
457M    /var/lib/mysql/keystone
14M     /var/lib/mysql/mistral
1012K   /var/lib/mysql/mysql
0       /var/lib/mysql/mysql.sock
23M     /var/lib/mysql/neutron
23M     /var/lib/mysql/nova
2.8M    /var/lib/mysql/nova_api
212K    /var/lib/mysql/performance_schema
0       /var/lib/mysql/test

# du -hs /var/lib/mysql/heat/*
4.0K    /var/lib/mysql/heat/db.opt
16K     /var/lib/mysql/heat/event.frm
81M     /var/lib/mysql/heat/event.ibd
12K     /var/lib/mysql/heat/migrate_version.frm
96K     /var/lib/mysql/heat/migrate_version.ibd
12K     /var/lib/mysql/heat/raw_template_files.frm
96K     /var/lib/mysql/heat/raw_template_files.ibd
12K     /var/lib/mysql/heat/raw_template.frm
3.0G    /var/lib/mysql/heat/raw_template.ibd
12K     /var/lib/mysql/heat/resource_data.frm
560K    /var/lib/mysql/heat/resource_data.ibd
16K     /var/lib/mysql/heat/resource.frm
17M     /var/lib/mysql/heat/resource.ibd
12K     /var/lib/mysql/heat/service.frm
96K     /var/lib/mysql/heat/service.ibd
16K     /var/lib/mysql/heat/snapshot.frm
128K    /var/lib/mysql/heat/snapshot.ibd
12K     /var/lib/mysql/heat/software_config.frm
16M     /var/lib/mysql/heat/software_config.ibd
16K     /var/lib/mysql/heat/software_deployment.frm
12M     /var/lib/mysql/heat/software_deployment.ibd
20K     /var/lib/mysql/heat/stack.frm
12M     /var/lib/mysql/heat/stack.ibd
12K     /var/lib/mysql/heat/stack_lock.frm
96K     /var/lib/mysql/heat/stack_lock.ibd
12K     /var/lib/mysql/heat/stack_tag.frm
112K    /var/lib/mysql/heat/stack_tag.ibd
12K     /var/lib/mysql/heat/sync_point.frm
112K    /var/lib/mysql/heat/sync_point.ibd
16K     /var/lib/mysql/heat/user_creds.frm
96K     /var/lib/mysql/heat/user_creds.ibd
12K     /var/lib/mysql/heat/watch_data.frm
112K    /var/lib/mysql/heat/watch_data.ibd
12K     /var/lib/mysql/heat/watch_rule.frm
112K    /var/lib/mysql/heat/watch_rule.ibd


$ sudo iostat -mdx 1
Linux 3.10.0-693.2.2.el7.x86_64 (hostname) 	10/21/2017 	_x86_64_	(8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
vda               0.25     0.63   47.63   31.13     6.07     0.38   167.79     2.71   34.37    5.08   79.20   0.99   7.78
dm-0              0.00     0.00    0.01    0.00     0.00     0.00    30.12     0.00    0.00    0.00    0.00   0.00   0.00

Comment 1 Andrew Ludwar 2017-10-21 21:18:27 UTC
Updated the attached KCS article to reflect the steps taken to fix the issue. Essentially, they came down to these, but note these steps are very specific for where in the db_sync the failure occurred. These steps might not apply for every single db_sync failure.

# mysql
MariaDB [none]> use heat;
MariaDB [heat]> alter table raw_template drop key raw_tmpl_files_fkey_ref;"
MariaDB [heat]> alter table raw_template drop foreign key raw_tmpl_files_fkey_ref;"
MariaDB [heat]> alter table raw_template drop column files_id;"
MariaDB [heat]> drop table raw_template_files;"

Additionally, the puppet timeout can be increased to allow for a larger heat DB time to sync:

For future deployments, you could edit etc/puppet/manifests/puppet-stack-config.pp and change:

Exec<| title == 'heat-dbsync' |> { refreshonly => false }

to have:

Exec<| title == 'heat-dbsync' |> { refreshonly => false, timeout => 900 }

So the timeout will be set to 15 min for the Heat dbsync command.
Keep in mind the undercloud upgrade command override this file, so you might need to edit the file right after running the upgrade command when instack-undercloud package is upgraded. You can also stop the Puppet run with CTLR+C before the Heat dbsync resource, edit the file, and run the upgrade command again.

Comment 2 Thomas Hervé 2017-10-24 10:06:51 UTC
Reassigning, as the fix is to increase the timeout. Emilien can you ack? Thanks.

Comment 9 Gurenko Alex 2018-04-30 12:46:28 UTC
Verified on puddle 2018-04-26.3

[stack@undercloud-0 ~]$ rpm -q instack-undercloud
instack-undercloud-8.4.1-2.el7ost.noarch

2018-04-30 06:11:31,872 INFO:
#############################################################################
Undercloud upgrade complete.

openstack undercloud upgrade completed without any issues

Comment 12 errata-xmlrpc 2018-06-27 13:38:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086


Note You need to log in before you can comment on or make changes to this bug.