Description of problem:
OSP10 -> OSP11 upgrade: the major upgrade composable step fails on a composable-roles deployment while running cinder-manage db sync:

[stack@undercloud-0 ~]$ openstack stack failures list overcloud
overcloud.AllNodesDeploySteps.ControllerUpgrade_Step5.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 2c5739e7-6099-4830-98ef-eb610d747866
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
    TASK [Gathering Facts] *********************************************************
    ok: [localhost]

    TASK [Sync cinder DB] **********************************************************
    fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["cinder-manage", "db", "sync"], "delta": "0:00:01.579537", "end": "2017-08-08 11:10:19.235581", "failed": true, "rc": 1, "start": "2017-08-08 11:10:17.656044", "stderr": "Option \"logdir\" from group \"DEFAULT\" is deprecated. Use option \"log-dir\" from group \"DEFAULT\".", "stderr_lines": ["Option \"logdir\" from group \"DEFAULT\" is deprecated. Use option \"log-dir\" from group \"DEFAULT\"."], "stdout": "", "stdout_lines": []}
        to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/84534f12-7925-4c8c-96c8-b2fd69857f54_playbook.retry

    PLAY RECAP *********************************************************************
    localhost                  : ok=1    changed=0    unreachable=0    failed=1

    (truncated, view all with --long)
  deploy_stderr: |

overcloud.AllNodesDeploySteps.ServiceApiUpgrade_Step5:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: f3714032-4cce-4f76-9797-18303b065d5e
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.CephStorageUpgrade_Step5:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: ae7b3aab-8b18-433a-bbff-97aa4148ce83
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-6.1.0-2.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP11 with a custom role running systemd-managed services:

#!/bin/bash
timeout 100m openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
-r /home/stack/composable_roles/roles/roles_data.yaml \
-e /home/stack/composable_roles/roles/nodes.yaml \
-e /home/stack/composable_roles/roles/network-config.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e /home/stack/composable_roles/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/composable_roles/network/network-environment.yaml \
-e /home/stack/composable_roles/enable-tls.yaml \
-e /home/stack/composable_roles/inject-trust-anchor.yaml \
-e /home/stack/composable_roles/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml \
-e /home/stack/composable_roles/debug.yaml \
--log-file overcloud_deployment_36.log

roles_data: http://paste.openstack.org/show/617763/

Actual results:
The major upgrade composable step fails.

Expected results:
The upgrade completes successfully.

Additional info:
Attaching sosreports.
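The deploy_stdout above is truncated; the full failure output can be retrieved with the --long flag the command itself suggests, and the underlying migration error ends up in cinder's own log on the controller that failed the "Sync cinder DB" task (standard log location assumed):

[stack@undercloud-0 ~]$ openstack stack failures list overcloud --long
[heat-admin@controller ~]$ sudo less /var/log/cinder/cinder-manage.log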
We have this exception in the cinder-manage.log of ctl0:

2017-08-08 11:10:19.121 359275 INFO migrate.versioning.api [-] 89 -> 90...
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters [-] DBAPIError exception wrapped from (pymysql.err.InternalError) (1060, u"Duplicate column name 'race_preventer'") [SQL: u'\nALTER TABLE workers ADD race_preventer INTEGER NOT NULL DEFAULT 0']
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     context)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     cursor.execute(statement, parameters)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     result = self._query(query)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     conn.query(q)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 841, in query
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1029, in _read_query_result
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     result.read()
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1312, in read
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     first_packet = self.connection._read_packet()
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 991, in _read_packet
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     packet.check_error()
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     err.raise_mysql_exception(self._data)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters     raise errorclass(errno, errval)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1060, u"Duplicate column name 'race_preventer'")
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters
2017-08-08 11:10:19.140 359275 CRITICAL cinder [-] DBError: (pymysql.err.InternalError) (1060, u"Duplicate column name 'race_preventer'") [SQL: u'\nALTER TABLE workers ADD race_preventer INTEGER NOT NULL DEFAULT 0']
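To confirm the schema state directly, one can check on the controller whether the column added by migration 90 is already present and which version sqlalchemy-migrate has recorded (a quick check, assuming root access to the local MariaDB/Galera node; database, table, and column names are taken from the error above):

# mysql cinder -e "SHOW COLUMNS FROM workers LIKE 'race_preventer';"
# mysql cinder -e "SELECT * FROM migrate_version;"
# cinder-manage db version

If the column is already there and migrate_version reports 90 or higher, the ALTER TABLE attempted from ctl0 is bound to fail exactly as above.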
And we can see that the migration had already happened on ctl1:

2017-08-08 11:10:18.959 349866 DEBUG migrate.versioning.repository [-] Config: OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), ('repository_id', 'cinder'), ('version_table', 'migrate_version'), ('required_dbs', '[]')]))]) __init__ /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83
2017-08-08 11:10:18.969 349866 INFO migrate.versioning.api [-] 79 -> 80...
2017-08-08 11:10:18.977 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.977 349866 INFO migrate.versioning.api [-] 80 -> 81...
2017-08-08 11:10:18.982 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.983 349866 INFO migrate.versioning.api [-] 81 -> 82...
2017-08-08 11:10:18.988 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.989 349866 INFO migrate.versioning.api [-] 82 -> 83...
2017-08-08 11:10:18.994 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.995 349866 INFO migrate.versioning.api [-] 83 -> 84...
2017-08-08 11:10:19.000 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.000 349866 INFO migrate.versioning.api [-] 84 -> 85...
2017-08-08 11:10:19.025 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.025 349866 INFO migrate.versioning.api [-] 85 -> 86...
2017-08-08 11:10:19.053 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.053 349866 INFO migrate.versioning.api [-] 86 -> 87...
2017-08-08 11:10:19.068 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.068 349866 INFO migrate.versioning.api [-] 87 -> 88...
2017-08-08 11:10:19.101 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.101 349866 INFO migrate.versioning.api [-] 88 -> 89...
2017-08-08 11:10:19.117 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.117 349866 INFO migrate.versioning.api [-] 89 -> 90...
2017-08-08 11:10:19.142 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.142 349866 INFO migrate.versioning.api [-] 90 -> 91...
2017-08-08 11:10:19.175 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.175 349866 INFO migrate.versioning.api [-] 91 -> 92...
2017-08-08 11:10:19.181 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.182 349866 INFO migrate.versioning.api [-] 92 -> 93...
2017-08-08 11:10:19.187 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.188 349866 INFO migrate.versioning.api [-] 93 -> 94...
2017-08-08 11:10:19.194 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.194 349866 INFO migrate.versioning.api [-] 94 -> 95...
2017-08-08 11:10:19.202 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.202 349866 INFO migrate.versioning.api [-] 95 -> 96...
2017-08-08 11:10:19.210 349866 INFO migrate.versioning.api [-] done

Both controllers ran "cinder-manage db sync" concurrently during the upgrade step; ctl1 got to the 89 -> 90 migration (which adds the race_preventer column) first, so when ctl0 attempted the same ALTER TABLE it hit the duplicate-column error.
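The db sync is a one-time action, so the underlying problem is that the composable upgrade step launches it on every controller in parallel instead of on exactly one node. As an illustrative sketch only (not the actual patch), the guard the upgrade task needs would look roughly like this, using the bootstrap_nodeid hiera key that TripleO sets on overcloud nodes:

# sketch: run the schema migration only on the bootstrap node;
# every other controller skips the sync entirely
bootstrap_node=$(hiera bootstrap_nodeid | tr '[:upper:]' '[:lower:]')
if [ "$bootstrap_node" = "$(hostname -s | tr '[:upper:]' '[:lower:]')" ]; then
    cinder-manage db sync
fi

Restricting the sync to a single node in this way would remove the race seen here between ctl0 and ctl1.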
Backported to stable/ocata upstream.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3098