Bug 1479327 - OSP10 -> OSP11 upgrade: major upgrade composable step fails on composable roles deployment while running cinder-manage db sync
OSP10 -> OSP11 upgrade: major upgrade composable step fails on composable rol...
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates (Show other bugs)
11.0 (Ocata)
Unspecified Unspecified
high Severity urgent
: z3
: 11.0 (Ocata)
Assigned To: Sofer Athlan-Guyot
Marius Cornea
: Regression, Triaged, ZStream
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2017-08-08 07:24 EDT by Marius Cornea
Modified: 2017-10-31 13:37 EDT (History)
8 users (show)

See Also:
Fixed In Version: openstack-tripleo-heat-templates-6.2.1-2.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, upgrading a composable roles deployment running cinder-manage db sync would fail with duplicate columns in the cinder database. This was because cinder-manage db sync was run multiple times. With this update, the sync only runs only one controller. As a result, the cinder database schema successfully upgrades.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-31 13:37:35 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Launchpad 1709315 None None None 2017-08-08 09:22 EDT
OpenStack gerrit 494976 None None None 2017-08-18 03:11 EDT
Red Hat Product Errata RHBA-2017:3098 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 director Bug Fix Advisory 2017-10-31 17:33:28 EDT

  None (edit)
Description Marius Cornea 2017-08-08 07:24:50 EDT
Description of problem:
OSP10 -> OSP11 upgrade: major upgrade composable step fails on composable roles deployment while running cinder-manage db sync:

[stack@undercloud-0 ~]$ openstack stack failures list overcloud
overcloud.AllNodesDeploySteps.ControllerUpgrade_Step5.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 2c5739e7-6099-4830-98ef-eb610d747866
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
    TASK [Gathering Facts] *********************************************************
    ok: [localhost]
    
    TASK [Sync cinder DB] **********************************************************
    fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["cinder-manage", "db", "sync"], "delta": "0:00:01.579537", "end": "2017-08-08 11:10:19.235581", "failed": true, "rc": 1, "start": "2017-08-08 11:10:17.656044", "stderr": "Option \"logdir\" from group \"DEFAULT\" is deprecated. Use option \"log-dir\" from group \"DEFAULT\".", "stderr_lines": ["Option \"logdir\" from group \"DEFAULT\" is deprecated. Use option \"log-dir\" from group \"DEFAULT\"."], "stdout": "", "stdout_lines": []}
    	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/84534f12-7925-4c8c-96c8-b2fd69857f54_playbook.retry
    
    PLAY RECAP *********************************************************************
    localhost                  : ok=1    changed=0    unreachable=0    failed=1   
    
    (truncated, view all with --long)
  deploy_stderr: |

overcloud.AllNodesDeploySteps.ServiceApiUpgrade_Step5:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: f3714032-4cce-4f76-9797-18303b065d5e
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.CephStorageUpgrade_Step5:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: ae7b3aab-8b18-433a-bbff-97aa4148ce83
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted



Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-6.1.0-2.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP11 with custom role running systemd managed services

#!/bin/bash

timeout 100m openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
  -r /home/stack/composable_roles/roles/roles_data.yaml \
  -e /home/stack/composable_roles/roles/nodes.yaml \
  -e /home/stack/composable_roles/roles/network-config.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/services/sahara.yaml \
--environment-file /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/storage-environment.yaml \
-e /home/stack/composable_roles/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/composable_roles/network/network-environment.yaml \
-e /home/stack/composable_roles/enable-tls.yaml \
-e /home/stack/composable_roles/inject-trust-anchor.yaml \
-e /home/stack/composable_roles/public_vip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml \
-e /home/stack/composable_roles/debug.yaml \
--log-file overcloud_deployment_36.log

roles_data: http://paste.openstack.org/show/617763/

Actual results:
major upgrade composable step fails

Expected results:
Upgrade completes fine

Additional info:
Attaching sosreports.
Comment 2 Sofer Athlan-Guyot 2017-08-08 09:22:11 EDT
We have this exception in the cinder-manage.log of ctl0, 

2017-08-08 11:10:19.121 359275 INFO migrate.versioning.api [-] 89 -> 90...
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters [-] DBAPIError exception wrapped from (pymysql.err.InternalError) (1060, u"Duplicate column name 'race_preventer'") [SQL: u'\nALTER TABLE workers ADD race_preventer INTEGER NOT NULL DEFAULT 0']
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters context)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 841, in query
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1029, in _read_query_result
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters result.read()
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1312, in read
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet()
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 991, in _read_packet
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error()
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval)
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1060, u"Duplicate column name 'race_preventer'")
2017-08-08 11:10:19.137 359275 ERROR oslo_db.sqlalchemy.exc_filters
2017-08-08 11:10:19.140 359275 CRITICAL cinder [-] DBError: (pymysql.err.InternalError) (1060, u"Duplicate column name 'race_preventer'") [SQL: u'\nALTER TABLE workers ADD race_preventer INTEGER NOT NULL DEFAULT 0']

and we can see that the migration already happened on ctl1:

2017-08-08 11:10:18.959 349866 DEBUG migrate.versioning.repository [-] Config: OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), ('repository_id', 'cinder'), ('version_table', 'migrate_version'), ('required_dbs', '[]')]))]) __init__ /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83
2017-08-08 11:10:18.969 349866 INFO migrate.versioning.api [-] 79 -> 80...
2017-08-08 11:10:18.977 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.977 349866 INFO migrate.versioning.api [-] 80 -> 81...
2017-08-08 11:10:18.982 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.983 349866 INFO migrate.versioning.api [-] 81 -> 82...
2017-08-08 11:10:18.988 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.989 349866 INFO migrate.versioning.api [-] 82 -> 83...
2017-08-08 11:10:18.994 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:18.995 349866 INFO migrate.versioning.api [-] 83 -> 84...
2017-08-08 11:10:19.000 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.000 349866 INFO migrate.versioning.api [-] 84 -> 85...
2017-08-08 11:10:19.025 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.025 349866 INFO migrate.versioning.api [-] 85 -> 86...
2017-08-08 11:10:19.053 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.053 349866 INFO migrate.versioning.api [-] 86 -> 87...
2017-08-08 11:10:19.068 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.068 349866 INFO migrate.versioning.api [-] 87 -> 88...
2017-08-08 11:10:19.101 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.101 349866 INFO migrate.versioning.api [-] 88 -> 89...
2017-08-08 11:10:19.117 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.117 349866 INFO migrate.versioning.api [-] 89 -> 90...
2017-08-08 11:10:19.142 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.142 349866 INFO migrate.versioning.api [-] 90 -> 91...
2017-08-08 11:10:19.175 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.175 349866 INFO migrate.versioning.api [-] 91 -> 92...
2017-08-08 11:10:19.181 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.182 349866 INFO migrate.versioning.api [-] 92 -> 93...
2017-08-08 11:10:19.187 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.188 349866 INFO migrate.versioning.api [-] 93 -> 94...
2017-08-08 11:10:19.194 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.194 349866 INFO migrate.versioning.api [-] 94 -> 95...
2017-08-08 11:10:19.202 349866 INFO migrate.versioning.api [-] done
2017-08-08 11:10:19.202 349866 INFO migrate.versioning.api [-] 95 -> 96...
2017-08-08 11:10:19.210 349866 INFO migrate.versioning.api [-] done

hence the duplicate error.
Comment 3 Sofer Athlan-Guyot 2017-08-18 03:11:32 EDT
Backported to stable/ocata upstream.
Comment 7 errata-xmlrpc 2017-10-31 13:37:35 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3098

Note You need to log in before you can comment on or make changes to this bug.