Description of problem:
OSP10 -> OSP11 upgrade fails when Nova services are running on a standalone role.

roles_data file: http://paste.openstack.org/show/600798/

Upgrade fails during major-upgrade-composable-steps.yaml with the following error:

stdout:
overcloud.AllNodesDeploySteps.ControllerUpgrade_Step2:
  resource_type: OS::Heat::SoftwareDeploymentGroup
  physical_resource_id: 170d8e1d-58e0-4720-8149-a9fd4f2b9e1d
  status: CREATE_FAILED
  status_reason: |
    CREATE aborted
overcloud.AllNodesDeploySteps.NovacontrolUpgrade_Step5.0:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 5cf72dbf-9b22-4f63-8b98-25061864df35
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
    TASK [Run puppet apply to set tranport_url in nova.conf] ***********************
    changed: [localhost]

    TASK [Setup cell_v2 (map cell0)] ***********************************************
    fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["nova-manage", "cell_v2", "map_cell0"], "delta": "0:02:12.569490", "end": "2017-02-28 15:41:23.802908", "failed": true, "rc": 1, "start": "2017-02-28 15:39:11.233418", "stderr": "", "stdout": "...", "warnings": []}

    An error has occurred:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1594, in main
        ret = fn(*fn_args, **fn_kwargs)
      File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1140, in map_cell0
        self._map_cell0(database_connection=database_connection)
      File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1170, in _map_cell0
        cell_mapping.create()
      File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
        return fn(self, *args, **kwargs)
      File "/usr/lib/python2.7/site-packages/nova/objects/cell_mapping.py", line 71, in create
        db_mapping = self._create_in_db(self._context, self.obj_get_changes())
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 893, in wrapper
        with self._transaction_scope(context):
      File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
        return self.gen.next()
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 944, in _transaction_scope
        allow_async=self._allow_async) as resource:
      File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
        return self.gen.next()
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 558, in _session
        bind=self.connection, mode=self.mode)
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 317, in _create_session
        self._start()
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 403, in _start
        engine_args, maker_args)
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 427, in _setup_for_connection
        sql_connection=sql_connection, **engine_kwargs)
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 155, in create_engine
        test_conn = _test_connection(engine, max_retries, retry_interval)
      File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 339, in _test_connection
        six.reraise(type(de_ref), de_ref)
      File "<string>", line 2, in reraise
    DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.13' ([Errno 113] EHOSTUNREACH)")

    to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/b106b80f-8c24-4896-98d3-06ddf74f7508_playbook.retry

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy an OSP10 overcloud with a standalone role running the Nova control plane services
2. Upgrade the overcloud from OSP10 to OSP11

Actual results:
The upgrade fails while running the "Setup cell_v2 (map cell0)" step.

Expected results:
The upgrade succeeds.

Additional info:
172.17.1.13 is the Internal API VIP; it cannot be reached because the cluster is not running when this step runs.
Got a successful run in CI, so moving this one to POST. Checking if it's still working with the latest puddle.
(In reply to Sofer Athlan-Guyot from comment #2)
> Got a successful run in CI, so moving this one to POST. Checking if it's
> still working with the latest puddle.

I wasn't able to reproduce this issue with the latest puddle. I think we're good on this one.
Adding compute for visibility.
Removing compute, as it's unrelated. The pcs cluster is not started, which makes the database migration fail, since the VIP configured in nova::cell0_database_connection isn't reachable. But this is happening at step5, while all the databases should already be back up by step4.
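To double-check that analysis on an affected node, one can compare the connection string the step will use against the actual cluster state at that moment. A sketch, assuming the usual TripleO hiera layout on the node; the hiera key name is the one from the comment above:

    # Which cell0 connection string did this node get from hiera?
    sudo hiera -c /etc/puppet/hiera.yaml nova::cell0_database_connection

    # Is the pacemaker cluster running at this point of the upgrade?
    sudo pcs cluster status

    # Does any interface currently hold the Internal API VIP?
    ip addr | grep 172.17.1.13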
Hi,

so the upgrade of the custom role Novacontrol is happening at the same time as the upgrade of the controller node. Prefixing Novacontrol logs with N and controller logs with C:

- C: step0: Apr 03 09:08:55
- N: step0: Apr 03 09:07:36
- N: step1: Apr 03 09:08:27
- N: step2: Apr 03 09:08:52
- C: step1: Apr 03 09:13:56
- N: step3: Apr 03 09:14:24
- N: step4: Apr 03 09:14:40
- C: step2: Apr 03 09:15:01
- N: step5: Apr 03 09:17:28
- C: step3: Apr 03 09:20:48
- C: step4: never happened
- C: step5: never happened

So the Novacontrol role had time to reach step5 while the controller was still at step3. We shouldn't have this kind of intermixed upgrade happening. Will check further on why this happens.
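For reference, the per-step timestamps above can be pulled from each node. A rough sketch, assuming the step transitions show up in the os-collect-config journal on this release (the exact log pattern may differ):

    # When did each upgrade step run on this node?
    sudo journalctl -u os-collect-config | grep -i step

    # The generated per-step Ansible playbooks also live under the
    # directory shown in the retry hint of the original report:
    ls -lt /var/lib/heat-config/heat-config-ansible/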
In stable/ocata.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245