Description of problem:
The client intermittently hits the following failure while deploying the overcloud. Because it occurs at random, it may be a race condition. The full stack trace will be posted in the next private comment because it contains PII.

TASK [Discovering nova hosts] **************************************************
Tuesday 04 February 2020 12:50:12 +0000 (0:00:00.473) 0:00:04.655 ******
fatal: [IP.33 -> IP.21]: FAILED! => {"changed": false, "cmd": ["docker", "exec", "nova_compute", "nova-manage", "cell_v2", "discover_hosts", "--by-service"], "delta": "0:00:03.084736", "end": "2020-02-04 12:50:15.893532", "msg": "non-zero return code", "rc": 1, "start": "2020-02-04 12:50:12.808796", "stderr": "", "stderr_lines": [], "stdout": "An error has occurred:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1657, in main
    ret = fn(*fn_args, **fn_kwargs)
  File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1323, in discover_hosts
    by_service)
  File "/usr/lib/python2.7/site-packages/nova/objects/host_mapping.py", line 265, in discover_hosts
    by_service)
  File "/usr/lib/python2.7/site-packages/nova/objects/host_mapping.py", line 224, in _check_and_create_host_mappings
    status_fn)
  File "/usr/lib/python2.7/site-packages/nova/objects/host_mapping.py", line 211, in _check_and_create_service_host_mappings
    host_mapping.create()
  File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
    return fn(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/objects/host_mapping.py", line 114, in create
    db_mapping = self._create_in_db(self._context, changes)
  File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 988, in wrapper
    return fn(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/objects/host_mapping.py", line 107, in _create_in_db
    return _apply_updates(context, db_mapping, updates)
  File "/usr/lib/python2.7/site-packages/nova/objects/host_mapping.py", line 33, in _apply_updates
    db_mapping.save(context.session)
  File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/models.py", line 50, in save
    session.flush()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2243, in flush
    self._flush(objects)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2369, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2333, in _flush
    flush_context.execute()
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 391, in execute
    rec.execute(self)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 556, in execute
    uow
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 181, in save_obj
    mapper, table, insert)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 866, in _emit_insert_statements
    execute(statement, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
    return meth(self, multiparams, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
    context)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
    util.raise_from_cause(newraise, exc_info)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
    context)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
    result = self._query(query)
  File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 856, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1057, in _read_query_result
    result.read()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1340, in read
    first_packet = self.connection._read_packet()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1014, in _read_packet
    packet.check_error()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
    raise errorclass(errno, errval)
DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry 'compute01.domain.tld' for key 'uniq_host_mappings0host'")

Version-Release number of selected component (if applicable):
openstack-tripleo-common-containers-8.7.1-5.el7ost.noarch
openstack-tripleo-common-8.7.1-5.el7ost.noarch
openstack-tripleo-heat-templates-8.4.1-23.el7ost.noarch

How reproducible:
Random(?)

Steps to Reproduce:
1. Deploy the overcloud.

Actual results:
The overcloud deployment fails.

Expected results:
The overcloud deployment succeeds.

Additional info:
A sosreport from the Director node is available. If anything else is needed, please just ask.
The discovery task [1] is correctly delegated to a single node, but every compute node delegates the job to that same host, so discovery runs against it several times:

TASK [Discovering nova hosts] **************************************************
Thursday 06 February 2020 06:24:10 -0500 (0:00:00.694) 0:00:05.342 *****
ok: [192.168.24.11 -> 192.168.24.11]
ok: [192.168.24.10 -> 192.168.24.11]
ok: [192.168.24.17 -> 192.168.24.11]
ok: [192.168.24.14 -> 192.168.24.11]
ok: [192.168.24.7 -> 192.168.24.11]
ok: [192.168.24.6 -> 192.168.24.11]

We should run this task only once.

[1] https://github.com/openstack/tripleo-common/blob/stable/queens/playbooks/nova_cellv2_host_discover.yaml#L15
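
To illustrate why several overlapping discovery runs could end in a duplicate-entry error, here is a minimal sketch of a check-then-insert race against a unique constraint. It is not nova's code: the table layout, the function names and the use of an in-memory SQLite database in place of MySQL are all assumptions made for the example, and the interleaving is forced by hand rather than produced by real concurrency.

```python
import sqlite3


def make_db():
    # In-memory stand-in for the API database (assumption: the real one is MySQL).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE host_mappings ("
        "id INTEGER PRIMARY KEY, host TEXT NOT NULL UNIQUE)"
    )
    return conn


def unmapped_hosts(conn, services):
    # Check step: list the compute hosts that have no mapping row yet.
    mapped = {row[0] for row in conn.execute("SELECT host FROM host_mappings")}
    return [host for host in services if host not in mapped]


def create_mappings(conn, hosts):
    # Insert step: create one mapping row per host found by the check step.
    for host in hosts:
        conn.execute("INSERT INTO host_mappings (host) VALUES (?)", (host,))


def simulate_overlapping_runs(services, runs):
    # Hypothetical worst case: every run finishes its check before any run
    # has inserted, so all runs believe the same hosts are still unmapped.
    conn = make_db()
    pending = [unmapped_hosts(conn, services) for _ in range(runs)]
    failures = 0
    for hosts in pending:
        try:
            create_mappings(conn, hosts)
        except sqlite3.IntegrityError:
            # The unique constraint rejects the second and later inserts.
            failures += 1
    return failures


if __name__ == "__main__":
    services = ["compute01.domain.tld"]
    print("six overlapping runs:", simulate_overlapping_runs(services, 6), "failures")
    print("a single run:", simulate_overlapping_runs(services, 1), "failures")
```

In this model a single run never fails, which matches the suggestion to run the task only once. Real runs overlap only occasionally, which would be consistent with the failure appearing at random.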
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0760