Bug 1549057

Summary: Overcloud Deployment failed: Exhausted all hosts available for retrying build failures for instance {UUID}., Code: 500"
Product: Red Hat OpenStack Reporter: Ronnie Rasouli <rrasouli>
Component: openstack-selinuxAssignee: Lon Hohberger <lhh>
Status: CLOSED DUPLICATE QA Contact: Udi Shkalim <ushkalim>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 13.0 (Queens)CC: hjensas, mgrepl, psedlak, skatlapa, srevivo, ykarel
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-02-27 13:16:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ronnie Rasouli 2018-02-26 10:20:03 UTC
Description of problem:

overcloud deployment failed hosts went on error state:

+--------------------------------------+--------------+--------+------------+-------------+----------+
| ID                                   | Name         | Status | Task State | Power State | Networks |
+--------------------------------------+--------------+--------+------------+-------------+----------+
| f8454595-ce6b-4f9b-99ca-b8a5c2f37a5a | ceph-0       | ERROR  | -          | NOSTATE     |          |
| ba8aeda1-eb02-418d-a195-d274466ae81e | ceph-1       | BUILD  | scheduling | NOSTATE     |          |
| 6c722474-4fc9-4bd6-b8bf-2eecb42f48cc | ceph-2       | BUILD  | scheduling | NOSTATE     |          |
| 430f449e-6dc7-4d16-8928-a9759871e1e2 | compute-0    | ERROR  | -          | NOSTATE     |          |
| f9f7136c-8128-41c8-a87e-c5caa6c95848 | controller-0 | ERROR  | -          | NOSTATE     |          |
| a49522f2-5f06-4aca-aa66-61da4cc5e84e | controller-1 | BUILD  | scheduling | NOSTATE     |          |
| 267c7689-402b-4d40-a89c-28e3cb8e471d | controller-2 | BUILD  | scheduling | NOSTATE     |          |
+--------------------------------------+--------------+--------+------------+-------------+----------+


| resource_status        | CREATE_FAILED                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| resource_status_reason | resources.CephStorage: Resource CREATE failed: ResourceInError: resources[0].resources.CephStorage: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 2ed9a042-36e2-4329-9d32-ff24d863a116., Code: 500"

Version-Release number of selected component (if applicable):

RHOS 13 core_puddle: 2018-02-23.1


How reproducible:


Steps to Reproduce:
1.
2.
3.

Deploy RHOS13 with infrared
infrared tripleo-overcloud                 -o overcloud-install.yml                 --version 13                 --deployment-files virt                 --overcloud-templates none                 --storage-backend ceph                 --overcloud-ssl True                 --network-backend vxlan                 --network-protocol ipv4                 --network-bgpvpn no                 --network-dvr no                 --network-l2gw no                  --storage-external no                 --overcloud-debug yes                 --introspect no                 --tagging no                 --deploy yes                 --post no                 --containers yes --registry-mirror rhos-qe-mirror-tlv.usersys.redhat.com:5000                                                                                                                                        



Actual results:

Overcloud deployment failed

Expected results:


Additional info:


Found a SELLINUX denied 
type=AVC msg=audit(1519634068.051:112): avc:  denied  { read } for  pid=2362 comm="inet_gethost" name="unix" dev="proc" ino=4026532003 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=system_u:object_r:proc_net_t:s0 tclass=file


SOS report located on: http://ikook.tlv.redhat.com/uploads/sosreport-rrasouli-20180226050818.tar.xz

Comment 1 Ronnie Rasouli 2018-02-26 10:39:14 UTC
From nova-scheduler.log on undercloud

2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db [req-40bf95ea-3d75-4118-aafb-c0b4a2882342 - - - - -] Unexpected error while reporting service status: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.1' ([Errno 111] ECONNREFUSED)") [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db Traceback (most recent call last):
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py", line 91, in _report_state
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     service.service_ref.save()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return fn(self, *args, **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/objects/service.py", line 382, in save
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     db_service = db.service_update(self._context, self.id, updates)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 189, in service_update
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return IMPL.service_update(context, service_id, values)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     ectxt.value = e.inner_exc
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
LECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)

2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db [req-40bf95ea-3d75-4118-aafb-c0b4a2882342 - - - - -] Unexpected error while reporting service status: DBConnectio
nError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.1' ([Errno 111] ECONNREFUSED)") [SQL: u'SELECT 1'] (Background on this error at: http:
//sqlalche.me/e/e3q8)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db Traceback (most recent call last):
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py", line 91, in _report_state
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     service.service_ref.save()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return fn(self, *args, **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/objects/service.py", line 382, in save
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     db_service = db.service_update(self._context, self.id, updates)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 189, in service_update
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return IMPL.service_update(context, service_id, values)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 147, in wrapper
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     ectxt.value = e.inner_exc
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     self.force_reraise()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     six.reraise(self.type_, self.value, self.tb)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 135, in wrapper
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return f(*args, **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 240, in wrapped
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return f(context, *args, **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 569, in service_update
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     service_ref = service_get(context, service_id)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 255, in wrapped
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return f(context, *args, **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 429, in service_get
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     result = query.first()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2778, in first
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     ret = list(self[0:1])
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2570, in __getitem__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return list(res)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2878, in __iter__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return self._execute_and_instances(context)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2899, in _execute_and_instances
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     close_with_result=True)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2908, in _get_bind_args
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     **kw
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2890, in _connection_from_session
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     conn = self.session.connection(**kw)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1035, in connection
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     execution_options=execution_options)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 1040, in _connection_for_bind
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     engine, execution_options)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 409, in _connection_for_bind
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     conn = bind.contextual_connect()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2125, in contextual_connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 110, in __init__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     self.dispatch.engine_connect(self, self.__branch)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/event/attr.py", line 256, in __call__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     fn(*args, **kw)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 83, in _connect_ping_listener
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     connection.scalar(select([1]))
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 880, in scalar
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return self.execute(object, *multiparams, **params).scalar()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return meth(self, multiparams, params)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return connection._execute_clauseelement(self, multiparams, params)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     compiled_sql, distilled_params
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1132, in _execute_context
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     None, None)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     util.raise_from_cause(newraise, exc_info)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     reraise(type(exception), exception, tb=exc_tb, cause=cause)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1125, in _execute_context
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     conn = self._revalidate_connection()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 429, in _revalidate_connection
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     self.__connection = self.engine.raw_connection(_connection=self)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2188, in raw_connection
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     self.pool.unique_connection, _connection)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2164, in _wrap_pool_connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     util.reraise(*sys.exc_info())
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return fn()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 345, in unique_connection
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return _ConnectionFairy._checkout(self)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 782, in _checkout
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     fairy = _ConnectionRecord.checkout(pool)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 537, in checkout
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     rec.checkin()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     compat.reraise(exc_type, exc_value, exc_tb)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 534, in checkout
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     dbapi_connection = rec.get_connection()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 623, in get_connection
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     self.__connect()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 667, in __connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     connection = pool._invoke_creator(self)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 105, in connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return dialect.connect(*cargs, **cparams)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 410, in connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return self.dbapi.connect(*cargs, **cparams)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/pymysql/__init__.py", line 90, in Connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     return Connection(*args, **kwargs)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 706, in __init__
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     self.connect()
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 963, in connect
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db     raise exc
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.1' ([Errno 111] ECONNREFUSED)") [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8)
2018-02-26 03:32:46.491 24683 ERROR nova.servicegroup.drivers.db
2018-02-26 03:32:51.470 24683 DEBUG oslo_concurrency.lockutils [req-40bf95ea-3d75-4118-aafb-c0b4a2882342 - - - - -] Acquired semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212
2018-02-26 03:32:51.471 24683 DEBUG oslo_concurrency.lockutils [req-40bf95ea-3d75-4118-aafb-c0b4a2882342 - - - - -] Releasing semaphore "singleton_lock" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:228
2018-02-26 03:34:37.485 1407 INFO oslo_service.periodic_task [-] Skipping periodic task _discover_hosts_in_cells because its interval is negative
2018-02-26 03:34:37.580 1407 WARNING oslo_config.cfg [-] Option "use_baremetal_filters" from group "filter_scheduler" is deprecated for removal (
These filters were used to overcome some of the baremetal scheduling
limitations in Nova prior to the use of the Placement API. Now scheduling will
use the custom resource class defined for each baremetal node to make its
selection.
).  Its value may be silently ignored in the future.
2018-02-26 03:34:37.670 1407 DEBUG oslo_db.sqlalchemy.engines [req-aa15d7f5-59cd-45c0-9152-2ab6255abf75 - - - - -] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,:

Comment 2 Pavel Sedlák 2018-02-27 09:23:23 UTC
Mysql issue likely unrelated, during service startups/restarts it's common to temporarily not have connection to mysql or amqp until they start and get ready too. (Notice also it was 'SELECT 1' - testing query.)

Aside that, this issue is reproducible in CI, in my case i've noticed:

nova/nova-conductor.log:
> 2018-02-26 13:55:13.195 12300 ERROR nova.scheduler.utils [req-9f5e0190-24bb-4f8e-ba5f-24ea05fbc975 694fbdfdb054454ba2015337186a9316 6dfa73a84b0e49639709cc25ae752d21 - default default] [instance: c97ed789-8e4f-4592-8d2b-1b699793fcfd] Error from last host: undercloud-0.redhat.local (node 66891d78-515d-4d16-b0d7-23dbbf1f1bdb): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1840, in _do_build_and_run_instance\n    filter_properties, request_spec)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2108, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance c97ed789-8e4f-4592-8d2b-1b699793fcfd was re-scheduled: Failed to provision instance c97ed789-8e4f-4592-8d2b-1b699793fcfd: Failed to prepare to deploy: Unable to set binding:host_id for neutron port 7f495ac7-1ef2-4b92-905a-342c6758a5ea. Error: Host 66891d78-515d-4d16-b0d7-23dbbf1f1bdb is not connected to any segments on routed provider network '464855f9-e35e-472a-a7e2-5e19e954f2b1'.  It should be connected to one.\nNeutron server returns request_ids: ['req-3a3fe0bf-c438-41af-ba21-b4d3efa6ca4a']\n"]
> 2018-02-26 13:55:13.196 12300 DEBUG nova.conductor.manager [req-9f5e0190-24bb-4f8e-ba5f-24ea05fbc975 694fbdfdb054454ba2015337186a9316 6dfa73a84b0e49639709cc25ae752d21 - default default] Rescheduling: True build_instances /usr/lib/python2.7/site-packages/nova/conductor/manager.py:570
> 2018-02-26 13:55:13.197 12300 WARNING nova.scheduler.utils [req-9f5e0190-24bb-4f8e-ba5f-24ea05fbc975 694fbdfdb054454ba2015337186a9316 6dfa73a84b0e49639709cc25ae752d21 - default default] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance c97ed789-8e4f-4592-8d2b-1b699793fcfd.: MaxRetriesExceeded: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance c97ed789-8e4f-4592-8d2b-1b699793fcfd.
> 2018-02-26 13:55:13.197 12300 WARNING nova.scheduler.utils [req-9f5e0190-24bb-4f8e-ba5f-24ea05fbc975 694fbdfdb054454ba2015337186a9316 6dfa73a84b0e49639709cc25ae752d21 - default default] [instance: c97ed789-8e4f-4592-8d2b-1b699793fcfd] Setting instance to ERROR state.: MaxRetriesExceeded: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance c97ed789-8e4f-4592-8d2b-1b699793fcfd.

ironic/ironic-conductor.log:
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat [req-6ac0470f-eeea-4cc6-bd7d-d005e4bc5600 aa1c6d98e9ba4c3bbd665bd73b135a73 d2ced46113564e9cba9e435b84705e9d - default default] Unable to set binding:host_id for neutron port 25a726dd-4ee3-43d4-bf0b-88a923fea077. Error: Host f36a3e49-b9ea-4199-ba01-d6f4c98aae12 is not connected to any segments on routed provider network '464855f9-e35e-472a-a7e2-5e19e954f2b1'.  It should be connected to one.
> Neutron server returns request_ids: ['req-63403b56-a96c-443e-9b0f-32da5b63c685']: Conflict: Host f36a3e49-b9ea-4199-ba01-d6f4c98aae12 is not connected to any segments on routed provider network '464855f9-e35e-472a-a7e2-5e19e954f2b1'.  It should be connected to one.
> Neutron server returns request_ids: ['req-63403b56-a96c-443e-9b0f-32da5b63c685']
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat Traceback (most recent call last):
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/network/flat.py", line 78, in add_provisioning_network
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     client.update_port(vif_port_id, body)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 799, in update_port
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     revision_number=revision_number)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 2375, in _update_resource
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     return self.put(path, **kwargs)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 363, in put
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     headers=headers, params=params)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 331, in retry_request
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     headers=headers, params=params)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 294, in do_request
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     self._handle_fault_response(status_code, replybody, resp)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 269, in _handle_fault_response
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     exception_handler_v20(status_code, error_body)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 93, in exception_handler_v20
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat     request_ids=request_ids)
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat Conflict: Host f36a3e49-b9ea-4199-ba01-d6f4c98aae12 is not connected to any segments on routed provider network '464855f9-e35e-472a-a7e2-5e19e954f2b1'.  It should be connected to one.
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat Neutron server returns request_ids: ['req-63403b56-a96c-443e-9b0f-32da5b63c685']
> 2018-02-26 13:51:44.319 1386 ERROR ironic.drivers.modules.network.flat 
> 2018-02-26 13:51:44.322 1386 ERROR ironic.conductor.manager [req-6ac0470f-eeea-4cc6-bd7d-d005e4bc5600 aa1c6d98e9ba4c3bbd665bd73b135a73 d2ced46113564e9cba9e435b84705e9d - default default] Error while preparing to deploy to node f36a3e49-b9ea-4199-ba01-d6f4c98aae12: Unable to set binding:host_id for neutron port 25a726dd-4ee3-43d4-bf0b-88a923fea077. Error: Host f36a3e49-b9ea-4199-ba01-d6f4c98aae12 is not connected to any segments on routed provider network '464855f9-e35e-472a-a7e2-5e19e954f2b1'.  It should be connected to one.
> Neutron server returns request_ids: ['req-63403b56-a96c-443e-9b0f-32da5b63c685']: NetworkError: Unable to set binding:host_id for neutron port 25a726dd-4ee3-43d4-bf0b-88a923fea077. Error: Host f36a3e49-b9ea-4199-ba01-d6f4c98aae12 is not connected to any segments on routed provider network '464855f9-e35e-472a-a7e2-5e19e954f2b1'.  It should be connected to one.
> 2018-02-26 13:51:44.322 1386 DEBUG ironic.common.states [req-6ac0470f-eeea-4cc6-bd7d-d005e4bc5600 aa1c6d98e9ba4c3bbd665bd73b135a73 d2ced46113564e9cba9e435b84705e9d - default default] Exiting old state 'deploying' in response to event 'fail' on_exit /usr/lib/python2.7/site-packages/ironic/common/states.py:253
> 2018-02-26 13:51:44.323 1386 DEBUG ironic.common.states [req-6ac0470f-eeea-4cc6-bd7d-d005e4bc5600 aa1c6d98e9ba4c3bbd665bd73b135a73 d2ced46113564e9cba9e435b84705e9d - default default] Entering new state 'deploy failed' in response to event 'fail' on_enter /usr/lib/python2.7/site-packages/ironic/common/states.py:259
> 2018-02-26 13:51:44.338 1386 INFO ironic.conductor.task_manager [req-6ac0470f-eeea-4cc6-bd7d-d005e4bc5600 aa1c6d98e9ba4c3bbd665bd73b135a73 d2ced46113564e9cba9e435b84705e9d - default default] Node f36a3e49-b9ea-4199-ba01-d6f4c98aae12 moved to provision state "deploy failed" from state "deploying"; target provision state is "active"
> 2018-02-26 13:51:44.354 1386 DEBUG ironic.conductor.task_manager [req-6ac0470f-eeea-4cc6-bd7d-d005e4bc5600 aa1c6d98e9ba4c3bbd665bd73b135a73 d2ced46113564e9cba9e435b84705e9d - default default] Successfully released exclusive lock for node deployment on node f36a3e49-b9ea-4199-ba01-d6f4c98aae12 (lock was held 2.38 sec) release_resources /usr/lib/python2.7/site-packages/ironic/conductor/task_manager.py:352

Comment 3 Pavel Sedlák 2018-02-27 09:56:26 UTC
openstack port list shows that most ports are DOWN
(in my case it was 3 node setup [1ctl1comp1ceph],
 i see three ports for corresponding nodes all in subnet of ctlplane)

> (undercloud) [stack@undercloud-0 ~]$ openstack port list
> +--------------------------------------+-------------------------------+-------------------+------------------------------------------------------------------------------+--------+
> | ID                                   | Name                          | MAC Address       | Fixed IP Addresses                                                           | Status |
> +--------------------------------------+-------------------------------+-------------------+------------------------------------------------------------------------------+--------+
> | 09269333-2891-484d-a5b3-97f21ec37f5e | internal_api_virtual_ip       | fa:16:3e:68:d6:f6 | ip_address='172.17.1.20', subnet_id='9e0b61c3-ab70-4836-97d0-f2f3d09c026d'   | DOWN   |
> | 288c8fe4-5e9a-49f6-ae93-5b81ece21862 | redis_virtual_ip              | fa:16:3e:c2:2d:01 | ip_address='172.17.1.11', subnet_id='9e0b61c3-ab70-4836-97d0-f2f3d09c026d'   | DOWN   |
> | 768112fa-4f62-4097-9e38-9cefaa4b9d32 | control_virtual_ip            | fa:16:3e:7b:24:2d | ip_address='192.168.24.14', subnet_id='3c8829db-d5c5-4e9e-802d-63c49668e760' | DOWN   |
> | 88bb90a8-b6bc-4859-ae75-738365b6c67f | Controller-port-0             | 52:54:00:e8:d5:07 | ip_address='192.168.24.8', subnet_id='3c8829db-d5c5-4e9e-802d-63c49668e760'  | DOWN   |
> | 9cccad9a-08dc-4b75-9185-4f631acae976 | CephStorage-port-0            | 52:54:00:14:45:44 | ip_address='192.168.24.10', subnet_id='3c8829db-d5c5-4e9e-802d-63c49668e760' | DOWN   |
> | 9e8816f0-e980-4d41-8afe-ec18505eb1d4 | public_virtual_ip             | fa:16:3e:76:7a:7b | ip_address='10.0.0.106', subnet_id='ca5495a8-ec2a-41b7-810e-4e5cb8088018'    | DOWN   |
> | 9f8a3149-0563-4aef-97d1-32e01084ab7f | storage_virtual_ip            | fa:16:3e:97:c5:25 | ip_address='172.17.3.15', subnet_id='b52f2efb-388e-43d8-a391-3dc963c1adf4'   | DOWN   |
> | cf658b03-5752-4904-9b4b-79b6177c2ffa | NovaCompute-port-0            | 52:54:00:db:08:c6 | ip_address='192.168.24.12', subnet_id='3c8829db-d5c5-4e9e-802d-63c49668e760' | DOWN   |
> | d8f279dd-3d34-4b99-8084-3f40e6f84f18 |                               | fa:16:3e:2b:1e:a6 | ip_address='192.168.24.5', subnet_id='3c8829db-d5c5-4e9e-802d-63c49668e760'  | ACTIVE |
> | f53119df-fa41-459d-9ac8-61d2f89218d5 | storage_management_virtual_ip | fa:16:3e:4c:f6:d6 | ip_address='172.17.4.11', subnet_id='8ba0b4c7-f293-41c7-9d9a-ad8308039bc3'   | DOWN   |
> +--------------------------------------+-------------------------------+-------------------+------------------------------------------------------------------------------+--------+

In neutron/server.log, grepping for controller's port id:
> ...
> 2018-02-26 12:07:56.677 2625 DEBUG neutron.db.provisioning_blocks [req-9fc57a59-3f88-45e9-8e45-81dcaf841271 - - - - -] Provisioning complete for port 88bb90a8-b6bc-4859-ae75-738365b6c67f triggered by entity DHCP. provisioning_complete /usr/lib/python2.7/site-packages/neutron/db/provisioning_blocks.py:138
> 2018-02-26 12:07:56.690 2621 INFO neutron.wsgi [req-05c51bc3-a788-486f-bf3e-9b5e3dfc2c67 1a3a332fb2594a40b4dd4f32f17e14bc 41fdfb8c456841a9a2a1b5884f7cc259 - default default] 192.168.24.1 "PUT /v2.0/ports/88bb90a8-b6bc-4859-ae75-738365b6c67f HTTP/1.1" status: 200  len: 1036 time: 0.4324431
>
> 2018-02-26 12:07:56.710 2625 DEBUG neutron.plugins.ml2.plugin [req-9fc57a59-3f88-45e9-8e45-81dcaf841271 - - - - -] Port 88bb90a8-b6bc-4859-ae75-738365b6c67f cannot update to ACTIVE because it is not bound. _port_provisioned /usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py:247
>
> 2018-02-26 12:07:57.145 2623 INFO neutron.wsgi [req-be4ac5c8-855d-4a31-9af7-9bf5e6497c65 f28805f99a094b6bb7330916420bd318 69b879cc46e6430c989e8ffd797ce7f2 - default default] 192.168.24.1 "GET /v2.0/ports/88bb90a8-b6bc-4859-ae75-738365b6c67f?fields=binding%3Aprofile HTTP/1.1" status: 200  len: 229 time: 0.0396399
> 2018-02-26 12:07:57.643 2623 DEBUG neutron.api.rpc.handlers.resources_rpc [req-4f9043d3-39e9-424a-910c-d2919ca70cc2 1a3a332fb2594a40b4dd4f32f17e14bc 41fdfb8c456841a9a2a1b5884f7cc259 - - -] Pushing event updated for resources: {'Port': ['ID=88bb90a8-b6bc-4859-ae75-738365b6c67f,revision_number=14']} push /usr/lib/python2.7/site-packages/neutron/api/rpc/handlers/resources_rpc.py:241
> ...
> # also this warning, no other ERROR messages aside amqp/mysql after service restart
> 2018-02-26 12:08:31.817 2621 WARNING neutron.pecan_wsgi.controllers.root [req-8f8762c4-8e5b-42af-8776-dae6d43e03c9 1a3a332fb2594a40b4dd4f32f17e14bc 41fdfb8c456841a9a2a1b5884f7cc259 - default default] No controller found for: floatingips - returning response code 404: PecanNotFound

Comment 5 Yatin Karel 2018-02-27 10:41:20 UTC
Few days back it's fixed in instack-undercloud: https://bugs.launchpad.net/tripleo/+bug/1749671. Currently used instack-undercloud(instack-undercloud-8.2.1-0.20180214022816.8a7609e.el7ost.noarch) in ospphase1 doesn't have the fixed commit as we don't have queens promotion from many days.

Comment 7 Harald Jensås 2018-02-27 13:16:50 UTC

*** This bug has been marked as a duplicate of bug 1549170 ***