+++ This bug was initially created as a clone of Bug #1636399 +++ Using this bug to track the deployment race aspect of this issue. +++ This bug was initially created as a clone of Bug #1630069 +++ Unable to launch instance after deployment. Gets stuck in BUILD. nova-compute shows: ResourceProviderSyncFailed: Failed to synchronize the placement service with resource provider information supplied by the compute host. Environment: openstack-nova-common-18.0.1-0.20180830232802.43f6f74.el7ost.noarch python2-novaclient-11.0.0-0.20180809174649.f1005ce.el7ost.noarch openstack-nova-migration-18.0.1-0.20180830232802.43f6f74.el7ost.noarch puppet-nova-13.3.1-0.20180831195236.ce0efbe.el7ost.noarch python-nova-18.0.1-0.20180830232802.43f6f74.el7ost.noarch openstack-nova-compute-18.0.1-0.20180830232802.43f6f74.el7ost.noarch openstack-tripleo-heat-templates-9.0.0-0.20180831204457.17bb71e.0rc1.el7ost instack-undercloud-9.3.1-0.20180831000258.e464799.el7ost.noarch Steps to reproduce: Deployed OC on BM setup. Try to launch an instance. Result: The instance gets stuck: (overcloud) [stack@undercloud ~]$ openstack server list +--------------------------------------+--------------+--------+----------+--------+---------+ | ID | Name | Status | Networks | Image | Flavor | +--------------------------------------+--------------+--------+----------+--------+---------+ | ecf1d7c4-8b03-4528-96b5-ddc3a21b2558 | after_deploy | BUILD | | cirros | m1.tiny | +--------------------------------------+--------------+--------+----------+--------+---------+ <h1>Internal Server Error</h1> <p>The server encountered an internal error or misconfiguration and was unable to complete your request.</p> <p>Please contact the server administrator at [no address given] to inform them of the time this error occurred, and the actions you performed just before this error.</p> <p>More information about this error may be available in the server error log.</p> </body></html> 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager [req-268c54e7-9f4f-459f-84f1-5c1f3ab41746 - - - - -] Error updating resources for node overcloud-compute-0.localdomain.: ResourceProvider SyncFailed: Failed to synchronize the placement service with resource provider information supplied by the compute host. 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager Traceback (most recent call last): 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7722, in _update_available_resource_for_node 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager rt.update_available_resource(context, nodename) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 703, in update_available_resource 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager self._update_available_resource(context, resources) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager return f(*args, **kwargs) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 726, in _update_available_resource 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager self._init_compute_node(context, resources) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 593, in _init_compute_node 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager self._update(context, cn) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager return Retrying(*dargs, **dkw).call(f, *args, **kw) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/retrying.py", line 223, in call 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager return attempt.get(self._wrap_exception) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager six.reraise(self.value[0], self.value[1], self.value[2]) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager attempt = Attempt(fn(*args, **kwargs), attempt_number, False) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 938, in _update 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager self._update_to_placement(context, compute_node) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 907, in _update_to_placement 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager reportclient.update_from_provider_tree(context, prov_tree) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager return getattr(self.instance, __name)(*args, **kwargs) 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 1535, in update_from_provider_tree 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager raise exception.ResourceProviderSyncFailed() 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager ResourceProviderSyncFailed: Failed to synchronize the placement service with resource provider information supplied by the compute host. 2018-09-17 19:47:18.271 1 ERROR nova.compute.manager Repeating error in /var/log/containers/httpd/nova-placement/placement_wsgi_error.log on controllers: [Mon Sep 17 19:37:25.276009 2018] [:error] [pid 16] [remote 10.19.94.28:80] mod_wsgi (pid=16): Exception occurred processing WSGI script '/var/www/cgi-bin/nova/nova-placement-api'. [Mon Sep 17 19:37:25.276033 2018] [:error] [pid 16] [remote 10.19.94.28:80] Traceback (most recent call last): [Mon Sep 17 19:37:25.276055 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/var/www/cgi-bin/nova/nova-placement-api", line 54, in <module> [Mon Sep 17 19:37:25.276092 2018] [:error] [pid 16] [remote 10.19.94.28:80] application = init_application() [Mon Sep 17 19:37:25.276105 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 102, in init_application [Mon Sep 17 19:37:25.276132 2018] [:error] [pid 16] [remote 10.19.94.28:80] _parse_args([], default_config_files=[conffile]) [Mon Sep 17 19:37:25.276144 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 62, in _parse_args [Mon Sep 17 19:37:25.276164 2018] [:error] [pid 16] [remote 10.19.94.28:80] logging.register_options(conf.CONF) [Mon Sep 17 19:37:25.276174 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/oslo_log/log.py", line 250, in register_options [Mon Sep 17 19:37:25.276195 2018] [:error] [pid 16] [remote 10.19.94.28:80] conf.register_cli_opts(_options.common_cli_opts) [Mon Sep 17 19:37:25.276209 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2501, in __inner [Mon Sep 17 19:37:25.276927 2018] [:error] [pid 16] [remote 10.19.94.28:80] result = f(self, *args, **kwargs) [Mon Sep 17 19:37:25.276949 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2783, in register_cli_opts [Mon Sep 17 19:37:25.276974 2018] [:error] [pid 16] [remote 10.19.94.28:80] self.register_cli_opt(opt, group, clear_cache=False) [Mon Sep 17 19:37:25.276985 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2505, in __inner [Mon Sep 17 19:37:25.277004 2018] [:error] [pid 16] [remote 10.19.94.28:80] return f(self, *args, **kwargs) [Mon Sep 17 19:37:25.277014 2018] [:error] [pid 16] [remote 10.19.94.28:80] File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2775, in register_cli_opt [Mon Sep 17 19:37:25.277032 2018] [:error] [pid 16] [remote 10.19.94.28:80] raise ArgsAlreadyParsedError("cannot register CLI option") [Mon Sep 17 19:37:25.277057 2018] [:error] [pid 16] [remote 10.19.94.28:80] ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option [Mon Sep 17 19:38:55.686870 2018] [:error] [pid 19] [remote 10.19.94.28:88] mod_wsgi (pid=19): Target WSGI script '/var/www/cgi-bin/nova/nova-placement-api' cannot be loaded as Python module. --- Additional comment from Martin Schuppert on 2018-10-25 14:23:22 EDT --- Since there are ongoing upstream RDO ci issues and we could not land [1] yet, we added a temporary down-stream only patch to unblock the situation: During overcloud deploy we could see on ctrl1 + 2 where the db_sync is not running something like: [root@controller-1 ~]# grep INFO:nova_wait_for_db_sync /var/log/messages Oct 25 13:32:48 controller-1 journal: INFO:nova_wait_for_db_sync:Nova API DB sync not yet finished. Migrationnumber DB/files (0/61) Oct 25 13:32:58 controller-1 journal: INFO:nova_wait_for_db_sync:Nova API DB sync not yet finished. Migrationnumber DB/files (13/61) Oct 25 13:33:08 controller-1 journal: INFO:nova_wait_for_db_sync:Nova API DB sync finished. Migration number 61 [root@controller-2 ~]# grep INFO:nova_wait_for_db_sync /var/log/messages Oct 25 13:32:50 controller-2 journal: INFO:nova_wait_for_db_sync:Nova API DB sync not yet finished. Migrationnumber DB/files (0/61) Oct 25 13:33:00 controller-2 journal: INFO:nova_wait_for_db_sync:Nova API DB sync not yet finished. Migrationnumber DB/files (35/61) Oct 25 13:33:10 controller-2 journal: INFO:nova_wait_for_db_sync:Nova API DB sync finished. Migration number 61 This BZ is to track the revert of the downstream-only patch