Bug 1623613 - After reboot unable to start nova instance: ERROR nova.compute.manager ResourceProviderRetrievalFailed: Failed to get resource provider with UUID <uuid>
Keywords:
Status: CLOSED DUPLICATE of bug 1630069
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: Upstream M3
Assignee: OSP DFG:Compute
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-08-29 18:07 UTC by Alexander Chuzhoy
Modified: 2023-03-21 18:59 UTC
CC: 19 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-04 13:48:12 UTC
Target Upstream Version:
Embargoed:




Links:
Launchpad 1784155 (last updated 2018-10-01 09:57:49 UTC)

Description Alexander Chuzhoy 2018-08-29 18:07:25 UTC
After reboot unable to start nova instance: ERROR nova.compute.manager ResourceProviderRetrievalFailed: Failed to get resource provider with UUID <uuid>

Environment:
instack-undercloud-9.2.1-0.20180803181448.be5fa97.el7ost.noarch
python2-ironic-inspector-client-3.3.0-0.20180731012520.53bf4e8.el7ost.noarch
puppet-ironic-13.2.0-0.20180730212733.387fffa.el7ost.noarch
python2-ironicclient-2.5.0-0.20180731010529.fb94fb8.el7ost.noarch
puppet-nova-13.2.0-0.20180802212242.dd0cc66.el7ost.noarch
python2-novaclient-11.0.0-0.20180731010157.f1005ce.el7ost.noarch


Steps to reproduce:

1. Deploy the overcloud (OC) with ironic enabled.
2. Add nodes to the OC.
3. Run the nodes through cleaning.
4. Launch instances.
5. Remove the instances.
6. Reboot the setup.
7. Attempt to launch an instance (a rough CLI sketch of steps 4-7 follows below).
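
For reference, steps 4-7 roughly correspond to the commands below. This is only a sketch: the flavor and image names are taken from the server listing in the Result section, while the network name is a placeholder for whatever the deployment uses.

(overcloud) [stack@undercloud-0 ~]$ openstack server create --flavor baremetal --image overcloud-full --network <network> instance1
(overcloud) [stack@undercloud-0 ~]$ openstack server delete instance1
# reboot all overcloud nodes and wait for the control plane to come back, then:
(overcloud) [stack@undercloud-0 ~]$ openstack server create --flavor baremetal --image overcloud-full --network <network> instance1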

Result:
(overcloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+-----------+--------+----------+----------------+-----------+
| ID                                   | Name      | Status | Networks | Image          | Flavor    |
+--------------------------------------+-----------+--------+----------+----------------+-----------+
| 3bbe2cb1-bd0b-4fcb-afb2-22ed00f1bd4e | instance1 | ERROR  |          | overcloud-full | baremetal |
+--------------------------------------+-----------+--------+----------+----------------+-----------+



The nova-compute log shows:
2018-08-29 18:03:30.391 1 ERROR nova.scheduler.client.report [req-c838d765-d7ea-408b-a90a-88c6ec26691a - - - - -] [None] Failed to retrieve resource provider tree from placement API for UUID 5a0dd0a2-0d57-4828-b070-f65da86d922d. Got 500: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager [req-c838d765-d7ea-408b-a90a-88c6ec26691a - - - - -] Error updating resources for node 5a0dd0a2-0d57-4828-b070-f65da86d922d.: ResourceProviderRetrievalFailed: Failed to get resource provider with UUID 5a0dd0a2-0d57-4828-b070-f65da86d922d
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager Traceback (most recent call last):
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7717, in _update_available_resource_for_node
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 703, in update_available_resource
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     self._update_available_resource(context, resources)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     return f(*args, **kwargs)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 726, in _update_available_resource
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     self._init_compute_node(context, resources)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 564, in _init_compute_node
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     self._update(context, cn)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     return Retrying(*dargs, **dkw).call(f, *args, **kw)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 223, in call
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     return attempt.get(self._wrap_exception)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     six.reraise(self.value[0], self.value[1], self.value[2])
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 938, in _update
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     self._update_to_placement(context, compute_node)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 894, in _update_to_placement
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     context, compute_node.uuid, name=compute_node.hypervisor_hostname)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     return getattr(self.instance, __name)(*args, **kwargs)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 1006, in get_provider_tree_and_ensure_root
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     parent_provider_uuid=parent_provider_uuid)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 668, in _ensure_resource_provider
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     rps_to_refresh = self._get_providers_in_tree(context, uuid)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 74, in wrapper
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     return f(self, *a, **k)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 535, in _get_providers_in_tree
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager     raise exception.ResourceProviderRetrievalFailed(uuid=uuid)
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager ResourceProviderRetrievalFailed: Failed to get resource provider with UUID 5a0dd0a2-0d57-4828-b070-f65da86d922d
2018-08-29 18:03:30.392 1 ERROR nova.compute.manager 





The nova-scheduler log shows:
2018-08-29 17:25:01.826 1 ERROR nova Traceback (most recent call last):
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/bin/nova-scheduler", line 10, in <module>
2018-08-29 17:25:01.826 1 ERROR nova     sys.exit(main())
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/cmd/scheduler.py", line 46, in main
2018-08-29 17:25:01.826 1 ERROR nova     topic=scheduler_rpcapi.RPC_TOPIC)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 259, in create
2018-08-29 17:25:01.826 1 ERROR nova     periodic_interval_max=periodic_interval_max)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 128, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/scheduler/manager.py", line 63, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     invoke_on_load=True).driver
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/stevedore/driver.py", line 61, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     warn_on_missing_entrypoint=warn_on_missing_entrypoint
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 81, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     verify_requirements)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 203, in _load_plugins
2018-08-29 17:25:01.826 1 ERROR nova     self._on_load_failure_callback(self, ep, err)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 195, in _load_plugins
2018-08-29 17:25:01.826 1 ERROR nova     verify_requirements,
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 158, in _load_one_plugin
2018-08-29 17:25:01.826 1 ERROR nova     verify_requirements,
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 227, in _load_one_plugin
2018-08-29 17:25:01.826 1 ERROR nova     obj = plugin(*invoke_args, **invoke_kwds)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py", line 43, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     super(FilterScheduler, self).__init__(*args, **kwargs)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/scheduler/driver.py", line 42, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     self.host_manager = host_manager.HostManager()
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py", line 345, in __init__
2018-08-29 17:25:01.826 1 ERROR nova     self.refresh_cells_caches()
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/scheduler/host_manager.py", line 646, in refresh_cells_caches
2018-08-29 17:25:01.826 1 ERROR nova     temp_cells = objects.CellMappingList.get_all(context)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 184, in wrapper
2018-08-29 17:25:01.826 1 ERROR nova     result = fn(cls, context, *args, **kwargs)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/objects/cell_mapping.py", line 255, in get_all
2018-08-29 17:25:01.826 1 ERROR nova     db_mappings = cls._get_all_from_db(context)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 992, in wrapper
2018-08-29 17:25:01.826 1 ERROR nova     with self._transaction_scope(context):
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
2018-08-29 17:25:01.826 1 ERROR nova     return self.gen.next()
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 1042, in _transaction_scope
2018-08-29 17:25:01.826 1 ERROR nova     context=context) as resource:
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
2018-08-29 17:25:01.826 1 ERROR nova     return self.gen.next()
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 645, in _session
2018-08-29 17:25:01.826 1 ERROR nova     bind=self.connection, mode=self.mode)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 409, in _create_session
2018-08-29 17:25:01.826 1 ERROR nova     self._start()
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 496, in _start
2018-08-29 17:25:01.826 1 ERROR nova     engine_args, maker_args)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 520, in _setup_for_connection
2018-08-29 17:25:01.826 1 ERROR nova     sql_connection=sql_connection, **engine_kwargs)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/debtcollector/renames.py", line 43, in decorator
2018-08-29 17:25:01.826 1 ERROR nova     return wrapped(*args, **kwargs)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 202, in create_engine
2018-08-29 17:25:01.826 1 ERROR nova     test_conn = _test_connection(engine, max_retries, retry_interval)
2018-08-29 17:25:01.826 1 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py", line 388, in _test_connection
2018-08-29 17:25:01.826 1 ERROR nova     six.reraise(type(de_ref), de_ref)
2018-08-29 17:25:01.826 1 ERROR nova   File "<string>", line 2, in reraise
2018-08-29 17:25:01.826 1 ERROR nova DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-29 17:25:01.826 1 ERROR nova

Comment 2 Alexander Chuzhoy 2018-08-29 22:19:39 UTC
Reproduced the failure to launch an instance after reboot.
This time there was another error in the nova-compute log:
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [req-31a2c0cd-27ff-459d-8260-6696eedbd674 8680faae02f14a7388e8f73be7a6a0b4 08904f9f318e44e9800f328c03437553 - default default] [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d] Setting instance vm_state to ERROR: ResourceProviderSyncFailed: Failed to synchronize the placement service with resource provider information supplied by the compute host.
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d] Traceback (most recent call last):
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2660, in do_terminate_instance
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     self._delete_instance(context, instance, bdms)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/hooks.py", line 154, in inner
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     rv = f(*args, **kwargs)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2633, in _delete_instance
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     bdms)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 756, in _complete_deletion
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     self._update_resource_tracker(context, instance)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 564, in _update_resource_tracker
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     rt.update_usage(context, instance, instance.node)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     return f(*args, **kwargs)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 492, in update_usage
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     self._update(context.elevated(), self.compute_nodes[nodename])
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     return Retrying(*dargs, **dkw).call(f, *args, **kw)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/retrying.py", line 223, in call
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     return attempt.get(self._wrap_exception)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/retrying.py", line 261, in get
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     six.reraise(self.value[0], self.value[1], self.value[2])
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/retrying.py", line 217, in call
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 938, in _update
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     self._update_to_placement(context, compute_node)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 907, in _update_to_placement
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     reportclient.update_from_provider_tree(context, prov_tree)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     return getattr(self.instance, __name)(*args, **kwargs)
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 1531, in update_from_provider_tree
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]     raise exception.ResourceProviderSyncFailed()
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d] ResourceProviderSyncFailed: Failed to synchronize the placement service with resource provider information supplied by the compute host.
2018-08-29 21:55:58.944 1 ERROR nova.compute.manager [instance: b88aecd9-a91f-4750-a450-0ba1fa71688d]

Comment 4 Alexander Chuzhoy 2018-08-30 20:20:14 UTC
Grepping for errors in the placement log:


[root@controller-0 ~]# grep -i error /var/log/containers/nova/nova-placement-api.log 
2018-08-30 16:44:22.026 14 DEBUG nova.api.openstack.placement.wsgi [-] logging_exception_prefix       = %(asctime)s.%(msecs)03d %(process)d ERROR %(name)s %(instance)s log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 16:44:22.035 14 DEBUG nova.api.openstack.placement.wsgi [-] publish_errors                 = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 16:44:22.055 14 DEBUG nova.api.openstack.placement.wsgi [-] oslo_versionedobjects.fatal_exception_format_errors = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 16:44:22.208 14 DEBUG nova.api.openstack.placement.wsgi [-] profiler.filter_error_trace    = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 16:44:22.269 14 DEBUG oslo_db.sqlalchemy.engines [-] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2018-08-30 16:44:53.819 15 DEBUG nova.api.openstack.placement.wsgi [-] logging_exception_prefix       = %(asctime)s.%(msecs)03d %(process)d ERROR %(name)s %(instance)s log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 16:44:53.828 15 DEBUG nova.api.openstack.placement.wsgi [-] publish_errors                 = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 16:44:53.847 15 DEBUG nova.api.openstack.placement.wsgi [-] oslo_versionedobjects.fatal_exception_format_errors = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 16:44:53.984 15 DEBUG nova.api.openstack.placement.wsgi [-] profiler.filter_error_trace    = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 16:44:54.039 15 DEBUG oslo_db.sqlalchemy.engines [-] MySQL server mode set to STRICT_TRANS_TABLES,STRICT_ALL_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,TRADITIONAL,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION _check_effective_sql_mode /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/engines.py:308
2018-08-30 17:02:33.055 15 DEBUG nova.api.openstack.placement.wsgi_wrapper [req-cec08d9d-efc4-4aa1-a480-a6b61465cac4 cb722cd9f95d49f5bfe8d1dace4e7be7 28dc3c7b75cc400287eaa9fac0a3bb08 - default default] Placement API returning an error response: Unable to update inventory for resource provider b73f4434-e4b8-4f33-a8f0-0e522065d5f3: Invalid inventory for 'VCPU' on resource provider 'b73f4434-e4b8-4f33-a8f0-0e522065d5f3'. The reserved value is greater than or equal to total. call_func /usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi_wrapper.py:31
2018-08-30 17:02:33.236 15 DEBUG nova.api.openstack.placement.wsgi_wrapper [req-f9139d9c-f767-4268-adb3-a2325832f542 cb722cd9f95d49f5bfe8d1dace4e7be7 28dc3c7b75cc400287eaa9fac0a3bb08 - default default] Placement API returning an error response: Unable to update inventory for resource provider b1423ed9-eecc-4853-86e9-2dcc77ab21fd: Invalid inventory for 'VCPU' on resource provider 'b1423ed9-eecc-4853-86e9-2dcc77ab21fd'. The reserved value is greater than or equal to total. call_func /usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi_wrapper.py:31
2018-08-30 17:14:45.335 15 DEBUG nova.api.openstack.placement.wsgi_wrapper [req-d8c27eec-2907-4acb-a6dc-83a6f4befce5 cb722cd9f95d49f5bfe8d1dace4e7be7 28dc3c7b75cc400287eaa9fac0a3bb08 - default default] Placement API returning an error response: Unable to update inventory for resource provider b1423ed9-eecc-4853-86e9-2dcc77ab21fd: Invalid inventory for 'VCPU' on resource provider 'b1423ed9-eecc-4853-86e9-2dcc77ab21fd'. The reserved value is greater than or equal to total. call_func /usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi_wrapper.py:31
2018-08-30 17:15:45.157 14 DEBUG nova.api.openstack.placement.wsgi_wrapper [req-04f23fa8-db83-4bf9-925c-cb722903251a cb722cd9f95d49f5bfe8d1dace4e7be7 28dc3c7b75cc400287eaa9fac0a3bb08 - default default] Placement API returning an error response: Unable to update inventory for resource provider b1423ed9-eecc-4853-86e9-2dcc77ab21fd: Invalid inventory for 'VCPU' on resource provider 'b1423ed9-eecc-4853-86e9-2dcc77ab21fd'. The reserved value is greater than or equal to total. call_func /usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi_wrapper.py:31
2018-08-30 17:25:40.669 15 DEBUG nova.api.openstack.placement.wsgi [-] logging_exception_prefix       = %(asctime)s.%(msecs)03d %(process)d ERROR %(name)s %(instance)s log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 17:25:40.680 15 DEBUG nova.api.openstack.placement.wsgi [-] publish_errors                 = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 17:25:40.703 15 DEBUG nova.api.openstack.placement.wsgi [-] oslo_versionedobjects.fatal_exception_format_errors = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 17:25:40.887 15 DEBUG nova.api.openstack.placement.wsgi [-] profiler.filter_error_trace    = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 17:25:41.771 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 10 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 113] No route to host)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:25:54.780 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 9 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 113] No route to host)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:04.786 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 8 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:14.797 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 7 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:21.043 14 DEBUG nova.api.openstack.placement.wsgi [-] logging_exception_prefix       = %(asctime)s.%(msecs)03d %(process)d ERROR %(name)s %(instance)s log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 17:26:21.061 14 DEBUG nova.api.openstack.placement.wsgi [-] publish_errors                 = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3023
2018-08-30 17:26:21.081 14 DEBUG nova.api.openstack.placement.wsgi [-] oslo_versionedobjects.fatal_exception_format_errors = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 17:26:21.258 14 DEBUG nova.api.openstack.placement.wsgi [-] profiler.filter_error_trace    = False log_opt_values /usr/lib/python2.7/site-packages/oslo_config/cfg.py:3031
2018-08-30 17:26:21.306 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 10 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:24.809 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 6 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:31.326 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 9 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:34.821 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 5 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:41.344 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 8 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:44.832 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 4 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:51.356 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 7 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:26:54.843 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 3 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:01.367 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 6 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:04.853 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 2 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:11.378 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 5 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:14.860 15 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 1 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:21.390 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 4 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:31.402 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 3 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:41.410 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 2 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
2018-08-30 17:27:51.421 14 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. 1 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 111] Connection refused)") (Background on this error at: http://sqlalche.me/e/e3q8)
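
The "reserved value is greater than or equal to total" inventory messages above can be checked directly against placement. A minimal sketch, assuming the osc-placement client plugin is installed, using the provider UUID from the log lines above:

(overcloud) [stack@undercloud-0 ~]$ openstack resource provider list
(overcloud) [stack@undercloud-0 ~]$ openstack resource provider inventory list b1423ed9-eecc-4853-86e9-2dcc77ab21fd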

Comment 5 Sylvain Bauza 2018-08-31 14:09:28 UTC
Looking at the logs, this doesn't appear to be a bug, but rather a deployment issue where the DB isn't running.

Please check your deployment and test the API DB directly; this is not something that requires a change in Nova. That's why I'm closing this bug now.
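
A quick way to verify that the DB side actually came back after the reboot (a sketch: the galera resource name is the default bundle name, 172.17.1.12 is the MySQL VIP taken from the placement log in comment 4, and nc may need to be installed):

[root@controller-0 ~]# pcs status | grep -i -A2 galera
[root@controller-0 ~]# nc -zv 172.17.1.12 3306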

Comment 6 Marian Krcmarik 2018-09-04 13:54:33 UTC
I observed a similar problem. It seems that nova-placement does not work correctly after a control plane reset (resetting all the nodes) once the nodes come back online.
I have not had time to debug further due to other issues, but I did notice a traceback being thrown on all controllers in nova-placement/placement_wsgi_error.log:
    [Mon Sep 03 14:50:34.144509 2018] [:error] [pid 15] [remote 172.17.1.17:144] mod_wsgi (pid=15): Exception occurred processing WSGI script '/var/www/cgi-bin/nova/nova-placement-api'.
    [Mon Sep 03 14:50:34.144529 2018] [:error] [pid 15] [remote 172.17.1.17:144] Traceback (most recent call last):
    [Mon Sep 03 14:50:34.144560 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/var/www/cgi-bin/nova/nova-placement-api", line 54, in <module>
    [Mon Sep 03 14:50:34.144597 2018] [:error] [pid 15] [remote 172.17.1.17:144]     application = init_application()
    [Mon Sep 03 14:50:34.144607 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 97, in init_application
    [Mon Sep 03 14:50:34.144666 2018] [:error] [pid 15] [remote 172.17.1.17:144]     _parse_args([], default_config_files=[conffile])
    [Mon Sep 03 14:50:34.144678 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 61, in _parse_args
    [Mon Sep 03 14:50:34.144695 2018] [:error] [pid 15] [remote 172.17.1.17:144]     logging.register_options(conf.CONF)
    [Mon Sep 03 14:50:34.144703 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/oslo_log/log.py", line 250, in register_options
    [Mon Sep 03 14:50:34.144719 2018] [:error] [pid 15] [remote 172.17.1.17:144]     conf.register_cli_opts(_options.common_cli_opts)
    [Mon Sep 03 14:50:34.144727 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2501, in __inner
    [Mon Sep 03 14:50:34.144741 2018] [:error] [pid 15] [remote 172.17.1.17:144]     result = f(self, *args, **kwargs)
    [Mon Sep 03 14:50:34.144748 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2783, in register_cli_opts
    [Mon Sep 03 14:50:34.144761 2018] [:error] [pid 15] [remote 172.17.1.17:144]     self.register_cli_opt(opt, group, clear_cache=False)
    [Mon Sep 03 14:50:34.144768 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2505, in __inner
    [Mon Sep 03 14:50:34.144779 2018] [:error] [pid 15] [remote 172.17.1.17:144]     return f(self, *args, **kwargs)
    [Mon Sep 03 14:50:34.144786 2018] [:error] [pid 15] [remote 172.17.1.17:144]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2775, in register_cli_opt
    [Mon Sep 03 14:50:34.144802 2018] [:error] [pid 15] [remote 172.17.1.17:144]     raise ArgsAlreadyParsedError("cannot register CLI option")
    [Mon Sep 03 14:50:34.144833 2018] [:error] [pid 15] [remote 172.17.1.17:144] ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option

Comment 7 Michele Baldessari 2018-09-06 15:03:18 UTC
I think this bug was closed based on comment #4, which contained a number of possibly misleading references to the DB being unreachable. The problem here is not about erroring out while the DB is down; it is about being able to recover automatically after a control-plane/DB outage.

How to reproduce:
Simply take down the control plane by powering off all three controllers, wait a couple of minutes, and power them back on. Or, even more simply, take down the galera DB with 'pcs resource disable galera-bundle', make sure all nodes are down, wait a few minutes for nova_placement to detect the missing DB connection, and then re-enable galera with 'pcs resource enable galera-bundle'. A spelled-out sketch of the pcs variant follows.
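
[root@controller-0 ~]# pcs resource disable galera-bundle
[root@controller-0 ~]# pcs status | grep -A3 galera        # wait until all galera instances are reported as stopped
# wait a few minutes so nova_placement notices the missing DB connection, then:
[root@controller-0 ~]# pcs resource enable galera-bundle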

You will observe that all other services eventually recover. As a matter of fact, all services have been mandated to recover and retry connecting to the control plane since OSP 10 (see https://github.com/openstack/tripleo-specs/blob/master/specs/newton/pacemaker-next-generation-architecture.rst for more information).

One of the observed sequences in the logs is the following:
1) The control plane is gone (reboot/reset/db maintenance/whatever) and we correctly observe the following message. So far this is all expected and correct:
[Fri Aug 31 15:19:05.987530 2018] [:error] [pid 22] [remote 172.17.1.16:0] mod_wsgi (pid=22): Exception occurred processing WSGI script '/var/www/cgi-bin/nova/nova-placement-api'.
[Fri Aug 31 15:19:05.987577 2018] [:error] [pid 22] [remote 172.17.1.16:0] Traceback (most recent call last):
[Fri Aug 31 15:19:05.987605 2018] [:error] [pid 22] [remote 172.17.1.16:0]   File "/var/www/cgi-bin/nova/nova-placement-api", line 54, in <module>
...
[Fri Aug 31 15:19:05.990710 2018] [:error] [pid 22] [remote 172.17.1.16:0] DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') (Background on this error at: http://sqlalche.me/e/e3q8)


2) The problem comes from the fact that once the control plane is fully functional again, all OpenStack services manage to come back and reconnect to the control plane (DB/rabbit), whereas nova placement seems to be stuck in a WSGI loop forever:
[Fri Aug 31 15:21:57.561423 2018] [:error] [pid 20] [remote 172.17.1.16:176] mod_wsgi (pid=20): Target WSGI script '/var/www/cgi-bin/nova/nova-placement-api' cannot be loaded as Python module.
[Fri Aug 31 15:21:57.561532 2018] [:error] [pid 20] [remote 172.17.1.16:176] mod_wsgi (pid=20): Exception occurred processing WSGI script '/var/www/cgi-bin/nova/nova-placement-api'.
[Fri Aug 31 15:21:57.561570 2018] [:error] [pid 20] [remote 172.17.1.16:176] Traceback (most recent call last):
[Fri Aug 31 15:21:57.561601 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/var/www/cgi-bin/nova/nova-placement-api", line 54, in <module>
[Fri Aug 31 15:21:57.561656 2018] [:error] [pid 20] [remote 172.17.1.16:176]     application = init_application()
[Fri Aug 31 15:21:57.561682 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 97, in init_application
[Fri Aug 31 15:21:57.561732 2018] [:error] [pid 20] [remote 172.17.1.16:176]     _parse_args([], default_config_files=[conffile])
[Fri Aug 31 15:21:57.561751 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/nova/api/openstack/placement/wsgi.py", line 61, in _parse_args
[Fri Aug 31 15:21:57.561786 2018] [:error] [pid 20] [remote 172.17.1.16:176]     logging.register_options(conf.CONF)
[Fri Aug 31 15:21:57.561814 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/oslo_log/log.py", line 250, in register_options
[Fri Aug 31 15:21:57.561857 2018] [:error] [pid 20] [remote 172.17.1.16:176]     conf.register_cli_opts(_options.common_cli_opts)
[Fri Aug 31 15:21:57.561891 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2501, in __inner
[Fri Aug 31 15:21:57.563173 2018] [:error] [pid 20] [remote 172.17.1.16:176]     result = f(self, *args, **kwargs)
[Fri Aug 31 15:21:57.563206 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2783, in register_cli_opts
[Fri Aug 31 15:21:57.563249 2018] [:error] [pid 20] [remote 172.17.1.16:176]     self.register_cli_opt(opt, group, clear_cache=False)
[Fri Aug 31 15:21:57.563267 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2505, in __inner
[Fri Aug 31 15:21:57.563297 2018] [:error] [pid 20] [remote 172.17.1.16:176]     return f(self, *args, **kwargs)
[Fri Aug 31 15:21:57.563313 2018] [:error] [pid 20] [remote 172.17.1.16:176]   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 2775, in register_cli_opt
[Fri Aug 31 15:21:57.563341 2018] [:error] [pid 20] [remote 172.17.1.16:176]     raise ArgsAlreadyParsedError("cannot register CLI option")
[Fri Aug 31 15:21:57.563377 2018] [:error] [pid 20] [remote 172.17.1.16:176] ArgsAlreadyParsedError: arguments already parsed: cannot register CLI option

Restarting the nova_placement container fixes it, but this requires manual intervention, which is (a) a regression from previous OSP versions and (b) unwanted/unneeded. The nova part of the problem has also been captured in https://bugs.launchpad.net/tripleo/+bug/1784155, but I am under the impression that the simple control-plane restart condition has not yet been captured there.
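
For the record, the manual workaround amounts to the following (a sketch; on this OSP 14 setup the placement service runs in a docker container named nova_placement on each controller, so adjust the container name/runtime if it differs):

[root@controller-0 ~]# docker restart nova_placement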

A small, likely unclean and unacceptable fix (which Marian and I tested just to understand the issue) would be to simply ignore the ArgsAlreadyParsedError exception in the _parse_args() function. Ideally, though, we'd split the config-reading code out of the WSGI loop.
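
For illustration only, the tested hack amounts to something like the following in nova/api/openstack/placement/wsgi.py. This is a sketch based on the traceback above, not a proposed patch; everything beyond the try/except around register_options() is assumed.

    # nova/api/openstack/placement/wsgi.py (sketch of the tested hack)
    from oslo_config import cfg
    from oslo_log import log as logging

    from nova import conf  # provides the global conf.CONF object


    def _parse_args(argv, default_config_files):
        try:
            logging.register_options(conf.CONF)
        except cfg.ArgsAlreadyParsedError:
            # CLI options were already registered by a previous, failed
            # init_application() attempt in this interpreter: ignore and
            # reuse the existing registration instead of blowing up.
            pass
        # ... the rest of _parse_args() (loading the placement config file
        # into conf.CONF) stays unchanged ...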

Comment 15 Martin Schuppert 2018-10-04 13:48:12 UTC
As discussed with Lee, closing this one as a duplicate of bug 1630069. We'll track the mentioned deployment race in a new BZ to avoid confusion.

*** This bug has been marked as a duplicate of bug 1630069 ***

