openstack-ironic: OSP12 overcloud deployment ends with status ERROR in nova list; the nodes weren't powered on by ironic.

Environment:
openstack-puppet-modules-10.0.0-0.20170315222135.0333c73.el7.1.noarch
openstack-ironic-common-8.0.1-0.20170719072039.d9983f1.el7ost.noarch
puppet-ironic-11.2.0-0.20170718122857.e15b0fd.el7ost.noarch
python-ironic-inspector-client-1.12.0-0.20170331023022.045fd9e.el7ost.noarch
openstack-tripleo-heat-templates-7.0.0-0.20170718190543.el7ost.noarch
instack-undercloud-7.1.1-0.20170714211622.el7ost.noarch
python-ironic-lib-2.8.0-0.20170621200136.56e7619.el7ost.noarch
openstack-ironic-conductor-8.0.1-0.20170719072039.d9983f1.el7ost.noarch
python-ironicclient-1.14.0-0.20170714004320.bd7ccdd.el7ost.noarch
openstack-ironic-inspector-5.1.1-0.20170705203602.c38596e.el7ost.noarch
openstack-ironic-api-8.0.1-0.20170719072039.d9983f1.el7ost.noarch
python-virtualbmc-1.0.1-0.20170307140838.ef2c881.el7.noarch

Steps to reproduce:
Attempt to deploy the OSP12 overcloud (OC) using the remote registry for the container images (without uploading the images to the local registry first).

Result:
The deployment fails:

CREATE_COMPLETE state changed
2017-07-25 18:34:40Z [overcloud.Controller.0.NodeAdminUserData]: CREATE_COMPLETE state changed
2017-07-25 18:34:40Z [overcloud.Controller.0.NodeUserData]: CREATE_COMPLETE state changed
2017-07-25 18:34:40Z [overcloud.Controller.0.UserData]: CREATE_IN_PROGRESS state changed
2017-07-25 18:34:41Z [overcloud.Controller.0.UserData]: CREATE_COMPLETE state changed
2017-07-25 18:34:41Z [overcloud.Controller.0.Controller]: CREATE_IN_PROGRESS state changed
2017-07-25 18:34:44Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:34:44Z [overcloud.Compute.0.NovaCompute]: DELETE_IN_PROGRESS state changed
2017-07-25 18:34:45Z [overcloud.Controller.0.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:34:45Z [overcloud.Controller.0.Controller]: DELETE_IN_PROGRESS state changed
2017-07-25 18:34:46Z [overcloud.Compute.0.NovaCompute]: DELETE_COMPLETE state changed
2017-07-25 18:34:46Z [overcloud.Controller.0.Controller]: DELETE_COMPLETE state changed
2017-07-25 18:34:49Z [overcloud.Controller.0.Controller]: CREATE_IN_PROGRESS state changed
2017-07-25 18:34:50Z [overcloud.Compute.0.NovaCompute]: CREATE_IN_PROGRESS state changed
2017-07-25 18:34:51Z [overcloud.Controller.0.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:34:51Z [overcloud.Controller.0.Controller]: DELETE_IN_PROGRESS state changed
2017-07-25 18:34:51Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:34:51Z [overcloud.Compute.0.NovaCompute]: DELETE_IN_PROGRESS state changed
2017-07-25 18:34:52Z [overcloud.Controller.0.Controller]: DELETE_COMPLETE state changed
2017-07-25 18:34:53Z [overcloud.Compute.0.NovaCompute]: DELETE_COMPLETE state changed
2017-07-25 18:34:57Z [overcloud.Controller.0.Controller]: CREATE_IN_PROGRESS state changed
2017-07-25 18:34:58Z [overcloud.Compute.0.NovaCompute]: CREATE_IN_PROGRESS state changed
2017-07-25 18:34:59Z [overcloud.Controller.0.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:34:59Z [overcloud.Controller.0.Controller]: DELETE_IN_PROGRESS state changed
2017-07-25 18:35:00Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:35:00Z [overcloud.Compute.0.NovaCompute]: DELETE_IN_PROGRESS state changed
2017-07-25 18:35:01Z [overcloud.Controller.0.Controller]: DELETE_COMPLETE state changed
2017-07-25 18:35:01Z [overcloud.Compute.0.NovaCompute]: DELETE_COMPLETE state changed
2017-07-25 18:35:11Z [overcloud.Controller.0.Controller]: CREATE_IN_PROGRESS state changed
2017-07-25 18:35:11Z [overcloud.Compute.0.NovaCompute]: CREATE_IN_PROGRESS state changed
2017-07-25 18:35:12Z [overcloud.Controller.0.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:35:12Z [overcloud.Controller.0.Controller]: DELETE_IN_PROGRESS state changed
2017-07-25 18:35:13Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:35:13Z [overcloud.Compute.0.NovaCompute]: DELETE_IN_PROGRESS state changed
2017-07-25 18:35:14Z [overcloud.Controller.0.Controller]: DELETE_COMPLETE state changed
2017-07-25 18:35:15Z [overcloud.Compute.0.NovaCompute]: DELETE_COMPLETE state changed
2017-07-25 18:35:32Z [overcloud.Controller.0.Controller]: CREATE_IN_PROGRESS state changed
2017-07-25 18:35:33Z [overcloud.Compute.0.NovaCompute]: CREATE_IN_PROGRESS state changed
2017-07-25 18:35:34Z [overcloud.Controller.0.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:35:34Z [overcloud.Controller.0.Controller]: DELETE_IN_PROGRESS state changed
2017-07-25 18:35:34Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:35:34Z [overcloud.Compute.0.NovaCompute]: DELETE_IN_PROGRESS state changed
2017-07-25 18:35:36Z [overcloud.Controller.0.Controller]: DELETE_COMPLETE state changed
2017-07-25 18:35:36Z [overcloud.Compute.0.NovaCompute]: DELETE_COMPLETE state changed
2017-07-25 18:36:09Z [overcloud.Controller.0.Controller]: CREATE_IN_PROGRESS state changed
2017-07-25 18:36:09Z [overcloud.Compute.0.NovaCompute]: CREATE_IN_PROGRESS state changed
2017-07-25 18:36:10Z [overcloud.Controller.0.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:11Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:11Z [overcloud.Controller.0]: CREATE_FAILED Resource CREATE failed: ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:11Z [overcloud.Controller.0]: CREATE_FAILED ResourceInError: resources[0].resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:11Z [overcloud.Controller]: UPDATE_FAILED ResourceInError: resources[0].resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:12Z [overcloud.Compute.0]: CREATE_FAILED Resource CREATE failed: ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:12Z [overcloud.Controller]: CREATE_FAILED resources.Controller: ResourceInError: resources[0].resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:12Z [overcloud.Compute]: CREATE_FAILED CREATE aborted
2017-07-25 18:36:12Z [overcloud]: CREATE_FAILED Resource CREATE failed: resources.Controller: ResourceInError: resources[0].resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
2017-07-25 18:36:13Z [overcloud.Compute.0]: CREATE_FAILED CREATE aborted
2017-07-25 18:36:13Z [overcloud.Compute]: UPDATE_FAILED Operation cancelled

 Stack overcloud CREATE_FAILED 

overcloud.Controller.0.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: 98b6a18c-1435-4305-b019-94bca4c2985f
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"
overcloud.Compute.0.NovaCompute:
  resource_type: OS::TripleO::ComputeServer
  physical_resource_id: 7be74e74-9602-4f74-a65a-c41ba216b42e
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500"

Checking the status of the overcloud instances in nova:

(undercloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+----------+
| ID                                   | Name                   | Status | Task State | Power State | Networks |
+--------------------------------------+------------------------+--------+------------+-------------+----------+
| 7be74e74-9602-4f74-a65a-c41ba216b42e | overcloud-compute-0    | ERROR  | -          | NOSTATE     |          |
| 98b6a18c-1435-4305-b019-94bca4c2985f | overcloud-controller-0 | ERROR  | -          | NOSTATE     |          |
+--------------------------------------+------------------------+--------+------------+-------------+----------+

The nodes in ironic remained powered off:

(undercloud) [stack@undercloud-0 ~]$ ironic node-list
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
| dbed9827-9525-4590-8889-4f092a1304fb | compute-0    | None          | power off   | available          | False       |
| 4394beec-0ea5-4847-b66b-4e62ebabc819 | controller-0 | None          | power off   | available          | False       |
+--------------------------------------+--------------+---------------+-------------+--------------------+-------------+

Notes:
1) Simply re-deploying on the same setup without any changes works fine.
2) Waiting for 10 minutes right before starting the OC deployment also works fine. This explains why we didn't see the issue with the local registry, as it takes time to upload the images.
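Given note 2, a pre-deploy check along the lines below would show whether the undercloud compute service has already registered the ironic nodes before the overcloud deploy is started. This is only a sketch using standard client commands; it is not part of the original reproduction steps:

source ~/stackrc
openstack hypervisor stats show   # "count" should roughly match the number of registered ironic nodes
ironic node-list                  # nodes should be "available" and not in maintenance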
Created attachment 1304421 [details] sosreports from undercloud
Hi Sasha - sounds like this is a remote registry issue, given that it works fine when you delay before starting the deployment? Is it related to this bug - https://bugzilla.redhat.com/show_bug.cgi?id=1456986?

I'm seeing quite a few swift errors with Permission denied, not sure if that is related:

Jul 25 14:35:02 undercloud-0 object-server: Exception dumping recon cache:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 3108, in dump_recon_cache
    with lock_file(cache_file, lock_timeout, unlink=False) as cf:
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2294, in lock_file
    fd = os.open(filename, flags)
OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'

Jul 25 14:45:02 undercloud-0 object-server: Exception dumping recon cache:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 3108, in dump_recon_cache
    with lock_file(cache_file, lock_timeout, unlink=False) as cf:
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2294, in lock_file
    fd = os.open(filename, flags)
OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'
The thing is that the same remote registry is used after the 10-minute wait, so I don't see how it's related. With the local registry, though, it takes something like 10 minutes to upload the images, and maybe that's why we don't see the issue there.
Thanks Sasha. I'd like to have the nova team take a look at this. There's really not anything Ironic can do if these images aren't available. Compute DFG - please send this back if this is deemed hardware-provisioning related, but the timing difference between local and remote registries makes it seem like it's not. Thanks.
According to the nova-scheduler logs:

...sts for the request with instance ID '53499109-aa69-4650-8728-8638503104c3'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'TripleOCapabilitiesFilter: (start: 1, end: 1)', 'ComputeCapabilitiesFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'RamFilter: (start: 1, end: 1)', 'DiskFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 0)']

The 'ComputeFilter' is returning 0 hosts. I'm not a specialist in the scheduler filters, but that one seems to indicate whether the compute node is enabled or disabled.

In nova-compute.log we can see this error:

No compute node record for host undercloud-0.redhat.local: ComputeHostNotFound_Remote: Compute host undercloud-0.redhat.local could not be found.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 123, in _object_dispatch
    return getattr(target, method)(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 184, in wrapper
    result = fn(cls, context, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/objects/compute_node.py", line 437, in get_all_by_host
    use_slave=use_slave)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 235, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/objects/compute_node.py", line 432, in _db_compute_node_get_all_by_host
    return db.compute_node_get_all_by_host(context, host)
  File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 298, in compute_node_get_all_by_host
    return IMPL.compute_node_get_all_by_host(context, host)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 280, in wrapped
    return f(context, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 713, in compute_node_get_all_by_host
    raise exception.ComputeHostNotFound(host=host)
ComputeHostNotFound: Compute host undercloud-0.redhat.local could not be found.

It's not clear to me what that means... Let's see if the team has an idea.
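Since the ComputeFilter is eliminating the only host and the traceback above says there is no compute node record for undercloud-0, a quick check along these lines would confirm whether the undercloud's nova-compute service is enabled/up and whether a compute node record exists. These are standard openstackclient commands, shown here only as a hedged sketch (assuming the undercloud stackrc is sourced), not output taken from this report:

source ~/stackrc
openstack compute service list   # nova-compute on the undercloud should be "enabled" with state "up"
openstack hypervisor list        # one compute node record per registered ironic node should show up here

If the service is up but the hypervisor list is still empty, both would normally be populated shortly after the undercloud install finishes, which may be why waiting ~10 minutes avoids the problem.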
(In reply to Bob Fournier from comment #2)
> Hi Sasha - sounds like this is a remote registry issue if it works fine if
> delaying before starting deployment?

I'm assuming that is a remote *docker* registry, which is not involved in the overcloud node deployment. However, it means the ironic nodes are deployed much sooner after the undercloud install, as there is no time spent populating the local docker registry with images. Sasha also mentions that the issue does not occur after 10 minutes. This suggests some sort of timing issue in the service startup after the undercloud is installed.
Looking at this a bit closer...

Seems ComputeFilter should be first on the list of filters (https://bugs.launchpad.net/nova/+bug/1579213). It's not:
https://github.com/openstack/instack-undercloud/blob/caa64ac1a912c8c59f1dfe4c2bfe0c44311b5a26/elements/puppet-stack-config/puppet-stack-config.yaml.template#L433
However, that doesn't look like the root cause here.

It seems nova-compute blocks, waiting for 10 minutes for a reply to a message:

2017-07-25 14:26:38.631 2502 ERROR oslo.messaging._drivers.impl_rabbit [req-a6cb755e-959b-4069-b494-0b6727e77d1e - - - - -] [da19e175-0dad-4c26-84d1-283bbbc0964d] AMQP server on 192.168.24.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds. Client port: None: error: [Errno 111] ECONNREFUSED
2017-07-25 14:26:39.642 2502 INFO oslo.messaging._drivers.impl_rabbit [req-a6cb755e-959b-4069-b494-0b6727e77d1e - - - - -] [da19e175-0dad-4c26-84d1-283bbbc0964d] Reconnected to AMQP server on 192.168.24.1:5672 via [amqp] client with port 48580.
2017-07-25 14:36:39.665 2502 CRITICAL nova [req-a6cb755e-959b-4069-b494-0b6727e77d1e - - - - -] Unhandled error: MessagingTimeout: Timed out waiting for a reply to message ID 51a6ec286d124515b6c545250cd5e7d1
2017-07-25 14:36:39.665 2502 ERROR nova Traceback (most recent call last):
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/bin/nova-compute", line 10, in <module>
2017-07-25 14:36:39.665 2502 ERROR nova     sys.exit(main())
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/cmd/compute.py", line 56, in main
2017-07-25 14:36:39.665 2502 ERROR nova     topic=CONF.compute_topic)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 240, in create
2017-07-25 14:36:39.665 2502 ERROR nova     periodic_interval_max=periodic_interval_max)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/service.py", line 116, in __init__
2017-07-25 14:36:39.665 2502 ERROR nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 534, in __init__
2017-07-25 14:36:39.665 2502 ERROR nova     self.driver = driver.load_compute_driver(self.virtapi, compute_driver)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/driver.py", line 1636, in load_compute_driver
2017-07-25 14:36:39.665 2502 ERROR nova     virtapi)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_utils/importutils.py", line 44, in import_object
2017-07-25 14:36:39.665 2502 ERROR nova     return import_class(import_str)(*args, **kwargs)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 152, in __init__
2017-07-25 14:36:39.665 2502 ERROR nova     self._refresh_hash_ring(nova_context.get_admin_context())
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py", line 541, in _refresh_hash_ring
2017-07-25 14:36:39.665 2502 ERROR nova     ctxt, self._get_hypervisor_type())
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 177, in wrapper
2017-07-25 14:36:39.665 2502 ERROR nova     args, kwargs)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 240, in object_class_action_versions
2017-07-25 14:36:39.665 2502 ERROR nova     args=args, kwargs=kwargs)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2017-07-25 14:36:39.665 2502 ERROR nova     retry=self.retry)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 123, in _send
2017-07-25 14:36:39.665 2502 ERROR nova     timeout=timeout, retry=retry)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2017-07-25 14:36:39.665 2502 ERROR nova     retry=retry)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 567, in _send
2017-07-25 14:36:39.665 2502 ERROR nova     result = self._waiter.wait(msg_id, timeout)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 459, in wait
2017-07-25 14:36:39.665 2502 ERROR nova     message = self.waiters.get(msg_id, timeout=timeout)
2017-07-25 14:36:39.665 2502 ERROR nova   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 347, in get
2017-07-25 14:36:39.665 2502 ERROR nova     'to message ID %s' % msg_id)
2017-07-25 14:36:39.665 2502 ERROR nova MessagingTimeout: Timed out waiting for a reply to message ID 51a6ec286d124515b6c545250cd5e7d1
2017-07-25 14:36:39.665 2502 ERROR nova

ironic-conductor connected to the message bus a few seconds after the message was sent at 14:26:41:

2017-07-25 14:26:41.051 1360 INFO oslo.messaging._drivers.impl_rabbit [req-f968bd19-1358-42ad-ba97-0fc08cd18fd2 - - - - -] [b5d7cb06-f274-4edb-9eff-3fad669455be] Reconnected to AMQP server on 192.168.24.1:5672 via [amqp] client with port 48628.

I don't know much about the messaging setup (and I've probably forgotten everything I ever knew about rabbitmq), but it looks like a dropped message is the root cause.
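One quick way to check the startup-crash theory on the undercloud is to look at whether the nova-compute service died and was restarted around 14:36. This is just a sketch of standard systemd commands, not something taken from the attached sosreports:

sudo systemctl status openstack-nova-compute
sudo journalctl -u openstack-nova-compute --since "2017-07-25 14:20" | grep -E 'CRITICAL|Starting|Started'

If the MessagingTimeout above killed the process, the journal should show it starting again afterwards, and the compute node record would only appear after a successful start.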
nova-conductor log:

2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager Traceback (most recent call last):
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 920, in schedule_and_build_instances
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     instance_uuids)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 624, in _schedule_instances
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     request_spec, instance_uuids)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/utils.py", line 464, in wrapped
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     return func(*args, **kwargs)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 52, in select_destinations
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     instance_uuids)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     return getattr(self.instance, __name)(*args, **kwargs)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py", line 33, in select_destinations
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     instance_uuids)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py", line 136, in select_destinations
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     return cctxt.call(ctxt, 'select_destinations', **msg_args)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     retry=self.retry)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 123, in _send
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     timeout=timeout, retry=retry)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     retry=retry)
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 569, in _send
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager     raise result
2017-07-25 14:36:09.788 2928 ERROR nova.conductor.manager NoValidHost_Remote: No valid host was found. There are not enough hosts available.
Ignore comment #8 - the fact that the exception is coming from amqpdriver.py made me think it was an issue connecting to rabbitmq, when in fact it connected successfully and is raising an exception it received over the message bus in this traceback.
I can no longer reproduce the issue.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3462