Description of problem: Attempting to deploy an overcloud with a minion that has ironic-conductor fails to provision the systems. The following error can be found in the ironic logs: 2020-09-25 00:01:34.388 7 ERROR futurist.periodics [req-4abe3d66-06b3-4ebf-8acd-6286d3b32dd6 - - - - -] Failed to call periodic 'ironic.conductor.manager.ConductorManager._sync_power_states' (it runs every 60.00 seconds): ironic.common.exception.InterfaceNotFoundInEntrypoint: Could not find the following interface in the 'ironic.hardware.interfaces.inspect' entrypoint: inspector. Valid interfaces are ['no-inspect']. 2020-09-25 00:01:34.388 7 ERROR futurist.periodics Traceback (most recent call last): 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 94, in get_interface 2020-09-25 00:01:34.388 7 ERROR futurist.periodics impl_instance = factory.get_driver(interface_name) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 323, in get_driver 2020-09-25 00:01:34.388 7 ERROR futurist.periodics return self[name].obj 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 320, in __getitem__ 2020-09-25 00:01:34.388 7 ERROR futurist.periodics return self._extension_manager[name] 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/stevedore/extension.py", line 326, in __getitem__ 2020-09-25 00:01:34.388 7 ERROR futurist.periodics return self._extensions_by_name[name] 2020-09-25 00:01:34.388 7 ERROR futurist.periodics KeyError: 'inspector' 2020-09-25 00:01:34.388 7 ERROR futurist.periodics 2020-09-25 00:01:34.388 7 ERROR futurist.periodics During handling of the above exception, another exception occurred: 2020-09-25 00:01:34.388 7 ERROR futurist.periodics 2020-09-25 00:01:34.388 7 ERROR futurist.periodics Traceback (most recent call last): 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 290, in run 2020-09-25 00:01:34.388 7 ERROR futurist.periodics work() 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 64, in __call__ 2020-09-25 00:01:34.388 7 ERROR futurist.periodics return self.callback(*self.args, **self.kwargs) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped 2020-09-25 00:01:34.388 7 ERROR futurist.periodics result = f(*args, **kwargs) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/futurist/periodics.py", line 178, in decorator 2020-09-25 00:01:34.388 7 ERROR futurist.periodics return f(*args, **kwargs) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 1774, in _sync_power_states 2020-09-25 00:01:34.388 7 ERROR futurist.periodics self._sync_power_state_nodes_task(context, nodes_queue) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 1820, in _sync_power_state_nodes_task 2020-09-25 00:01:34.388 7 ERROR futurist.periodics shared=True) as task: 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 161, in acquire 2020-09-25 00:01:34.388 7 ERROR futurist.periodics return TaskManager(context, *args, **kwargs) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 243, in __init__ 2020-09-25 00:01:34.388 7 ERROR futurist.periodics self.release_resources() 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2020-09-25 00:01:34.388 7 ERROR futurist.periodics self.force_reraise() 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2020-09-25 00:01:34.388 7 ERROR futurist.periodics six.reraise(self.type_, self.value, self.tb) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise 2020-09-25 00:01:34.388 7 ERROR futurist.periodics raise value 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 237, in __init__ 2020-09-25 00:01:34.388 7 ERROR futurist.periodics self.driver = driver_factory.build_driver_for_task(self) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 53, in build_driver_for_task 2020-09-25 00:01:34.388 7 ERROR futurist.periodics check_and_update_node_interfaces(node, hw_type=hw_type) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 205, in check_and_update_node_interfaces 2020-09-25 00:01:34.388 7 ERROR futurist.periodics get_interface(hw_type, iface, impl_name) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 99, in get_interface 2020-09-25 00:01:34.388 7 ERROR futurist.periodics valid=factory.names) 2020-09-25 00:01:34.388 7 ERROR futurist.periodics ironic.common.exception.InterfaceNotFoundInEntrypoint: Could not find the following interface in the 'ironic.hardware.interfaces.inspect' entrypoint: inspector. Valid interfaces are ['no-inspect']. 2020-09-25 00:01:34.388 7 ERROR futurist.periodics There appears to be an inconsistency in the ironic configuration between the undercloud vs the minion. The minion has: enabled_inspect_interfaces=no-inspect The undercloud has: enabled_inspect_interfaces=idrac,ilo,inspector,no-inspect,redfish Version-Release number of selected component (if applicable): [root@minion ironic]# rpm -qa | grep tripleo puppet-tripleo-11.5.0-1.20200914161840.f716ef5.el8ost.noarch openstack-tripleo-heat-templates-11.3.2-1.20200914170155.29a02c1.el8ost.noarch ansible-tripleo-ipsec-9.2.1-0.20200311073016.0c8693c.el8ost.noarch ansible-role-tripleo-modify-image-1.2.1-0.20200804085623.1dffa21.el8ost.noarch openstack-tripleo-puppet-elements-11.2.2-0.20200701163410.432518a.el8ost.noarch python3-tripleoclient-12.3.2-1.20200914164926.72161e0.el8ost.noarch openstack-tripleo-common-containers-11.4.1-1.20200914165650.974b967.el8ost.noarch python3-tripleo-common-11.4.1-1.20200914165650.974b967.el8ost.noarch openstack-tripleo-common-11.4.1-1.20200914165650.974b967.el8ost.noarch openstack-tripleo-image-elements-10.6.2-0.20200528043425.7dc0fa1.el8ost.noarch openstack-tripleo-validations-11.3.2-1.20200914170823.4db92ba.el8ost.noarch ansible-tripleo-ipa-0.2.1-1.20200813093411.3bb3c53.el8ost.noarch tripleo-ansible-0.5.1-1.20200914163922.902c3c8.el8ost.noarch python3-tripleoclient-heat-installer-12.3.2-1.20200914164926.72161e0.el8ost.noarch How reproducible: When ironic conductor is enabled on the minon Steps to Reproduce: 1. Deploy minion with ironic conductor enabled 2. Attempt to deploy overcloud Actual results: Overcloud deployment times out because the systems are not provisioned Expected results: Nodes provisioned and overcloud deployed
*** Bug 1872121 has been marked as a duplicate of this bug. ***
Using RHOS-16.1-RHEL-8-20201110.n.1 still seeing overcloud deploy timeout: 2020-11-16 02:04:21Z [overcloud.Compute]: CREATE_FAILED CREATE aborted (Task create from ResourceGroup "Compute" Stack "overcloud" [a46c7f6c-fb19-4fc0-b9a1-4e594f6a2f0f] Timed out) 2020-11-16 02:04:21Z [overcloud.Compute]: UPDATE_FAILED Stack UPDATE cancelled 2020-11-16 02:04:21Z [overcloud]: CREATE_FAILED Timed out 2020-11-16 02:04:21Z [overcloud.Compute.0]: CREATE_FAILED Stack CREATE cancelled 2020-11-16 02:04:22Z [overcloud.Compute.0]: CREATE_FAILED resources[0]: Stack CREATE cancelled 2020-11-16 02:04:22Z [overcloud.Compute]: UPDATE_FAILED Resource CREATE failed: resources[0]: Stack CREATE cancelled Stack overcloud/a46c7f6c-fb19-4fc0-b9a1-4e594f6a2f0f CREATE_FAILED overcloud.Compute.0: resource_type: OS::TripleO::Compute physical_resource_id: 2e165f40-bd92-4348-b3eb-50cda0a08922 status: CREATE_FAILED status_reason: | resources[0]: Stack CREATE cancelled This is seen in /var/log/containers/ironic-conductor.log: 2020-11-16 00:26:26.688 8 DEBUG oslo_concurrency.lockutils [req-adf7b601-a62a-4cce-b724-22d4cb731bb5 78a39f59dac140a9a0664a46a77324a9 47bbc35fd2324e95940a5ae6c36b4bf5 - default default] Releasing lock "download-image" lock /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:281 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils [req-adf7b601-a62a-4cce-b724-22d4cb731bb5 78a39f59dac140a9a0664a46a77324a9 47bbc35fd2324e95940a5ae6c36b4bf5 - default default] Node a63237a2-c89b-4faa-bccc-7111fda12c4d failed deploy step {'step': 'deploy', 'priority': 100, 'argsinfo': None, 'interface': 'deploy'} with unexpected error: Unexpected error while running command. Command: /usr/bin/python3 -m oslo_concurrency.prlimit --as=1073741824 -- qemu-img convert -O raw /var/lib/ironic/master_images/tmpbvm5hgu2/1423bc17-594f-4c5b-90fc-87b158786b41.part /var/lib/ironic/master_images/tmpbvm5hgu2/1423bc17-594f-4c5b-90fc-87b158786b41.converted Exit code: -6 Stdout: '' Stderr: 'qemu: qemu_thread_create: Resource temporarily unavailable\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command. Command: /usr/bin/python3 -m oslo_concurrency.prlimit --as=1073741824 -- qemu-img convert -O raw /var/lib/ironic/master_images/tmpbvm5hgu2/1423bc17-594f-4c5b-90fc-87b158786b41.part /var/lib/ironic/master_images/tmpbvm5hgu2/1423bc17-594f-4c5b-90fc-87b158786b41.converted Exit code: -6 Stdout: '' Stderr: 'qemu: qemu_thread_create: Resource temporarily unavailable\n' 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils Traceback (most recent call last): 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 3951, in _do_next_deploy_step 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils result = interface.execute_deploy_step(task, step) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/base.py", line 360, in execute_deploy_step 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils return self._execute_step(task, step) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/base.py", line 283, in _execute_step 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils return getattr(self, step['step'])(task) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils result = f(*args, **kwargs) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils return f(*args, **kwargs) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/iscsi_deploy.py", line 422, in deploy 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils deploy_utils.cache_instance_image(task.context, node) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils result = f(*args, **kwargs) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/deploy_utils.py", line 1197, in cache_instance_image 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils force_raw) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/deploy_utils.py", line 531, in fetch_images 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils cache.fetch_image(href, path, ctx=ctx, force_raw=force_raw) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 140, in fetch_image 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils href, master_path, dest_path, ctx=ctx, force_raw=force_raw) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 167, in _download_image 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils _fetch(ctx, href, tmp_path, force_raw) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 324, in _fetch 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils images.image_to_raw(image_href, path, path_tmp) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 361, in image_to_raw 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils disk_utils.convert_image(path_tmp, staged, 'raw') 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic_lib/disk_utils.py", line 413, in convert_image 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils use_standard_locale=True) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/ironic_lib/utils.py", line 99, in execute 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils result = processutils.execute(*cmd, **kwargs) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 424, in execute 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils cmd=sanitized_cmd) 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command. 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils Command: /usr/bin/python3 -m oslo_concurrency.prlimit --as=1073741824 -- qemu-img convert -O raw /var/lib/ironic/master_images/tmpbvm5hgu2/1423bc17-594f-4c5b-90fc-87b158786b41.part /var/lib/ironic/master_images/tmpbvm5hgu2/1423bc17-594f-4c5b-90fc-87b158786b41.converted 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils Exit code: -6 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils Stdout: '' 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils Stderr: 'qemu: qemu_thread_create: Resource temporarily unavailable\n' 2020-11-16 00:26:26.689 8 ERROR ironic.conductor.utils 2020-11-16 00:26:26.716 8 DEBUG oslo_concurrency.lockutils [req-adf7b601-a62a-4cce-b724-22d4cb731bb5 78a39f59dac140a9a0664a46a77324a9 47bbc35fd2324e95940a5ae6c36b4bf5 - default default] Lock "master_image" acquired by "ironic.drivers.modules.image_cache.ImageCache.clean_up" :: waited 0.000s inner /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:327 Link to test log: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/rfe/job/DFG-df-rfe-16.1-virsh-1cont_1comp_1ceph-minion-verification/18/
The linked test failure doesn't look related to the subject bug?
I did two submissions that both timed out before moving the BZ back to on_dev. Did a third test and still overcloud deploy timed out. The logs to the third test are here: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/rfe/job/DFG-df-rfe-16.1-virsh-1cont_1comp_1ceph-minion-verification/19/
This time, the failure cause of https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/df/view/rfe/job/DFG-df-rfe-16.1-virsh-1cont_1comp_1ceph-minion-verification/19/ looks different. I'm not sure what it is, can see nothing strange in ironic logs
n.exception.InterfaceNotFoundInEntrypoint: Could not find the following interface in the 'ironic.hardware.interfaces.boot' entrypoint: ipxe. Valid interfaces are ['ilo-pxe', 'pxe']. 2020-11-19 09:55:21.570 7 ERROR futurist.periodics Traceback (most recent call last): 2020-11-19 09:55:21.570 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 94, in get_interface 2020-11-19 09:55:21.570 7 ERROR futurist.periodics impl_instance = factory.get_driver(interface_name) 2020-11-19 09:55:21.570 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 323, in get_driver 2020-11-19 09:55:21.570 7 ERROR futurist.periodics return self[name].obj 2020-11-19 09:55:21.570 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/ironic/common/driver_factory.py", line 320, in __getitem__ 2020-11-19 09:55:21.570 7 ERROR futurist.periodics return self._extension_manager[name] 2020-11-19 09:55:21.570 7 ERROR futurist.periodics File "/usr/lib/python3.6/site-packages/stevedore/extension.py", line 326, in __getitem__ 2020-11-19 09:55:21.570 7 ERROR futurist.periodics return self._extensions_by_name[name] 2020-11-19 09:55:21.570 7 ERROR futurist.periodics KeyError: 'ipxe'
Undercloud hiera has: "ironic::drivers::interfaces::enabled_bios_interfaces": [ "ilo", "no-bios", "redfish" ], "ironic::drivers::interfaces::enabled_boot_interfaces": [ "ilo-pxe", "ipxe", "pxe" ], "ironic::drivers::interfaces::enabled_console_interfaces": [ "ipmitool-socat", "ilo", "no-console" ], "ironic::drivers::interfaces::enabled_deploy_interfaces": [ "ansible", "direct", "iscsi" ], "ironic::drivers::interfaces::enabled_inspect_interfaces": [ "idrac", "ilo", "inspector", "no-inspect", "redfish" ], "ironic::drivers::interfaces::enabled_management_interfaces": [ "fake", "idrac", "ilo", "ipmitool", "noop", "redfish" ], "ironic::drivers::interfaces::enabled_network_interfaces": [ "flat" ], "ironic::drivers::interfaces::enabled_power_interfaces": [ "fake", "idrac", "ilo", "ipmitool", "redfish" ], And the minion: "ironic::drivers::interfaces::enabled_bios_interfaces": [ "ilo", "no-bios", "redfish" ], "ironic::drivers::interfaces::enabled_boot_interfaces": [ "ilo-pxe", "ipxe", "pxe" ], "ironic::drivers::interfaces::enabled_console_interfaces": [ "ipmitool-socat", "ilo", "no-console" ], "ironic::drivers::interfaces::enabled_deploy_interfaces": [ "ansible", "direct", "iscsi" ], "ironic::drivers::interfaces::enabled_inspect_interfaces": [ "idrac", "ilo", "inspector", "no-inspect", "redfish" ], "ironic::drivers::interfaces::enabled_management_interfaces": [ "fake", "idrac", "ilo", "ipmitool", "noop", "redfish" ], "ironic::drivers::interfaces::enabled_network_interfaces": [ "flat" ], "ironic::drivers::interfaces::enabled_power_interfaces": [ "fake", "idrac", "ilo", "ipmitool", "redfish" ], The crux should be in the enabled_boot_interfaces disalignment (minion has no ipxi there) But in tht we define it the same way for undercloud and minion: IronicEnabledBootInterfaces: ['pxe', 'ilo-pxe'] I'm not sure how to proceed with fixing that, sorry.
There is an configuration mismatch between the undercloud and minion. freeipa-0/var/lib/config-data/puppet-generated/ironic/etc/ironic/ironic.conf:enabled_boot_interfaces=pxe,ilo-pxe undercloud-0/var/lib/config-data/puppet-generated/ironic/etc/ironic/ironic.conf:enabled_boot_interfaces=ilo-pxe,ipxe,pxe
We're missing the Ironic*Interface logic in the minion installation that occurs with the undercloud.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.3 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:5413