Bug 1812149

Summary: [osp16] unable to deploy to efi overcloud nodes after undercloud upgrade OSP16.0.0 -> 16.0.1
Product: Red Hat OpenStack
Reporter: Chris Janiszewski <cjanisze>
Component: openstack-ironic
Assignee: RHOS Maint <rhos-maint>
Status: CLOSED DUPLICATE
QA Contact: Alistair Tonner <atonner>
Severity: unspecified
Priority: unspecified
Version: 16.0 (Train)
CC: bfournie, mburns
Last Closed: 2020-03-10 20:53:54 UTC
Type: Bug

Description Chris Janiszewski 2020-03-10 16:03:19 UTC
Description of problem:
After a minor update of the undercloud, Ironic fails to deploy overcloud nodes that boot in UEFI mode:

ironic.common.exception.InstanceDeployFailure: Failed to install a bootloader when deploying node 39e994ff-a692-41d3-8849-fef2e5a2920c. Error: {'type': 'FileNotFoundError', 'code': 500, 'message': "[Errno 2] No such file or directory: 'efibootmgr': 'efibootmgr'", 'details': ''}


Version-Release number of selected component (if applicable):
OSP16.0.0 -> 16.0.1

How reproducible:


Steps to Reproduce:
1. Deploy OSP 16.0.0.
2. Add EFI nodes to Ironic and inspect them.
3. Deploy the overcloud.
4. Delete the overcloud.
5. Upgrade the undercloud to OSP 16.0.1.
6. Try to redeploy the overcloud to the same (EFI) nodes.

Actual results:
Error

Expected results:
Success

Additional info:
This lab also has one node that boots in legacy (non-UEFI) mode, and that node deployed fine.
sosreport -> http://chrisj.cloud/sosreport-undercloud-osp16-2020-03-10-fvguqbk.tar.xz

Comment 1 Bob Fournier 2020-03-10 16:20:58 UTC
Chris - the OSP-16z1 IPA build is missing a package that is needed for UEFI, see https://bugzilla.redhat.com/show_bug.cgi?id=1810604.

Comment 2 Bob Fournier 2020-03-10 16:22:34 UTC
Specifically, see https://bugzilla.redhat.com/show_bug.cgi?id=1810604#c9. A hotfix is available, but only the IPA RPM from that hotfix should be used.
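
For readers unfamiliar with the error in the description: "[Errno 2] No such file or directory: 'efibootmgr'" is what Python reports when a program tries to execute a binary that is not installed. The sketch below reproduces that failure mode with the standard library only. It is not IPA's actual code; the helper names run_tool and missing_tool_error are hypothetical and exist only for illustration.

```python
import shutil
import subprocess


def run_tool(name, *args):
    """Run an external tool, failing early with a readable error if it is absent.

    Hypothetical helper for illustration; not taken from ironic-python-agent.
    """
    if shutil.which(name) is None:
        raise RuntimeError(f"required tool {name!r} is not installed")
    return subprocess.run([name, *args], check=True,
                          capture_output=True, text=True)


def missing_tool_error(name):
    """Return the exception raised when the named executable does not exist."""
    try:
        subprocess.run([name], check=True)
    except FileNotFoundError as exc:
        # errno 2 (ENOENT) is the "[Errno 2]" seen in the deploy failure.
        return exc
    return None


if __name__ == "__main__":
    # The tool name below is deliberately one that should not exist.
    exc = missing_tool_error("no-such-tool-for-this-example")
    print(type(exc).__name__, exc.errno)
```

A pre-check such as shutil.which only makes the failure easier to read; the fix described in the comments above is still to use a ramdisk image that ships the missing package.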

Comment 3 Bob Fournier 2020-03-10 20:53:54 UTC

*** This bug has been marked as a duplicate of bug 1810604 ***

Comment 4 Chris Janiszewski 2020-03-11 16:25:40 UTC
Thanks, Bob, for identifying the issue.
I have pulled the hotfixed images, and now Ironic fails with the following:

Stderr: 'qemu: qemu_thread_create: Resource temporarily unavailable\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Command: /usr/bin/python3 -m oslo_concurrency.prlimit --as=1073741824 -- qemu-img convert -O raw /var/lib/ironic/master_images/tmp20nk8iua/b3e73f31-014b-439d-9e89-67e6e4f195ad.part /var/lib/ironic/master_images/tmp20nk8iua/b3e73f31-014b-439d-9e89-67e6e4f195ad.converted
Exit code: -6
Stdout: ''
Stderr: 'qemu: qemu_thread_create: Resource temporarily unavailable\n'
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils Traceback (most recent call last):
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 3914, in _do_next_deploy_step
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     result = interface.execute_deploy_step(task, step)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/base.py", line 360, in execute_deploy_step
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     return self._execute_step(task, step)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/base.py", line 283, in _execute_step
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     return getattr(self, step['step'])(task)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     result = f(*args, **kwargs)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     return f(*args, **kwargs)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/iscsi_deploy.py", line 422, in deploy
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     deploy_utils.cache_instance_image(task.context, node)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     result = f(*args, **kwargs)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/deploy_utils.py", line 1197, in cache_instance_image
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     force_raw)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/deploy_utils.py", line 531, in fetch_images
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     cache.fetch_image(href, path, ctx=ctx, force_raw=force_raw)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 140, in fetch_image
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     href, master_path, dest_path, ctx=ctx, force_raw=force_raw)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 167, in _download_image
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     _fetch(ctx, href, tmp_path, force_raw)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 324, in _fetch
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     images.image_to_raw(image_href, path, path_tmp)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 361, in image_to_raw
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     disk_utils.convert_image(path_tmp, staged, 'raw')
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic_lib/disk_utils.py", line 392, in convert_image
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     utils.execute(*cmd, run_as_root=run_as_root, prlimit=QEMU_IMG_LIMITS)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/ironic_lib/utils.py", line 99, in execute
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     result = processutils.execute(*cmd, **kwargs)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils   File "/usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 424, in execute
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils     cmd=sanitized_cmd)
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils Command: /usr/bin/python3 -m oslo_concurrency.prlimit --as=1073741824 -- qemu-img convert -O raw /var/lib/ironic/master_images/tmp20nk8iua/b3e73f31-014b-439d-9e89-67e6e4f195ad.part /var/lib/ironic/master_images/tmp20nk8iua/b3e73f31-014b-439d-9e89-67e6e4f195ad.converted
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils Exit code: -6
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils Stdout: ''
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils Stderr: 'qemu: qemu_thread_create: Resource temporarily unavailable\n'
2020-03-11 12:08:19.640 7 ERROR ironic.conductor.utils 


I am running a mix of VMs and bare-metal (BM) nodes: the VMs run in legacy mode with power control via vbmc, and the BM nodes are Supermicro UEFI boards.
Do I need to open another BZ for this?
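
As a reading aid for the log above, the following sketch decodes three values that appear in it: the exit code -6, the --as=1073741824 limit, and the "Resource temporarily unavailable" text. It assumes the exit code follows Python's subprocess convention, where a negative return code means the child was terminated by that signal number. The helper name describe_exit is hypothetical. Whether the address-space limit is what prevented qemu-img from creating a thread is not established in this report; the sketch only shows what the numbers mean.

```python
import errno
import os
import signal

# Value passed to oslo_concurrency.prlimit via --as in the logged command.
AS_LIMIT_BYTES = 1073741824


def describe_exit(returncode):
    """Translate a subprocess-style return code into readable text.

    Hypothetical helper for illustration only.
    """
    if returncode < 0:
        # Negative values mean the process was killed by that signal number.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status {}".format(returncode)


if __name__ == "__main__":
    print(describe_exit(-6))
    print(AS_LIMIT_BYTES / 1024 ** 3, "GiB address-space limit")
    # EAGAIN is the errno whose message is platform text such as
    # "Resource temporarily unavailable".
    print(os.strerror(errno.EAGAIN))
```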

Comment 5 Bob Fournier 2020-03-11 17:00:18 UTC
Chris - I have not seen that before, but I will dig into downstream and upstream bug reports to see if it has come up. Yes, please open another bug, as that is unrelated to this efibootmgr issue.

Comment 6 Bob Fournier 2020-05-08 12:53:14 UTC
Chris - did you ever open another bug for Comment 4? Just asking because we've seen something similar and didn't want to duplicate it. If not, we'll open a new one. Thanks.

Comment 7 Chris Janiszewski 2020-05-08 13:20:44 UTC
Hey Bob,

I totally missed it and have not opened another bug for this. Sorry about that. I have moved on to another project and didn't have a chance to reproduce it.

Chris

Comment 8 Bob Fournier 2020-05-08 13:26:20 UTC
No problem, thanks Chris.