Bug 1716672 - UEFI: Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command
Summary: UEFI: Installing GRUB2 boot loader to device /dev/vda failed with Unexpected ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: diskimage-builder
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 15.0 (Stein)
Assignee: Ben Nemec
QA Contact: mlammon
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-06-03 21:28 UTC by mlammon
Modified: 2020-07-24 13:37 UTC (History)

Fixed In Version: diskimage-builder-2.24.1-0.20190619080405.8dd7ca7.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-21 11:22:59 UTC
Target Upstream Version:
Embargoed:
bnemec: needinfo-




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 663693 0 'None' MERGED Use architecture-specific grub2 RPMs on RHEL8 2020-09-30 21:07:33 UTC
Red Hat Product Errata RHEA-2019:2811 0 None None None 2019-09-21 11:23:10 UTC

Description mlammon 2019-06-03 21:28:39 UTC
Description of problem:
Failed to install a bootloader when deploying node <uuid>. Error: {'type': 'CommandExecutionError', 'code': 500, 'message': 'Command execution failed: Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpl9zbxrib /bin/sh -c "grub2-install /dev/vda"\nExit code:

Trying to deploy overcloud (oc) nodes with boot mode UEFI on RHEL 8, OpenStack 15.

How reproducible:
100%

Steps to Reproduce:
1. Deploy undercloud (images, introspect, tagging, etc) with bootmode uefi
2. Deploy overcloud

Actual results:
Deployment failed

Expected results:
Successful deployment of node

Additional info:
cat undercloud.conf
[DEFAULT]
# Network interface on the Undercloud that will be handling the PXE
# boots and DHCP for Overcloud instances. (string value)
local_interface = eth0
local_ip = 192.168.24.1/24
undercloud_public_host = 192.168.24.2
undercloud_admin_host = 192.168.24.3
#TODO: use release >= 10 when RHBZ#1633193 is resolved
undercloud_ntp_servers=clock.redhat.com
container_images_file=/home/stack/containers-prepare-parameter.yaml
container_insecure_registries = brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
undercloud_timezone = UTC
undercloud_service_certificate = /etc/pki/instack-certs/undercloud.pem
hieradata_override = /home/stack/hiera_override.yaml
[ctlplane-subnet]
local_subnet = ctlplane-subnet
cidr = 192.168.24.0/24
dhcp_start = 192.168.24.5
dhcp_end = 192.168.24.24
gateway = 192.168.24.1
inspection_iprange = 192.168.24.100,192.168.24.120
masquerade = true
#TODO(skatlapa): add param to override masq
ipxe_enabled = True
inspection_enable_uefi = True


(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node show controller-0  --field properties
+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field      | Value                                                                                                                                                                |
+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| properties | {'cpus': '4', 'memory_mb': '20480', 'local_gb': '39', 'cpu_arch': 'x86_64', 'capabilities': 'profile:controller,boot_mode:uefi,boot_option:local,node:controller-0'} |
+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+

(undercloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+--------------+--------+----------+----------------+------------+
| ID                                   | Name         | Status | Networks | Image          | Flavor     |
+--------------------------------------+--------------+--------+----------+----------------+------------+
| a74e2195-6a65-4c67-9a74-051fec974d62 | controller-0 | ERROR  |          | overcloud-full | controller |
| fd6c80c1-5b59-47b2-8c2a-cb403a77fc21 | compute-1    | ERROR  |          | overcloud-full | compute    |
| 997e534a-581e-4d5b-8c54-c7871cb7e0f8 | controller-1 | BUILD  |          | overcloud-full | controller |
| dba6b021-4609-47c6-8dd2-205a73b3377d | controller-2 | BUILD  |          | overcloud-full | controller |
| c8ffbaf5-00eb-4dc3-97e2-aa198d470a6f | compute-0    | BUILD  |          | overcloud-full | compute    |
+--------------------------------------+--------------+--------+----------+----------------+------------+

(undercloud) [stack@undercloud-0 ~]$ openstack stack failures list overcloud
overcloud.Controller.0.Controller:
  resource_type: OS::TripleO::ControllerServer
  physical_resource_id: a74e2195-6a65-4c67-9a74-051fec974d62
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance a74e2195-6a65-4c67-9a74-051fec974d62., Code: 500"
overcloud.Compute.1.NovaCompute:
  resource_type: OS::TripleO::ComputeServer
  physical_resource_id: fd6c80c1-5b59-47b2-8c2a-cb403a77fc21
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance fd6c80c1-5b59-47b2-8c2a-cb403a77fc21., Code: 500"


2019-06-03 19:23:11.922 8 ERROR oslo.service.loopingcall nova.exception.InstanceDeployFailure: Failed to provision instance 6014882c-60e3-4898-8264-5a1409f3d3e9: Failed to install a bootloader when deploying node c385eb47-397b-4b44-a8d6-0758f564d80e. Error: {'type': 'CommandExecutionError', 'code': 500, 'message': 'Command execution failed: Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpzxxtjkc_ /bin/sh -c "grub2-install /dev/vda"\nExit code: 1\nStdout: \'\'\nStderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist. Please specify --target or --directory.\\n".', 'details': 'Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpzxxtjkc_ /bin/sh -c "grub2-install /dev/vda"\nExit code: 1\nStdout: \'\'\nStderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist. Please specify --target or --directory.\\n".'}

2019-06-03 19:23:11.923 8 ERROR nova.virt.ironic.driver [req-0726e69d-63ec-4065-8e69-061cc0236dcf aba260735ba94be4a6fd56541fdcfb48 a0db2596e4994e3dafdf5fe97b16d008 - default default] Error deploying instance 6014882c-60e3-4898-8264-5a1409f3d3e9 on baremetal node c385eb47-397b-4b44-a8d6-0758f564d80e.: nova.exception.InstanceDeployFailure: Failed to provision instance 6014882c-60e3-4898-8264-5a1409f3d3e9: Failed to install a bootloader when deploying node c385eb47-397b-4b44-a8d6-0758f564d80e. Error: {'type': 'CommandExecutionError', 'code': 500, 'message': 'Command execution failed: Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpzxxtjkc_ /bin/sh -c "grub2-install /dev/vda"\nExit code: 1\nStdout: \'\'\nStderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist. Please specify --target or --directory.\\n".', 'details': 'Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpzxxtjkc_ /bin/sh -c "grub2-install /dev/vda"\nExit code: 1\nStdout: \'\'\nStderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist. Please specify --target or --directory.\\n".'}



2019-06-03 19:19:54.853 7 DEBUG ironic.drivers.modules.agent_client [req-c9f9235e-7631-4c26-a1d7-34a8ef492cdc - - - - -] Agent command image.install_bootloader for node c385eb47-397b-4b44-a8d6-0758f564d80e returned result None, error {'type': 'CommandExecutionError', 'code': 500, 'message': 'Command execution failed: Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpjrm6mw0b /bin/sh -c "grub2-install /dev/vda"\nExit code: 1\nStdout: \'\'\nStderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist. Please specify --target or --directory.\\n".', 'details': 'Installing GRUB2 boot loader to device /dev/vda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpjrm6mw0b /bin/sh -c "grub2-install /dev/vda"\nExit code: 1\nStdout: \'\'\nStderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist. Please specify --target or --directory.\\n".'}, HTTP status code 200 _command /usr/lib/python3.6/site-packages/ironic/drivers/modules/agent_client.py:122



 [root@undercloud-0 nova]# ls -l /usr/lib/grub/x86_64-efi/modinfo.sh
ls: cannot access '/usr/lib/grub/x86_64-efi/modinfo.sh': No such file or directory
 [root@undercloud-0 nova]# yum whatprovides /usr/lib/grub/x86_64-efi/modinfo.sh
RHOS 15.0-trunk Override                                                                                                                                         6.3 kB/s | 284  B     00:00
Failed to synchronize cache for repo 'rhelosp-15.0-image-build-override', ignoring this repo.
grub2-efi-x64-modules-1:2.02-66.el8.noarch : Modules used to build custom grub.efi images
Repo        : rhosp-rhel-8.0-baseos
Matched from:
Filename    : /usr/lib/grub/x86_64-efi/modinfo.sh
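
The same file check can be scripted against an unpacked image tree rather than run by hand. This is a minimal sketch, not code from diskimage-builder or ironic: the function name `missing_grub_platforms` and the idea of pointing it at a directory holding the image's filesystem are assumptions for illustration. Only the `/usr/lib/grub/<platform>/modinfo.sh` path comes from the grub2-install error.

```python
import os
import tempfile


def missing_grub_platforms(root, platforms=("x86_64-efi",)):
    """Return the GRUB platform names whose modinfo.sh is absent under root.

    `root` is a directory containing an image's filesystem tree
    (hypothetical usage: a mounted overcloud image).
    """
    missing = []
    for platform in platforms:
        path = os.path.join(root, "usr", "lib", "grub", platform, "modinfo.sh")
        if not os.path.isfile(path):
            missing.append(platform)
    return missing


# Demonstration on a throwaway tree that ships only the BIOS (i386-pc) modules.
with tempfile.TemporaryDirectory() as root:
    bios_dir = os.path.join(root, "usr", "lib", "grub", "i386-pc")
    os.makedirs(bios_dir)
    open(os.path.join(bios_dir, "modinfo.sh"), "w").close()
    result = missing_grub_platforms(root, ("i386-pc", "x86_64-efi"))
    print(result)
```

A non-empty result for `x86_64-efi` would correspond to the condition grub2-install complains about in the logs.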

Comment 2 Yolanda Robla 2019-06-04 06:58:24 UTC
Hi, is that on a security-hardened image or on the default one?

Comment 3 mlammon 2019-06-04 13:04:29 UTC
Hi Yolanda, This is the default image.

Comment 4 mlammon 2019-06-04 15:32:43 UTC
Let me know if you need the broken environment. Right now I have to repurpose it.

Comment 5 Yolanda Robla 2019-06-05 07:18:25 UTC
Hi, since this is the default image, I have no control over how it's built. It seems to be a problem with UEFI not being enabled; you should contact the team that builds it.
The error is:

Stderr: "grub2-install: error: /usr/lib/grub/x86_64-efi/modinfo.sh doesn\'t exist."

Comment 6 Julia Kreger 2019-06-05 15:23:18 UTC
Seems like grub2-efi-x64.x86_64 might be missing from the image.

Comment 7 Bob Fournier 2019-06-05 19:44:16 UTC
As Julia indicated, it seems that the OSP-15 overcloud and ipa images are missing the grub2-efi-x64 package.

We see it in OSP-13/RHEL 7.6:
[heat-admin@overcloud-controller-0 ~]$ sudo rpm -qa | grep grub2-efi
grub2-efi-x64-2.02-0.65.el7_4.2.x86_64
grub2-efi-x64-modules-2.02-0.65.el7_4.2.noarch

$ ls /usr/lib/grub/:    i386-pc  x86_64-efi


On OSP-15 we see:
[root@overcloud-controller-2 ~]# rpm -qa | grep grub2-efi                                                                                                                                          
grub2-efi-ia32-2.02-66.el8.x86_64
grub2-efi-aa64-modules-2.02-66.el8.noarch

$ ls /usr/lib/grub/:   arm64-efi  i386-pc
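
The comparison above can be expressed as a small check over `rpm -qa` output. This is only a sketch: the helper names `package_name` and `missing_packages` are hypothetical, and the required package list is taken from the names discussed in this bug, not from any official manifest.

```python
# Packages this bug identifies as needed for x86_64 UEFI installs.
REQUIRED_X86_64_EFI = ("grub2-efi-x64", "grub2-efi-x64-modules")


def package_name(nvra):
    """Strip '-version-release.arch' from an 'rpm -qa' entry.

    RPM versions and releases cannot contain '-', so the name is everything
    before the last two dash-separated fields.
    """
    return nvra.rsplit("-", 2)[0]


def missing_packages(rpm_qa_lines, required=REQUIRED_X86_64_EFI):
    """Return the required package names that are not in the rpm -qa output."""
    installed = {package_name(line.strip()) for line in rpm_qa_lines if line.strip()}
    return [name for name in required if name not in installed]


# Package lists copied from the two environments quoted in this comment.
osp13 = [
    "grub2-efi-x64-2.02-0.65.el7_4.2.x86_64",
    "grub2-efi-x64-modules-2.02-0.65.el7_4.2.noarch",
]
osp15 = [
    "grub2-efi-ia32-2.02-66.el8.x86_64",
    "grub2-efi-aa64-modules-2.02-66.el8.noarch",
]

print(missing_packages(osp13))
print(missing_packages(osp15))
```

With the lists above, the OSP-13 image reports nothing missing, while the OSP-15 image reports both x64 packages as missing.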

Comment 8 Bob Fournier 2019-06-06 12:19:49 UTC
Lon has a downstream build fix to ensure grub2-efi-x64 packages are included in the next compose.

Comment 9 Lon Hohberger 2019-06-06 12:45:55 UTC
This has to do with the RHEL8 builds of grub2. There ends up being a bunch of architecture-specific modules which are "noarch" as well as both x86_64 and i686 RPMs in the RHEL8 repositories. Since all the "noarch" module RPMs provide "grub2-efi-modules" and the architecture-specific grub2-efi-x64 and grub2-efi-ia32 RPMs provide "grub2-efi", there's a bit of nondeterminism.

I think a proper fix to diskimage-builder would be to use the new RHEL8 bits that merged last week - select specifically the grub2-efi-x64 and grub2-efi-x64-modules packages in the grub2 element (or possibly the bootloader element) when RHEL8 is in use. It looks like, right now, x86_64 is the only architecture to use the EFI RPMs from a DIB perspective, so this should not be difficult. I don't mind taking a look at this.
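
To illustrate the idea, here is a toy model of the ambiguity and of explicit per-architecture selection. It is not the code from the upstream change and does not use diskimage-builder's package-map format. The tables and function names are hypothetical, and the provider lists are limited to package names mentioned in this bug.

```python
# Capability -> RPMs that provide it (only names that appear in this bug).
PROVIDERS = {
    "grub2-efi": ["grub2-efi-x64", "grub2-efi-ia32"],
    "grub2-efi-modules": ["grub2-efi-x64-modules", "grub2-efi-aa64-modules"],
}

# Explicit package names per architecture, as proposed for RHEL8.
EXPLICIT_BY_ARCH = {
    "x86_64": ["grub2-efi-x64", "grub2-efi-x64-modules"],
}


def is_ambiguous(capability):
    """True when more than one RPM could satisfy a request for capability."""
    return len(PROVIDERS.get(capability, [])) > 1


def packages_for(arch):
    """Return the explicit EFI package names for arch, or raise ValueError."""
    try:
        return list(EXPLICIT_BY_ARCH[arch])
    except KeyError:
        raise ValueError(f"no EFI package mapping for architecture {arch!r}") from None


print(is_ambiguous("grub2-efi"))
print(packages_for("x86_64"))
```

Requesting a capability name leaves the choice of provider to the resolver, whereas naming the packages per architecture removes that choice. Failing loudly for an unmapped architecture is a design choice of this sketch, not something stated in the bug.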

Comment 10 Lon Hohberger 2019-06-06 14:51:18 UTC
https://review.opendev.org/#/c/663693/

Comment 11 Bob Fournier 2019-06-06 15:28:46 UTC
Thanks Lon!  We can handle this bug by testing your downstream workaround and, if it works, removing the blocker flag (since we'll have a workaround). We'll then use the bug to track the upstream fix.

Comment 12 Lon Hohberger 2019-06-06 17:32:43 UTC
Sounds good!

Comment 18 mlammon 2019-07-09 15:12:25 UTC
Env
diskimage-builder-2.24.1-0.20190628131635.091a4e2.el8ost.noarch.rpm

Regression testing completed and verified deployment

Comment 22 errata-xmlrpc 2019-09-21 11:22:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811

