When grub2-mkconfig is called as a day 2 operation (for example, to change the kernelargs), the existing boot values should be taken from the /boot/loader/entries/<machine-id>-<kernel>.conf file. However, this doesn't happen because the machine-id differs from the one in place when the image was built.

To fix the /boot/loader/entries/<machine-id>-<kernel>.conf filename, a systemd unit needs to be included in the image which:
- Runs after first-boot-complete.target
- Has ConditionFirstBoot=yes
- Renames the files in /boot/loader/entries/ similar to [1]

We know the overcloud image will be the only OS installed to disk, but /boot/loader/entries is a multi-OS mechanism. This is why this bug is filed against tripleo-image-elements instead of diskimage-builder.

[1] https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/centos/pre-install.d/03-reset-bls-entries
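For illustration, the three requirements above could be met by a unit along these lines. This is only a sketch: the helper name write_reset_bls_unit, the Description text, the ExecStart path and the WantedBy target are assumptions, not the contents of the actual tripleo-image-elements change. Only After=first-boot-complete.target, ConditionFirstBoot=yes and the reset-bls-entries name come from this report.

```shell
# Sketch: write a first-boot unit that would run a BLS rename script.
# write_reset_bls_unit and /usr/local/sbin/reset-bls-entries are hypothetical.
write_reset_bls_unit() {
    unit_dir=$1
    mkdir -p "$unit_dir"
    cat > "$unit_dir/reset-bls-entries.service" <<'EOF'
[Unit]
Description=Rename BLS entries to match the machine-id
# Ordering requested in the bug description
After=first-boot-complete.target
# Skip the unit entirely unless systemd considers this a first boot
ConditionFirstBoot=yes

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/reset-bls-entries

[Install]
WantedBy=multi-user.target
EOF
}
```

An image element would call something like write_reset_bls_unit /etc/systemd/system at build time and then enable the unit.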
Downstream backports are now attached; only the diskimage-builder change is in a build so far.
(In reply to James Parker from comment #3)
> @sbaker can you confirm this is ready for QA? Just tried this
> with the latest puddle and I'm still seeing different machine ids in my
> deployment

I deployed the image from this package[1] to baremetal locally, and it did not rename the BLS files. The image should have 'uninitialized' as the contents of /etc/machine-id but instead the file was empty. When I wrote uninitialized to /etc/machine-id and rebooted, it renamed the BLS entries as expected.

The diskimage-builder package[2] has the change[3] which writes uninitialized to /etc/machine-id.

I can't tell without looking at the overcloud-hardened-uefi-full.qcow2 build logs, but is it possible the image was built with a different version of diskimage-builder than the one in the compose? (setting NEEDINFO to Lon)

[1] https://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/17.0-RHEL-9/latest-RHOS-17-RHEL-9/compose/OpenStack/x86_64/os/Packages/rhosp-director-images-uefi-x86_64-17.0-20220603.1.test.el9ost.noarch.rpm
[2] https://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/17.0-RHEL-9/latest-RHOS-17-RHEL-9/compose/OpenStack/x86_64/os/Packages/diskimage-builder-3.21.2-0.20220514080754.2f06cbc.el9ost.noarch.rpm
[3] https://review.opendev.org/c/openstack/diskimage-builder/+/837251/2/diskimage_builder/elements/sysprep/finalise.d/99-clear-machine-id#20
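The empty versus 'uninitialized' distinction matters because, as I read machine-id(5), systemd treats a missing file or one containing "uninitialized" as a first boot, while an existing empty file is not a first boot, so a ConditionFirstBoot=yes unit is skipped. A minimal sketch for classifying a machine-id file follows; the function name machine_id_state and its output labels are made up for illustration.

```shell
# Sketch: report which state a machine-id file is in.
# machine_id_state is a hypothetical helper, not part of any package.
machine_id_state() {
    id_file=$1
    if [ ! -e "$id_file" ]; then
        echo "missing"          # treated as first boot
    elif [ ! -s "$id_file" ]; then
        echo "empty"            # not treated as first boot
    elif [ "$(cat "$id_file")" = "uninitialized" ]; then
        echo "uninitialized"    # treated as first boot
    else
        echo "initialized"      # a real machine-id is already present
    fi
}
```

Running machine_id_state against etc/machine-id inside the mounted image, before deployment, would show whether the first-boot unit can be expected to fire.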
I'm not sure if this is indicative of what ends up in rhosp-director-images-uefi-x86_64-17.0-20220603.1.test.el9ost.noarch.rpm, but this image build log[1] shows the expected diskimage-builder version. The logging level isn't high enough to show exactly what 99-clear-machine-id is running.

[1] https://download-node-02.eng.bos.redhat.com/brewroot/packages/overcloud-hardened-uefi-full-x86_64/17.0/20220603.1.test/data/logs/image/oz-indirection.log
The image build packaging fix is in, so I'm moving this back to MODIFIED
I checked the image in this package[1] and it looks correct. I deployed it and the BLS files got renamed (see the commands run below).

Some options for what is happening for you:
- sourcing an image other than the one from rhosp-director-images-uefi-x86_64-17.0-20220623.1.test.el9ost.noarch.rpm
- your deployment tooling is modifying the image with guestfish, and truncating /etc/machine-id

Can you provide deployment logs which show that neither of these is the case?

Also, running the following will show whether the rename script is actually run and what it does. If it doesn't run at all, then the image has been deployed with a modified /etc/machine-id:

cat /var/log/messages |grep reset-bls-entries

[root@localhost ~]# cat /etc/machine-id
a5112b2cf42e4fe4a088e758c65d2acc
[root@localhost ~]# ls /boot/loader/entries/
a5112b2cf42e4fe4a088e758c65d2acc-0-rescue.conf a5112b2cf42e4fe4a088e758c65d2acc-5.14.0-70.13.1.el9_0.x86_64.conf
[root@localhost log]# cat /var/log/messages |grep reset-bls-entries
Jun 29 00:09:24 localhost reset-bls-entries[880]: + pushd /boot/loader/entries
Jun 29 00:09:24 localhost reset-bls-entries[880]: /boot/loader/entries /
Jun 29 00:09:24 localhost reset-bls-entries[880]: + machine_id=a5112b2cf42e4fe4a088e758c65d2acc
Jun 29 00:09:24 localhost reset-bls-entries[880]: + for entry in *.conf
Jun 29 00:09:24 localhost reset-bls-entries[883]: ++ echo 1db7e42fdfe541e4b8e84a89422a4c5b-0-rescue.conf
Jun 29 00:09:24 localhost reset-bls-entries[884]: ++ sed 's/^[a-f0-9]*/a5112b2cf42e4fe4a088e758c65d2acc/'
Jun 29 00:09:24 localhost reset-bls-entries[880]: + new_entry=a5112b2cf42e4fe4a088e758c65d2acc-0-rescue.conf
Jun 29 00:09:24 localhost reset-bls-entries[880]: + [[ 1db7e42fdfe541e4b8e84a89422a4c5b-0-rescue.conf != a5112b2cf42e4fe4a088e758c65d2acc-0-rescue.conf ]]
Jun 29 00:09:24 localhost reset-bls-entries[880]: + echo 'renaming 1db7e42fdfe541e4b8e84a89422a4c5b-0-rescue.conf to a5112b2cf42e4fe4a088e758c65d2acc-0-rescue.conf for new machine-id'
Jun 29 00:09:24 localhost reset-bls-entries[880]: renaming 1db7e42fdfe541e4b8e84a89422a4c5b-0-rescue.conf to a5112b2cf42e4fe4a088e758c65d2acc-0-rescue.conf for new machine-id
Jun 29 00:09:24 localhost reset-bls-entries[880]: + mv 1db7e42fdfe541e4b8e84a89422a4c5b-0-rescue.conf a5112b2cf42e4fe4a088e758c65d2acc-0-rescue.conf
Jun 29 00:09:24 localhost reset-bls-entries[880]: + for entry in *.conf
Jun 29 00:09:24 localhost reset-bls-entries[887]: ++ echo 1db7e42fdfe541e4b8e84a89422a4c5b-5.14.0-70.13.1.el9_0.x86_64.conf
Jun 29 00:09:24 localhost reset-bls-entries[888]: ++ sed 's/^[a-f0-9]*/a5112b2cf42e4fe4a088e758c65d2acc/'
Jun 29 00:09:24 localhost reset-bls-entries[880]: + new_entry=a5112b2cf42e4fe4a088e758c65d2acc-5.14.0-70.13.1.el9_0.x86_64.conf
Jun 29 00:09:24 localhost reset-bls-entries[880]: + [[ 1db7e42fdfe541e4b8e84a89422a4c5b-5.14.0-70.13.1.el9_0.x86_64.conf != a5112b2cf42e4fe4a088e758c65d2acc-5.14.0-70.13.1.el9_0.x86_64.conf ]]
Jun 29 00:09:24 localhost reset-bls-entries[880]: + echo 'renaming 1db7e42fdfe541e4b8e84a89422a4c5b-5.14.0-70.13.1.el9_0.x86_64.conf to a5112b2cf42e4fe4a088e758c65d2acc-5.14.0-70.13.1.el9_0.x86_64.conf for new machine-id'
Jun 29 00:09:24 localhost reset-bls-entries[880]: renaming 1db7e42fdfe541e4b8e84a89422a4c5b-5.14.0-70.13.1.el9_0.x86_64.conf to a5112b2cf42e4fe4a088e758c65d2acc-5.14.0-70.13.1.el9_0.x86_64.conf for new machine-id
Jun 29 00:09:24 localhost reset-bls-entries[880]: + mv 1db7e42fdfe541e4b8e84a89422a4c5b-5.14.0-70.13.1.el9_0.x86_64.conf a5112b2cf42e4fe4a088e758c65d2acc-5.14.0-70.13.1.el9_0.x86_64.conf
Jun 29 00:09:24 localhost reset-bls-entries[880]: + popd
Jun 29 00:09:24 localhost reset-bls-entries[880]: /
Jun 29 00:09:25 localhost kernel: audit: type=1130 audit(1656475765.203:135): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=reset-bls-entries comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'

[1] https://download.eng.bos.redhat.com/rcm-guest/puddles/OpenStack/17.0-RHEL-9/RHOS-17.0-RHEL-9-20220623.n.1/compose/OpenStack/x86_64/os/Packages/rhosp-director-images-uefi-x86_64-17.0-20220623.1.test.el9ost.noarch.rpm
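For anyone reading the trace without the script at hand, the rename logic it shows can be approximated as below. This is reconstructed from the xtrace output only and is not the shipped script: the function name rename_bls_entries and its two arguments are assumptions added so the logic can be tried on a scratch directory, and it uses plain test syntax instead of the pushd and [[ ]] seen in the log.

```shell
# Sketch: rename BLS entries so the filename prefix matches a machine-id.
# rename_bls_entries is a hypothetical wrapper around the traced steps.
rename_bls_entries() {
    entries_dir=$1
    machine_id=$2
    for entry_path in "$entries_dir"/*.conf; do
        # Skip the unexpanded glob when the directory has no .conf files
        [ -e "$entry_path" ] || continue
        entry=$(basename "$entry_path")
        # Replace the leading run of hex characters with the machine-id
        new_entry=$(echo "$entry" | sed "s/^[a-f0-9]*/$machine_id/")
        if [ "$entry" != "$new_entry" ]; then
            echo "renaming $entry to $new_entry for new machine-id"
            mv "$entry_path" "$entries_dir/$new_entry"
        fi
    done
}
```

On a deployed node the equivalent call would be rename_bls_entries /boot/loader/entries "$(cat /etc/machine-id)". Entries that already carry the current machine-id are left untouched, so running it a second time does nothing.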
I found a couple of machine-id resets in infrared and posted a change https://review.gerrithub.io/c/redhat-openstack/infrared/+/540613
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543