Bug 2228928 - tripleo_ansible clobbers settings that ReaR saves into etc/rear/rescue.conf
Summary: tripleo_ansible clobbers settings that ReaR saves into etc/rear/rescue.conf
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: tripleo-ansible
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Fernando Díaz
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-03 16:21 UTC by Pavel Cahyna
Modified: 2023-08-07 12:07 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-27183 | 0 | None | None | None | 2023-08-03 16:22:21 UTC

Description Pavel Cahyna 2023-08-03 16:21:04 UTC
Description of problem:

While debugging the issue described in https://bugzilla.redhat.com/show_bug.cgi?id=2222899#c19, I found that the UEFI bootloader settings (USING_UEFI_BOOTLOADER= and UEFI_BOOTLOADER=) are not read from /etc/rear/rescue.conf during ReaR recovery. It turned out that the file contains this instead:

# This configuration file is generated automatically
# by the backup_and_restore role part of TripleO
# Ansible. Do not edit this file, all changes
# will be lost. Refer to the following URL for
# more information and implementation details:
# https://opendev.org/openstack/tripleo-ansible

BACKUP_PROG_OPTIONS+=( --anchored --xattrs-include='*.*' --xattrs )

and the file is present on the system that was backed up. That is not how this file is meant to be used. ReaR creates it in the rescue image with the settings it has detected, but if the file already exists on the system where the image is being produced, that copy overrides the one in the image. So this file should not be present on the original system (AFAICT it is even undocumented; the manual page documents only /etc/rear/local.conf and /etc/rear/site.conf). As a result, the settings that ReaR has autodetected are lost. Moreover, the line in /etc/rear/rescue.conf is redundant, because the very same setting is present in /etc/rear/local.conf:

BACKUP_PROG_OPTIONS+=( --anchored --xattrs-include='*.*' --xattrs )

so the tar arguments during file restore are then duplicated, as can be seen from the recovery debug log:
dd if=/var/tmp/rear.hfo8hrTDR3hREJr/outputfs/osp17-1r1-controller-2/backup.tar.gz | tar --block-number --totals --verbose --anchored --anchored --xattrs-include=*.* --xattrs --anchored --xattrs-include=*.* --xattrs --exclude-from=/var/tmp/rear.hfo8hrTDR3hREJr/tmp/restore-exclude-list.txt --gzip -C /mnt/local/ -x -f -
(fortunately, tar accepts this).
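The duplication can be reproduced outside ReaR with a minimal sketch. It assumes that ReaR sources local.conf and then rescue.conf into the same shell, and that BACKUP_PROG_OPTIONS starts out containing a single --anchored (inferred from the log above, not checked against default.conf). The temporary files are stand-ins for the two real configuration files.

```shell
#!/usr/bin/env bash
# Sketch only: temporary stand-ins for /etc/rear/local.conf and
# /etc/rear/rescue.conf, each carrying the same append line.
demo_dir=$(mktemp -d)
line="BACKUP_PROG_OPTIONS+=( --anchored --xattrs-include='*.*' --xattrs )"
printf '%s\n' "$line" > "$demo_dir/local.conf"
printf '%s\n' "$line" > "$demo_dir/rescue.conf"

# Assumption: the default value already holds one --anchored,
# as the leading "--anchored --anchored" in the debug log suggests.
BACKUP_PROG_OPTIONS=( --anchored )

# Assumption: both files are sourced once, local.conf first.
source "$demo_dir/local.conf"
source "$demo_dir/rescue.conf"

# Each "+=" appends three more elements, so the options appear twice.
printf '%s\n' "${BACKUP_PROG_OPTIONS[*]}"

rm -rf "$demo_dir"
```

With these assumptions the array ends up with seven elements, in the same order as the options in the tar command line above.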

Version-Release number of selected component (if applicable):

The problem has existed since 0.3.0 according to Git: https://opendev.org/openstack/tripleo-ansible/src/tag/0.3.0/tripleo_ansible/roles/backup-and-restore/templates/rescue.conf.j2

How reproducible:

Not sure, I don't have the environment myself.

Steps to Reproduce:
1. Back up and recover a controller on a UEFI machine

Actual results:

At the end of recovery, ReaR prints

WARNING:
For this system
RedHatEnterpriseServer/9 on Linux-i386 (based on Fedora/9/i386)
there is no code to install a boot loader on the recovered system
or the code that we have failed to install the boot loader correctly.
Please contribute appropriate code to the Relax-and-Recover project,
see http://relax-and-recover.org/development/
Take a look at the scripts in /usr/share/rear/finalize - for example
for PC architectures like x86 and x86_64 see the script
/usr/share/rear/finalize/Linux-i386/660_install_grub2.sh
and for POWER architectures like ppc64le see the script
/usr/share/rear/finalize/Linux-ppc64le/660_install_grub2.sh
---------------------------------------------------
|  IF YOU DO NOT INSTALL A BOOT LOADER MANUALLY,  |
|  THEN YOUR SYSTEM WILL NOT BE ABLE TO BOOT.     |
---------------------------------------------------
You can use 'chroot /mnt/local bash --login'
to change into the recovered system and
manually install a boot loader therein.


Expected results:

Recovery completes without warnings.

Additional info:

The problematic code is here:

https://opendev.org/openstack/tripleo-ansible/src/commit/e281ae7624774d71f22fbb993af967ed1ec08780/tripleo_ansible/roles/backup_and_restore/tasks/setup_rear.yml#L118

A customer has hit this in bz2222899.
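To check a rescue.conf for the symptom without running a full recovery, something like the following could be used. It is a rough sketch: the function name check_rescue_conf is hypothetical, and it assumes ReaR writes the two settings as plain VAR=value lines, which I have not verified against the ReaR scripts.

```shell
#!/usr/bin/env bash
# check_rescue_conf FILE (hypothetical helper, not part of ReaR):
# returns 0 if FILE contains both UEFI assignments, 1 otherwise.
check_rescue_conf() {
    local conf=$1 var missing=0
    if [ ! -f "$conf" ]; then
        echo "$conf: not present"
        return 1
    fi
    for var in USING_UEFI_BOOTLOADER UEFI_BOOTLOADER; do
        # Assumption: settings are stored as "VAR=value" at line start.
        if ! grep -q "^[[:space:]]*${var}=" "$conf"; then
            echo "$conf: no ${var}= assignment"
            missing=1
        fi
    done
    return "$missing"
}
```

Run as `check_rescue_conf /etc/rear/rescue.conf` from the rescue system shell; a file that contains only the TripleO-generated BACKUP_PROG_OPTIONS line would be reported as missing both settings. On the original system the file should simply not exist.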

