Bug 1911535
| Summary: | OSP 13 undercloud recovery wasn't possible because XFS filesystems were created with reflink=1 | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Vagner Farias <vfarias> |
| Component: | rear | Assignee: | Pavel Cahyna <pcahyna> |
| Status: | CLOSED NOTABUG | QA Contact: | CS System Management SST QE <rhel-cs-system-management-subsystem-qe> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 8.2 | CC: | elicohen, jbadiapa, joflynn, kthakre, ovasik, pcahyna, pveiga |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-03-02 13:41:10 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1916851, 1921668 | ||
Description
Vagner Farias
2020-12-29 22:52:03 UTC
Hello, is RHEL 8 involved in any way? From your description it looks like RHEL 7 is involved. Please specify the complete name-version-release of the package, as printed by the rpm utility (you gave only rear-2.4, which does not contain the release part that I could use to identify the exact package build). I don't see how you can set -m reflink on RHEL 7, or how ReaR could have created a filesystem with reflink on RHEL 7, because mkfs.xfs does not support this. From further conversation, it looks like you have booted a RHEL 7 kernel on a RHEL 8 system; is that the case?

What does "During controller leapp" mean? Does leapp refer to the RHEL 7 to RHEL 8 upgrade process?

Leapp[1] is a process from the framework for upgrading from OSP 13 to OSP 16.1, so the entire environment goes from RHEL 7 to RHEL 8:

"The long-life Red Hat OpenStack Platform upgrade also requires an upgrade from Red Hat Enterprise Linux 7 to Red Hat Enterprise Linux 8. Red Hat Enterprise Linux 7 includes a tool named leapp, which performs the upgrade to Red Hat Enterprise Linux 8. Both the undercloud and overcloud use a separate process for performing the operating system upgrade."

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/framework_for_upgrades_13_to_16.1/index#leapp-upgrade-usage-in-red-hat-openstack-platform

@Juan identified the root cause, which I could confirm.
The timeline was as follows:
- day 1
. rear mkbackup generated rescue ISO image and system backup
. undercloud upgrade from 13 to 16.1 (i.e. RHEL 7.9 to RHEL 8.2)
. several problems to upgrade overcloud
- day 2
. attempts to fix the overcloud upgrade problems were unsuccessful.
. download undercloud rescue ISO image from NFS server to consultant laptop
. boot from undercloud rescue ISO image and recover from backup
However, there's a cron job that runs every day at 1:30am and generates a new rescue image if the disk layout has changed.
~~~
# cat /etc/cron.d/rear
30 1 * * * root /usr/sbin/rear checklayout || /usr/sbin/rear mkrescue
~~~
This means that between day 1 and day 2 the rescue ISO image was regenerated while RHEL 8.2 was running, thus generating a RHEL 8.2 image.
~~~
Dec 24 01:30:01 os2001 CROND[176657]: (root) CMD (/usr/sbin/rear checklayout || /usr/sbin/rear mkrescue)
~~~
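The cron entry relies on shell short-circuiting: "rear mkrescue" runs only when "rear checklayout" exits non-zero, which is how it signals that the saved layout no longer matches the running system. A minimal sketch of that control flow, using hypothetical stand-in functions (fake_checklayout, fake_mkrescue, nightly_job) rather than the real ReaR commands:

```shell
# Hypothetical stand-ins for "rear checklayout" and "rear mkrescue".
# fake_checklayout returns the exit status passed as $1:
# 0 means "layout unchanged", non-zero means "layout changed".
fake_checklayout() { return "$1"; }
fake_mkrescue() { echo "rescue image regenerated"; }

# Same shape as the cron entry: the right-hand side runs only on failure.
nightly_job() { fake_checklayout "$1" || fake_mkrescue; }

unchanged_output=$(nightly_job 0)
changed_output=$(nightly_job 1)
echo "unchanged: '${unchanged_output}'"
echo "changed:   '${changed_output}'"
```

In the case described here, the second branch is the one that appears to have been taken on Dec 24, after the upgrade to RHEL 8.2.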
The log file for "rear mkrescue" has already been overwritten, so we can only observe "checklayout" being executed in /var/log/rear/rear-director.log. Still, the symptom indicates that mkrescue was indeed executed, and with the current ReaR configuration it overwrites the existing rescue image because the ISO name is fixed:
~~~
# grep ^ISO_PREFIX /etc/rear/local.conf
ISO_PREFIX=director
~~~
It seems there's nothing wrong with ReaR. Instead, the OpenStack backup documentation[1] could be improved to suggest a configuration that prevents the image from being overwritten. I did one test adding variables to ISO_PREFIX and it worked, but I'm not sure this is the best way of doing it. See my configuration file and the output below:
~~~
# cat /etc/rear/local.conf
export DATETIME=$(date +%Y%m%d-%H%M)
export RHELRELEASE=$(lsb_release -rs)
OUTPUT=ISO
OUTPUT_URL=nfs://172.16.110.1/home/export/ctl_plane_backups
ISO_PREFIX=director-${RHELRELEASE}-${DATETIME}
BACKUP=NETFS
BACKUP_PROG_COMPRESS_OPTIONS=( --gzip )
BACKUP_PROG_COMPRESS_SUFFIX=".gz"
BACKUP_PROG_EXCLUDE=( '/tmp/*' '/data/*' )
BACKUP_URL=nfs://172.16.110.1/home/export/ctl_plane_backups
BACKUP_PROG_EXCLUDE=("${BACKUP_PROG_EXCLUDE[@]}" '/media' '/var/tmp' '/var/crash')
BACKUP_PROG_OPTIONS+=( --anchored --xattrs-include='*.*' --xattrs )
--
[root@tesla director]# pwd
/home/export/ctl_plane_backups/director
[root@tesla director]# ls -l
total 226440
-rw-------. 1 root root 231116800 Jan 5 12:34 director-7.7-20210105-1030.iso
-rw-------. 1 root root 202 Jan 5 12:34 README
-rw-------. 1 root root 746270 Jan 5 12:34 rear-director.log
-rw-------. 1 root root 273 Jan 5 12:34 VERSION
~~~
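The naming scheme above can be tried without running ReaR. The sketch below builds the ISO file name from a release string and a timestamp. Since lsb_release is not always installed, it reads VERSION_ID from an os-release style file instead; that substitution is an assumption of this sketch and not what the configuration above uses. The helper names os_version and iso_name are made up for illustration.

```shell
# Hypothetical helpers, not part of ReaR.

# Print VERSION_ID from an os-release style file given as $1.
# Sourced in a subshell so its variables do not leak into the caller.
os_version() {
    ( . "$1" && printf '%s\n' "$VERSION_ID" )
}

# Build the ISO file name: director-<release>-<timestamp>.iso
iso_name() {
    printf 'director-%s-%s.iso\n' "$1" "$2"
}

# Example with a sample file; on a real system pass /etc/os-release.
sample=$(mktemp)
printf 'NAME="Example Linux"\nVERSION_ID="8.2"\n' > "$sample"

release=$(os_version "$sample")
stamp=$(date +%Y%m%d-%H%M)
name=$(iso_name "$release" "$stamp")
echo "$name"
rm -f "$sample"
```

Because the timestamp has minute resolution, two runs in different minutes produce different names, so a later mkrescue would no longer replace the earlier ISO.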
I can't tell if this will always work, so more comprehensive testing should be done.
[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/undercloud_and_control_plane_back_up_and_restore/install-and-configure-rear-osp-ctlplane-br#create-the-configuration-files-osp-ctlplane-br
Adding a couple of thoughts/tests:

- The configuration in Vagner's comment#5 only names the ISO according to the date; the filesystem data (the backup) will still be overwritten. We can add "BACKUP_URL=iso:///backup/" to local.conf so that the filesystem data is included in the ISO image. This may make the ISO image too big. Based on comment#5:

    export DATETIME=$(date +%Y%m%d-%H%M)
    export RHELRELEASE=$(lsb_release -rs)
    OUTPUT=ISO
    BACKUP=NETFS
    ISO_PREFIX=director-${RHELRELEASE}-${DATETIME}
    BACKUP_URL=iso:///backup/ # this is the line added.

- Another option, BACKUP_PROG_ARCHIVE, will create a backup-DATETIME file:

    BACKUP_PROG_ARCHIVE=backup-${DATETIME}

- Going a little further, the lines below will create the directory /DIR_BACKUP/${HOSTNAME}-${DATETIME}/ for every backup. There is a drawback here: the backup.tar.gz (system backup) needs to be specified manually at restore time, because ReaR will use the restoration date instead of the backup date. This can be done by setting [export DATETIME="20201231-2250"] in /etc/rear/local.conf before restoring the filesystem.

    NETFS_PREFIX=${HOSTNAME}-${DATETIME}
    OUTPUT_PREFIX=${HOSTNAME}-${DATETIME}

I had the impression that mkrescue wouldn't generate the backup again, but only the rescue image. At least this was my experience. Regardless, I do like the idea of having versioned backups. On another engagement I was renaming the files myself to ensure I had more than one version of the backup.

Concerning versioned backups, please see bz1896239. ReaR has some possibilities for that, but they are not very well documented.

Hello, is there any problem with ReaR that is not covered by other bugs, like bz1896239 for versioned backups? If not, I would like to close the bug.
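On the NETFS_PREFIX drawback discussed earlier in this thread: the directory name is computed from DATETIME every time the configuration is read, so at restore time it points at a directory named after the restore date. A minimal sketch of one way to pin the timestamp, assuming a hypothetical PINNED_DATETIME environment variable (not a ReaR setting) that is set by hand before a restore; whether this behaves well inside /etc/rear/local.conf has not been tested here:

```shell
# PINNED_DATETIME is a hypothetical variable for this sketch, not a ReaR option.
# When it is set, reuse it; otherwise take the current time.
resolve_datetime() {
    if [ -n "${PINNED_DATETIME:-}" ]; then
        printf '%s\n' "$PINNED_DATETIME"
    else
        date +%Y%m%d-%H%M
    fi
}

# "director" stands in for ${HOSTNAME} in this example.

# Backup run: no pin, so the prefix carries the current timestamp.
unset PINNED_DATETIME
backup_prefix="director-$(resolve_datetime)"

# Restore run: pin the timestamp of the backup to be restored.
PINNED_DATETIME="20201231-2250"
restore_prefix="director-$(resolve_datetime)"

echo "backup:  $backup_prefix"
echo "restore: $restore_prefix"
```

This is the same idea as editing the DATETIME line by hand before the restore, only expressed as a default that can be overridden.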