Bug 1973018

Summary: Ironic rhcos downloader breaks image cache in upgrade process from 4.7 to 4.8
Product: OpenShift Container Platform
Reporter: Arda Guclu <aguclu>
Component: Bare Metal Hardware Provisioning
Assignee: Arda Guclu <aguclu>
Bare Metal Hardware Provisioning sub component: ironic
QA Contact: Ori Michaeli <omichael>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: urgent
CC: lshilin, omichael, shardy, tsedovic
Version: 4.8
Keywords: Triaged, UpcomingSprint
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 23:13:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1972572
Bug Blocks: 1974074

Description Arda Guclu 2021-06-17 06:08:38 UTC
Description of problem:
4.8 has an image caching mechanism and 4.7 does not. As a result, the two versions use different image paths, even though both paths ultimately refer to the same image. During the upgrade, the ironic rhcos downloader does not know that the image already exists, so it always re-downloads the same image.

This breaks the symlinks and also delays the completion of the upgrade process.
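
The intended behaviour can be illustrated with a minimal Python sketch. It is not the downloader's actual code: the function name `ensure_image`, the `cached` and `legacy` paths, and the `download` callback are all hypothetical, and the real 4.7 and 4.8 path layouts are not reproduced here. The idea is only that the downloader should look for an existing copy of the image before fetching it again, and link to that copy instead.

```python
import os
from pathlib import Path
from typing import Callable


def ensure_image(cached: Path, legacy: Path, download: Callable[[Path], None]) -> str:
    """Make the image available at `cached`, reusing `legacy` when it exists.

    Hypothetical helper: `cached` stands for the path the newer release
    expects, `legacy` for the path the older release used.
    Returns "cached", "reused", or "downloaded" to show which branch ran.
    """
    # is_file() follows symlinks, so a valid link counts as present.
    if cached.is_file():
        return "cached"

    cached.parent.mkdir(parents=True, exist_ok=True)

    # A leftover dangling link would otherwise get in the way below.
    if cached.is_symlink():
        cached.unlink()

    if legacy.is_file():
        # Create the link under a temporary name, then rename it into
        # place so readers never observe a half-created link.
        tmp = cached.with_name(cached.name + ".tmp")
        if tmp.is_symlink() or tmp.exists():
            tmp.unlink()
        tmp.symlink_to(legacy.resolve())
        os.replace(tmp, cached)
        return "reused"

    # No existing copy anywhere: fall back to the real download.
    download(cached)
    return "downloaded"
```

With a check of this kind, an upgraded cluster that already holds the image at the old path would take the "reused" branch and skip the duplicate download.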

Version-Release number of selected component (if applicable):
4.8+

How reproducible:
Run e2e-metal-ipi-upgrade from 4.7 to 4.8 and check the metal3-machine-os-downloader container logs.

Actual results:
Symlinks of images are broken.

Expected results:
Symlinks are correctly aligned.
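
One way to check the actual versus expected result is to scan the image directory for symlinks whose targets no longer exist. The following is a small sketch under assumptions: the helper name `broken_symlinks` is hypothetical, and the directory to scan is left to the caller because the image directory path is not stated in this report.

```python
import os
from pathlib import Path
from typing import List


def broken_symlinks(root: Path) -> List[Path]:
    """Return every symlink under `root` whose target does not exist."""
    broken = []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists the links themselves in dirnames/filenames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # exists() follows the link, so it is False for a dangling one.
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)
```

An empty result after the upgrade would correspond to the expected outcome above; any returned path points at a link that needs repair.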

Comment 1 Steven Hardy 2021-06-17 14:22:50 UTC
I renamed this and increased the severity - initially we/I thought the image cache ended up broken only for the duration of the duplicate image download, but in fact when referring to the "old" 4.7 URL it's permanently broken due to the corruption of the symlinks.

As such, I think this is a blocker: it means that all existing hosts will fail to adopt after upgrade, and scale-out will not work.

Comment 4 Ori Michaeli 2021-06-24 06:42:56 UTC
Verified with upgrade from 4.7.17 to 4.8.0-0.nightly-2021-06-22-022125

Comment 6 Zane Bitter 2021-06-28 13:54:05 UTC
*** Bug 1974074 has been marked as a duplicate of this bug. ***

Comment 8 errata-xmlrpc 2021-07-27 23:13:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438