Bug 1844496 - [Containerized UPGRADES] Upgrade from 4.0 to 4.1 on RHEL 8 fails due to error on set_fact ceph_osd_image_repodigest_before_pulling
Summary: [Containerized UPGRADES] Upgrade from 4.0 to 4.1 on RHEL 8 fails due to error...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: z1
Target Release: 4.1
Assignee: Dimitri Savineau
QA Contact: Sunil Angadi
URL:
Whiteboard:
Depends On:
Blocks: 1816167
Reported: 2020-06-05 14:28 UTC by Mike Hackett
Modified: 2023-12-15 18:05 UTC (History)

Fixed In Version: ceph-ansible-4.0.24-1.el8cp, ceph-ansible-4.0.24-1.el7cp
Doc Type: Bug Fix
Doc Text:
.Upgrading a containerized cluster from 4.0 to 4.1 on {os-product} 8.1 no longer fails
Previously, when upgrading a {storage-product} cluster from 4.0 to 4.1 the upgrade could fail with an error on `set_fact ceph_osd_image_repodigest_before_pulling`. Due to an issue with how the container image tag was updated, `ceph-ansible` could fail. In {storage-product} 4.1z1 `ceph-ansible` has been updated so it no longer fails and upgrading works as expected.
Clone Of:
Environment:
Last Closed: 2020-07-30 15:05:11 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ceph ceph-ansible issues 5607 | 0 | None | closed | Upgrade fails because podman sees the old image | 2021-01-15 18:32:55 UTC
Github | ceph ceph-ansible pull 5405 | 0 | None | closed | container: inspect Id field instead of RepoDigests | 2021-01-15 18:32:55 UTC
Red Hat Issue Tracker | RHCEPH-7663 | 0 | None | None | None | 2023-10-06 20:31:42 UTC
Red Hat Product Errata | RHSA-2020:3003 | 0 | None | None | None | 2020-07-20 14:21:59 UTC

Description Mike Hackett 2020-06-05 14:28:18 UTC
Description of problem:
When upgrading a containerized (podman) cluster from RHCS 4.0 to RHCS 4.1 on RHEL 8, the upgrade fails when it reaches the OSDs, with the following error:

TASK [ceph-container-common : set_fact ceph_osd_image_repodigest_before_pulling] *************************************************************************************************************************************************************
Friday 05 June 2020  19:37:35 +0530 (0:00:00.070)       0:08:21.175 *********** 
fatal: [ceph4node1.example.com]: FAILED! => 
  msg: |-
    The task includes an option with an undefined variable. The error was: None has no element 0
  
    The error appears to be in '/usr/share/ceph-ansible/roles/ceph-container-common/tasks/fetch_image.yml': line 137, column 3, but may
    be elsewhere in the file depending on the exact syntax problem.
  
    The offending line appears to be:
  
  
    - name: set_fact ceph_osd_image_repodigest_before_pulling
      ^ here

The monitor containers upgrade successfully; the failure occurs only when the upgrade of the OSDs is attempted.


The cluster was installed on RHCS 4.0 three days ago following the Ceph documentation, when the 4-20 image was the latest. The upgrade was attempted following the documentation without deviation, and it failed at this step.

A customer has also encountered this issue, as has a field resource in his home lab, so this is the third reproduction reported so far.
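
The message "None has no element 0" is consistent with the task indexing the first element of a field that is null in the image inspect output. The following is a minimal Python sketch of that failure mode, not the actual ceph-ansible task. It assumes the task reads RepoDigests[0] from the inspect JSON; the inspect data, the helper names repodigest and image_id, and the Id value are all hypothetical. It also shows why reading the Id field, as the linked upstream pull request title describes, would avoid the problem.

```python
import json

# Hypothetical inspect output for an image whose RepoDigests field is null.
# The values are illustrative and are not taken from this report.
inspect_stdout = json.dumps([{"Id": "abc123", "RepoDigests": None}])


def repodigest(inspect_json):
    """Index RepoDigests[0]; raises TypeError when the field is None."""
    image = json.loads(inspect_json)[0]
    return image["RepoDigests"][0].split("@")[1]


def image_id(inspect_json):
    """Read the Id field, which does not depend on RepoDigests."""
    return json.loads(inspect_json)[0]["Id"]


try:
    repodigest(inspect_stdout)
    digest_lookup_failed = False
except TypeError:
    # Python's analogue of the "None has no element 0" failure.
    digest_lookup_failed = True

print("digest lookup failed:", digest_lookup_failed)
print("image id:", image_id(inspect_stdout))
```

Comparing the Id value before and after pulling the image would still detect an image change in this sketch, since it does not rely on a digest list being populated.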

Comment 17 errata-xmlrpc 2020-07-20 14:21:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3003

