Bug 1967435 - noobaa-core-0 core image didn't get upgraded when upgrading OCS from 4.7 to 4.8
Summary: noobaa-core-0 core image didn't get upgraded when upgrading OCS from 4.7 to 4.8
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: Multi-Cloud Object Gateway
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Liran Mauda
QA Contact: Petr Balogh
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-06-03 07:18 UTC by Petr Balogh
Modified: 2023-08-09 16:49 UTC
CC: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-12 14:03:15 UTC
Embargoed:


Description Petr Balogh 2021-06-03 07:18:05 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
In this execution on an AWS IPI FIPS ENCRYPTION 3AZ RHCOS 3M 3W 3I cluster:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1003/consoleFull

I see that the noobaa-core pod image didn't get upgraded.

03:29:55 - MainThread - ocs_ci.ocs.ocs_upgrade - INFO - Old images which are going to be upgraded: ['registry.redhat.io/ocs4/cephcsi-rhel8@sha256:eb8922464a2f5b8a78f0b003d00f208fb319b462b866a18d1e393fffa84a5a34', 'registry.redhat.io/ocs4/mcg-core-rhel8@sha256:1496a3e823db8536380e01c58e39670e9fa2cc3d15229b2edc300acc56282c8c', 'registry.redhat.io/ocs4/mcg-rhel8-operator@sha256:5c9ebda7eb82db9b20d3cbac472e2cc284e099a099e2e8a8db11994e61e17e19',

In must gather:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011aife3c333-ua/j011aife3c333-ua_20210602T151755/logs/failed_testcase_ocs_logs_1622655138/test_upgrade_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-4cf9b04bc34bccb6fd801e42867308aee3dec18987d8507f2b58552d6d45dc19/namespaces/openshift-storage/oc_output/pods

you can see that the noobaa-core pod still has the old image, but I guess it is supposed to have one of these new ones:
03:29:55 - MainThread - ocs_ci.ocs.ocs_upgrade - INFO - New images for upgrade: ['quay.io/rhceph-dev/cephcsi@sha256:2296774ae82d85b93cef91dbfe6897a6a40dcc1cf3d9cff589b283313474f747', 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f', 'quay.io/rhceph-dev/mcg-operator@sha256:f73d206c0e206ca9d83bd90d0a9c37a580bf94aca5819cba431876ad8f549e6c',


This is defined in the noobaa-operator-54c886c97c-md4jd pod:
NOOBAA_CORE_IMAGE:        quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f
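The mismatch described here can be illustrated with a minimal sketch: compare the digest of the `NOOBAA_CORE_IMAGE` env var on the operator pod with the image the running noobaa-core pod actually uses. The JSON snippets below are hypothetical, modeled on `oc get pod ... -o json` output, with the digests taken from this report; this is not the actual ocs_ci check, just an illustration of the comparison.

```python
import json

# Hypothetical pod specs modeled on `oc get pod ... -o json` output;
# the image digests are the ones quoted in this report.
operator_pod = json.loads("""
{"spec": {"containers": [{"name": "noobaa-operator",
  "env": [{"name": "NOOBAA_CORE_IMAGE",
           "value": "quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f"}]}]}}
""")
core_pod = json.loads("""
{"spec": {"containers": [{"name": "core",
  "image": "registry.redhat.io/ocs4/mcg-core-rhel8@sha256:1496a3e823db8536380e01c58e39670e9fa2cc3d15229b2edc300acc56282c8c"}]}}
""")

def env_value(pod, container, name):
    """Return the value of an env var on the named container, or None."""
    for c in pod["spec"]["containers"]:
        if c["name"] == container:
            for e in c.get("env", []):
                if e["name"] == name:
                    return e["value"]
    return None

expected = env_value(operator_pod, "noobaa-operator", "NOOBAA_CORE_IMAGE")
actual = core_pod["spec"]["containers"][0]["image"]
# Compare by digest only, since the repository host differs between builds.
print(expected.split("@")[1] == actual.split("@")[1])  # False -> pod not upgraded
```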


Version of all relevant components (if applicable):
Upgrade from 4.7.0 to 4.8.0-406.ci


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

CSV:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011aife3c333-ua/j011aife3c333-ua_20210602T151755/logs/failed_testcase_ocs_logs_1622655138/test_upgrade_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-4cf9b04bc34bccb6fd801e42867308aee3dec18987d8507f2b58552d6d45dc19/namespaces/openshift-storage/oc_output/csv

NAME                         DISPLAY                       VERSION        REPLACES              PHASE
ocs-operator.v4.8.0-406.ci   OpenShift Container Storage   4.8.0-406.ci   ocs-operator.v4.7.0   Succeeded

It is already in 4.8, so I guess at this stage the noobaa-core pod should have the new image.


Is there any workaround available to the best of your knowledge?
Don't know.


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1


Can this issue be reproduced?
Not sure yet; this is the first job I started looking at today. If I find more occurrences, I will link them in a follow-up comment.

Can this issue be reproduced from the UI?
Haven't tried.

If this is a regression, please provide more details to justify this:
Yes

Steps to Reproduce:
1. Install OCS 4.7.0 on mentioned platform
2. Upgrade to 4.8.0 internal build
3. Check the noobaa-core pod image; it still has the old one
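The check in step 3 can be sketched roughly as follows (plain Python, not the actual ocs_ci code; the image digest is taken from the log lines above): given the set of old images slated for replacement and the images a pod currently runs, report any container still on an old image.

```python
# Old image digests slated for replacement, taken from the upgrade log above
# (only the mcg-core entry is shown here for brevity).
OLD_IMAGES = {
    "registry.redhat.io/ocs4/mcg-core-rhel8@sha256:1496a3e823db8536380e01c58e39670e9fa2cc3d15229b2edc300acc56282c8c",
}

def stale_containers(pod_images, old_images):
    """Map container name -> image for containers still using an old image."""
    return {name: img for name, img in pod_images.items() if img in old_images}

# In the failing run, the core container still carried the 4.7 image:
pod = {"core": "registry.redhat.io/ocs4/mcg-core-rhel8@sha256:1496a3e823db8536380e01c58e39670e9fa2cc3d15229b2edc300acc56282c8c"}
print(stale_containers(pod, OLD_IMAGES))  # non-empty -> upgrade did not land
```

An empty result means every container has moved off the old images, which is what the successful runs in the later comments show.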


Actual results:
The noobaa-core pod still has the old image.

Expected results:
The noobaa-core pod has the new image.


Additional info:
Must gather:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011aife3c333-ua/j011aife3c333-ua_20210602T151755/logs/failed_testcase_ocs_logs_1622655138/test_upgrade_ocs_logs/

Comment 5 Petr Balogh 2021-06-03 08:59:20 UTC
Trying to reproduce it here:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1011/console

This job will pause before teardown.

Comment 6 Petr Balogh 2021-06-04 08:11:13 UTC
In the execution:

https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1011/consoleFull

it wasn't reproduced, as we are now running tier1 tests after the upgrade, and from the console output I see:

18:44:11 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'core': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f'} were successfully upgraded in: noobaa-core-0!
18:44:11 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-db-pg-0 -n openshift-storage -o yaml
18:44:16 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'db': 'registry.redhat.io/rhel8/postgresql-12@sha256:03a1e02a1b3245f9aa0ddd3f7507b915a8f7387a1674969f6ef039a5d7fd8bf0', 'init': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f'} were successfully upgraded in: noobaa-db-pg-0!
18:44:16 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-endpoint-74c8949cf8-pwf8r -n openshift-storage -o yaml
18:44:21 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'endpoint': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f'} were successfully upgraded in: noobaa-endpoint-74c8949cf8-pwf8r!
18:44:21 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-864dbcf7bb-w922v -n openshift-storage -o yaml
18:44:27 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'noobaa_cor': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f', 'noobaa_db': 'registry.redhat.io/rhel8/postgresql-12@sha256:03a1e02a1b3245f9aa0ddd3f7507b915a8f7387a1674969f6ef039a5d7fd8bf0', 'noobaa-operator': 'quay.io/rhceph-dev/mcg-operator@sha256:f73d206c0e206ca9d83bd90d0a9c37a580bf94aca5819cba431876ad8f549e6c'} were successfully upgraded in: noobaa-operator-864dbcf7bb-w922v!

Comment 11 Petr Balogh 2021-07-12 13:18:12 UTC
We had more executions, for example this one:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1372/

From this production job:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-trigger-aws-ipi-fips-encryption-3az-rhcos-3m-3w-3i-upgrade-ocs-auto/18/

This one passed the upgrade stage and is now running tier1 after upgrade, which is still in progress.

So we haven't hit this issue again yet.

