Description of problem (please be as detailed as possible and provide log snippets):

In this execution (AWS IPI FIPS ENCRYPTION 3AZ RHCOS 3M 3W 3I cluster):
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1003/consoleFull
I see that the noobaa-core pod image did not get upgraded.

03:29:55 - MainThread - ocs_ci.ocs.ocs_upgrade - INFO - Old images which are going to be upgraded: ['registry.redhat.io/ocs4/cephcsi-rhel8@sha256:eb8922464a2f5b8a78f0b003d00f208fb319b462b866a18d1e393fffa84a5a34', 'registry.redhat.io/ocs4/mcg-core-rhel8@sha256:1496a3e823db8536380e01c58e39670e9fa2cc3d15229b2edc300acc56282c8c', 'registry.redhat.io/ocs4/mcg-rhel8-operator@sha256:5c9ebda7eb82db9b20d3cbac472e2cc284e099a099e2e8a8db11994e61e17e19',

In the must-gather:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011aife3c333-ua/j011aife3c333-ua_20210602T151755/logs/failed_testcase_ocs_logs_1622655138/test_upgrade_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-4cf9b04bc34bccb6fd801e42867308aee3dec18987d8507f2b58552d6d45dc19/namespaces/openshift-storage/oc_output/pods
you can see that the noobaa-core pod still has the old image, but I assume it is supposed to have one of these new ones:

03:29:55 - MainThread - ocs_ci.ocs.ocs_upgrade - INFO - New images for upgrade: ['quay.io/rhceph-dev/cephcsi@sha256:2296774ae82d85b93cef91dbfe6897a6a40dcc1cf3d9cff589b283313474f747', 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f', 'quay.io/rhceph-dev/mcg-operator@sha256:f73d206c0e206ca9d83bd90d0a9c37a580bf94aca5819cba431876ad8f549e6c',

The new core image is defined in the noobaa-operator-54c886c97c-md4jd pod:
NOOBAA_CORE_IMAGE: quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f

Version of all relevant components (if applicable):
Upgrade from 4.7.0 to 4.8.0-406.ci

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
The CSV:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011aife3c333-ua/j011aife3c333-ua_20210602T151755/logs/failed_testcase_ocs_logs_1622655138/test_upgrade_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-4cf9b04bc34bccb6fd801e42867308aee3dec18987d8507f2b58552d6d45dc19/namespaces/openshift-storage/oc_output/csv

NAME                         DISPLAY                       VERSION        REPLACES              PHASE
ocs-operator.v4.8.0-406.ci   OpenShift Container Storage   4.8.0-406.ci   ocs-operator.v4.7.0   Succeeded

is already at 4.8, so I guess at this stage noobaa-core should already have the new image.

Is there any workaround available to the best of your knowledge?
Don't know.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Not sure yet; this is the first job I started looking at today. If I find more occurrences I will link them in a follow-up comment.

Can this issue be reproduced from the UI?
Haven't tried.

If this is a regression, please provide more details to justify this:
Yes

Steps to Reproduce:
1. Install OCS 4.7.0 on the mentioned platform
2. Upgrade to the 4.8.0 internal build
3. The noobaa-core pod keeps the old image

Actual results:
The noobaa-core pod has the old image.

Expected results:
The noobaa-core pod has the new image.

Additional info:
Must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011aife3c333-ua/j011aife3c333-ua_20210602T151755/logs/failed_testcase_ocs_logs_1622655138/test_upgrade_ocs_logs/
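To make the check above easy to repeat on a live cluster, here is a minimal sketch (my own illustration, not ocs_ci code) that compares the image actually running in noobaa-core-0 with the NOOBAA_CORE_IMAGE env var on the noobaa-operator deployment. It assumes `oc` is already logged in to the cluster, that the core container is named 'core', and that the operator deployment is named noobaa-operator, as observed in this cluster:

# Sketch only: compare the image running in noobaa-core-0 against the
# NOOBAA_CORE_IMAGE env var configured on the noobaa-operator deployment.
# Assumes `oc` is logged in and the resources live in openshift-storage.
import subprocess

NAMESPACE = "openshift-storage"

def oc_jsonpath(resource, jsonpath):
    """Run `oc -n <ns> get <resource> -o jsonpath=...` and return its stdout."""
    cmd = ["oc", "-n", NAMESPACE, "get", resource, "-o", f"jsonpath={jsonpath}"]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout.strip()

# Image the core pod is actually running (container name 'core').
running = oc_jsonpath(
    "pod/noobaa-core-0",
    '{.spec.containers[?(@.name=="core")].image}',
)

# Image the noobaa operator is configured to deploy for the core pod.
expected = oc_jsonpath(
    "deployment/noobaa-operator",
    '{.spec.template.spec.containers[0].env[?(@.name=="NOOBAA_CORE_IMAGE")].value}',
)

print(f"running:  {running}")
print(f"expected: {expected}")
print("MATCH" if running == expected else "MISMATCH: noobaa-core was not upgraded")

On the failed cluster above this should report MISMATCH, since the operator already carries the new quay.io/rhceph-dev/mcg-core digest while noobaa-core-0 still runs the old registry.redhat.io image.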
Trying to reproduce it here:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1011/console
This job will pause before teardown.
In this execution:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1011/consoleFull
the issue was not reproduced. We are now running tier1 after the upgrade, and in the console output I see:

18:44:11 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'core': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f'} were successfully upgraded in: noobaa-core-0!
18:44:11 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-db-pg-0 -n openshift-storage -o yaml
18:44:16 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'db': 'registry.redhat.io/rhel8/postgresql-12@sha256:03a1e02a1b3245f9aa0ddd3f7507b915a8f7387a1674969f6ef039a5d7fd8bf0', 'init': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f'} were successfully upgraded in: noobaa-db-pg-0!
18:44:16 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-endpoint-74c8949cf8-pwf8r -n openshift-storage -o yaml
18:44:21 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'endpoint': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f'} were successfully upgraded in: noobaa-endpoint-74c8949cf8-pwf8r!
18:44:21 - MainThread - ocs_ci.utility.utils - INFO - Executing command: oc -n openshift-storage get Pod noobaa-operator-864dbcf7bb-w922v -n openshift-storage -o yaml
18:44:27 - MainThread - ocs_ci.ocs.ocp - INFO - All the images: {'noobaa_cor': 'quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f', 'noobaa_db': 'registry.redhat.io/rhel8/postgresql-12@sha256:03a1e02a1b3245f9aa0ddd3f7507b915a8f7387a1674969f6ef039a5d7fd8bf0', 'noobaa-operator': 'quay.io/rhceph-dev/mcg-operator@sha256:f73d206c0e206ca9d83bd90d0a9c37a580bf94aca5819cba431876ad8f549e6c'} were successfully upgraded in: noobaa-operator-864dbcf7bb-w922v!
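For reference, here is a similar minimal sketch (again just an illustration, not the ocs_ci implementation) of the per-pod verification the log above reflects, using the image digests from the log. The endpoint and operator pods are left out of the example because their names include a ReplicaSet hash that changes between runs:

# Sketch only, not the ocs_ci implementation: verify that each container of
# the listed pods runs the expected post-upgrade image. Digests are taken
# from the log lines above; assumes `oc` is logged in to the cluster.
import json
import subprocess

NAMESPACE = "openshift-storage"

MCG_CORE = "quay.io/rhceph-dev/mcg-core@sha256:68832b8afaf01e49f418e67cec1e3def3a86cd967f8ea6fa4728045484cfd69f"
POSTGRES = "registry.redhat.io/rhel8/postgresql-12@sha256:03a1e02a1b3245f9aa0ddd3f7507b915a8f7387a1674969f6ef039a5d7fd8bf0"

# Expected container -> image mapping per pod, mirroring the log output.
EXPECTED = {
    "noobaa-core-0": {"core": MCG_CORE},
    "noobaa-db-pg-0": {"db": POSTGRES, "init": MCG_CORE},
}

def pod_images(pod):
    """Return {container_name: image} for all containers (incl. init) of a pod."""
    out = subprocess.run(
        ["oc", "-n", NAMESPACE, "get", "pod", pod, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    spec = json.loads(out)["spec"]
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    return {c["name"]: c["image"] for c in containers}

for pod, expected in EXPECTED.items():
    actual = pod_images(pod)
    stale = {name: actual.get(name) for name, image in expected.items()
             if actual.get(name) != image}
    if stale:
        print(f"{pod}: NOT upgraded, stale containers: {stale}")
    else:
        print(f"{pod}: all expected images upgraded")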
We had more executions, for example this one:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/1372/
from this production job:
https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-trigger-aws-ipi-fips-encryption-3az-rhcos-3m-3w-3i-upgrade-ocs-auto/18/
It passed the upgrade stage and is now running tier1 after the upgrade, which is still in progress. So we have not hit this issue again yet.