Bug 2102498 - [MS-ODF UPGRADE] MS-ODF-clusters with previous odf version (4.10.2-3) and deployer version 2.0.2. does not upgraded to deployer v2.0.3
Summary: [MS-ODF UPGRADE] MS-ODF-clusters with previous odf version (4.10.2-3) and dep...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: odf-managed-service
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Dhruv Bindra
QA Contact: suchita
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-06-30 04:41 UTC by suchita
Modified: 2023-08-09 17:00 UTC

Fixed In Version: 2.0.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-02 05:17:49 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker MTSRE-590 0 None None None 2022-07-01 06:57:16 UTC
Red Hat Issue Tracker RHSTOR-3455 0 None None None 2022-07-01 06:57:16 UTC

Description suchita 2022-06-30 04:41:09 UTC
Description of problem:

Consumer and provider clusters running add-on deployer version v2.0.2 failed to upgrade to deployer version v2.0.3.

The upgrade failed on clusters running OCP 4.10.18 as well as OCP 4.8.43.

While preparing for the deployer upgrade from v2.0.2 to v2.0.3 on the staging QE add-on, we had 3 types of cluster setup:
Setup 1. Provider with OCP 4.10.18 + ODF add-on v2.0.2, and 2 consumers with OCP 4.10.43 and ODF consumer add-on v2.0.2
Setup 2. Provider with OCP 4.10.18 + ODF add-on v2.0.2, and 2 consumers with OCP 4.8.43 and ODF consumer add-on v2.0.2
Setup 3. Private link cluster: provider with OCP 4.10.18 + ODF add-on v2.0.2, and 2 consumers with OCP 4.10.43 and ODF consumer add-on v2.0.2

The upgrade failed on all of the above setups. A freshly deployed cluster has deployer version v2.0.3.

Version-Release number of selected component (if applicable):

++++++++++++++++++++++++++++++++++
Wed Jun 29 17:40:10 UTC 2022
Deployer
    Mediatype:   image/svg+xml
                Image:  quay.io/openshift/origin-kube-rbac-proxy:4.10.0
                Image:             quay.io/osd-addons/ocs-osd-deployer:2.0.2-3
                Image:             quay.io/osd-addons/ocs-osd-deployer:2.0.2-3
-----------
ODF version
"4.10.2-3"

========CSV ======
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.4                      NooBaa Operator               4.10.4            mcg-operator.v4.10.3                      Succeeded
ocs-operator.v4.10.2                      OpenShift Container Storage   4.10.2            ocs-operator.v4.10.1                      Succeeded
ocs-osd-deployer.v2.0.2                   OCS OSD Deployer              2.0.2             ocs-osd-deployer.v2.0.1                   Succeeded
odf-csi-addons-operator.v4.10.4           CSI Addons                    4.10.4            odf-csi-addons-operator.v4.10.2           Succeeded
odf-operator.v4.10.2                      OpenShift Data Foundation     4.10.2            odf-operator.v4.10.1                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.422-151be96   Route Monitor Operator        0.1.422-151be96   route-monitor-operator.v0.1.420-b65f47e   Succeeded
--------------
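As a quick way to read the deployer version out of a CSV listing like the one above, the following Python sketch scans the text for the ocs-osd-deployer row. It is only an illustration: the function name and the embedded sample text are assumptions, and the sample rows are copied from the output in this report.

```python
# Sample rows copied from the CSV listing in this report (shortened).
CSV_OUTPUT = """\
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.4                      NooBaa Operator               4.10.4            mcg-operator.v4.10.3                      Succeeded
ocs-osd-deployer.v2.0.2                   OCS OSD Deployer              2.0.2             ocs-osd-deployer.v2.0.1                   Succeeded
"""


def deployer_version(csv_output, prefix="ocs-osd-deployer.v"):
    """Return the version embedded in the deployer CSV name, or None if absent."""
    for line in csv_output.splitlines():
        fields = line.split()
        # The first whitespace-separated field is the CSV name.
        if fields and fields[0].startswith(prefix):
            return fields[0][len(prefix):]
    return None


EXPECTED = "2.0.3"  # target deployer version from this report
found = deployer_version(CSV_OUTPUT)
print(f"deployer CSV version: {found}, expected: {EXPECTED}, upgraded: {found == EXPECTED}")
```

Reading the version from the CSV name rather than the VERSION column avoids depending on column positions, since the DISPLAY column contains spaces.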



How reproducible:
6/6

Steps to Reproduce:
1. Create an appliance provider cluster with OCP 4.10 and ocs-provider addon
(rosa create service --type ocs-provider-qe --name $CLUSTER_NAME --size 20 --onboarding-validation-key $CONSUMER_KEY  --subnet-ids $SUBNET_IDS )

2. Create a ROSA consumer cluster with OCP 4.8 and the ocs-consumer-qe addon
3. Create a ROSA consumer cluster with OCP 4.10 and the ocs-consumer-qe addon
 
4. Initiate upgrade
Provider: https://gitlab.cee.redhat.com/service/managed-tenants/-/merge_requests/2559
Consumer: https://gitlab.cee.redhat.com/service/managed-tenants/-/merge_requests/2558



Actual results:
Consumer and provider clusters with OCP 4.10 or OCP 4.8, ODF 4.10, and deployer v2.0.2 failed to upgrade to deployer version v2.0.3.

Expected results:
All clusters, provider and consumer, should upgrade from deployer v2.0.2 to deployer v2.0.3.
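The expected result can be expressed as a simple check over per-cluster deployer versions. The sketch below is a minimal illustration, not the QE tooling: the cluster names are hypothetical, and the versions mirror the failure described in this report.

```python
def parse_version(text):
    """Turn 'v2.0.3' or '2.0.3' into a comparable tuple such as (2, 0, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def clusters_not_upgraded(observed, target):
    """Return the sorted names of clusters whose deployer version is below target."""
    wanted = parse_version(target)
    return sorted(
        name for name, version in observed.items()
        if parse_version(version) < wanted
    )


# Hypothetical cluster names; every cluster is still on v2.0.2, as reported.
observed = {
    "provider": "v2.0.2",
    "consumer-ocp4.8": "v2.0.2",
    "consumer-ocp4.10": "v2.0.2",
}
print(clusters_not_upgraded(observed, "v2.0.3"))
```

Comparing integer tuples instead of raw strings keeps the check correct for versions such as v2.0.10, which would sort before v2.0.3 as text.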

Additional info:


Merging of PRs around Wed Jun 29 13:03:00 UTC 2022
June 29 6:30 IST [ssotest01ue1] SelectorSyncSet addon-ocs-provider-qe applied
June 29 6:30 IST [hive-stage-01] SelectorSyncSet addon-ocs-provider-qe applied

June 29 6:33 IST [hives02ue1] SelectorSyncSet addon-ocs-consumer-qe applied
June 29 6:33 IST [ssotest01ue1] SelectorSyncSet addon-ocs-consumer-qe applied
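The merge time above is given in UTC while the SelectorSyncSet entries are in IST, so a conversion helps when lining them up. The sketch below assumes the IST times are evening values (18:30 and 18:33); the report does not state AM or PM, so that reading is an assumption. It uses a fixed UTC+05:30 offset for IST.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed offset of UTC+05:30 with no daylight saving.
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def ist_to_utc(year, month, day, hour, minute):
    """Convert an IST wall-clock time to an aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=IST)
    return local.astimezone(timezone.utc)


# Assumption: "6:30 IST" and "6:33 IST" mean 18:30 and 18:33.
for hour, minute in [(18, 30), (18, 33)]:
    print(ist_to_utc(2022, 6, 29, hour, minute).isoformat())
```

Under that assumption, the entries fall at 13:00 and 13:03 UTC, which is consistent with the merge time of around 13:03 UTC noted above.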



A few oc command outputs, traced with timestamps, are available here:
Provider: 

http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-29jp1/sgatfane-29jp1_20220629T033315/logs/upgrade_logs/nohup.out

Consumer OCP4.8 deployer V2.0.2: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-29jc1/sgatfane-29jc1_20220629T044715/logs/upgrade_logs/nohup.out

Consumer OCP4.10 deployer V2.0.2: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-29jc5/sgatfane-29jc5_20220629T050952/logs/upgrade_logs/nohup.out
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-29jc2/sgatfane-29jc2_20220629T044717/logs/upgrade_logs/nohup.out

Comment 1 suchita 2022-06-30 12:37:24 UTC
After discussion with the odf-ms engineering folks Dhruv and Leela: per engineering, the bundle image index and subscription are updated as expected for the deployer.

The issue is raised with MTSRE: https://issues.redhat.com/browse/MTSRE-590?filter=-2

Comment 2 suchita 2022-07-01 06:56:28 UTC
More discussion in the Slack thread: https://coreos.slack.com/archives/C01L46M0FQC/p1656569714574329
New bug raised: https://issues.redhat.com/browse/RHSTOR-3455

Comment 3 suchita 2022-07-19 06:04:24 UTC
The upgrade from v2.0.2 to v2.0.3 now completes successfully.
Upgrade results:
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12p3/sgatfane-p12p3_20220712T040139/logs/upgrade_test_report_1657612661.html
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12c3/sgatfane-p12c3_20220712T065127/logs/upgrade_test_report_1657615630.html
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12c4/sgatfane-p12c4_20220712T065124/logs/upgrade_test_report_1657615613.html
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12j/sgatfane-p12j_20220712T040156/logs/test_report_1657695621.html
Consumer1 (OCP4.8):
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12c3/sgatfane-p12c3_20220712T065127/logs/upgrade_test_report_1657615630.html

oc command output during upgrade:
Privatelink Cluster:
Provider: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12p3/sgatfane-p12p3_20220712T040139/logs/upgrade_logs/nohup.out
Consumer1: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12c3/sgatfane-p12c3_20220712T065127/logs/upgrade_logs/nohup.out
Consumer2: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12c4/sgatfane-p12c4_20220712T065124/logs/upgrade_logs/nohup.out

Non-Private Link appliance mode:
Provider: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-p12j/sgatfane-p12j_20220712T040156/logs/upgrade_logs/nohup.out
Consumer1: 
Consumer2:http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/sgatfane-j12c2/sgatfane-j12c2_20220712T051842/logs/upgrade_logs/nohup.out

Verified on:
    Mediatype:   image/svg+xml
                Image:  quay.io/openshift/origin-kube-rbac-proxy:4.10.0
                Image:             quay.io/osd-addons/ocs-osd-deployer:2.0.2-3
                Image:             quay.io/osd-addons/ocs-osd-deployer:2.0.2-3
========CSV ======
E0712 10:12:22.251346   51028 v2.go:105] read /dev/stdin: bad file descriptor
UID          PID    PPID  C STIME TTY          TIME CMD
1001050+       1       0  0 10:09 ?        00:00:00 /usr/bin/openshift-deploy
1001050+      85       0  0 10:12 ?        00:00:00 ps -ef
NAME                                      DISPLAY                       VERSION           REPLACES                                  PHASE
mcg-operator.v4.10.4                      NooBaa Operator               4.10.4            mcg-operator.v4.10.3                      Succeeded
ocs-operator.v4.10.2                      OpenShift Container Storage   4.10.2            ocs-operator.v4.10.1                      Succeeded
ocs-osd-deployer.v2.0.2                   OCS OSD Deployer              2.0.2             ocs-osd-deployer.v2.0.1                   Succeeded
odf-csi-addons-operator.v4.10.4           CSI Addons                    4.10.4            odf-csi-addons-operator.v4.10.2           Succeeded
odf-operator.v4.10.2                      OpenShift Data Foundation     4.10.2            odf-operator.v4.10.1                      Succeeded
ose-prometheus-operator.4.10.0            Prometheus Operator           4.10.0            ose-prometheus-operator.4.8.0             Succeeded
route-monitor-operator.v0.1.422-151be96   Route Monitor Operator        0.1.422-151be96   route-monitor-operator.v0.1.420-b65f47e   Succeeded
--------------
Verified on Clusters
Non-private link: OCP 4.10.22 provider; OCP 4.10.22 and OCP 4.8.46 consumers
Private link: OCP 4.10.22 provider; OCP 4.10.22 and OCP 4.8.46 consumers
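When checking captured output like the block above, the image tags can be pulled from the "Image:" lines while ignoring unrelated lines such as the stdin error. The sketch below is illustrative only: the function name is an assumption, and the sample text is copied from this comment.

```python
import re

# Sample lines copied from the captured output in this comment.
CAPTURED = """\
    Mediatype:   image/svg+xml
                Image:  quay.io/openshift/origin-kube-rbac-proxy:4.10.0
                Image:             quay.io/osd-addons/ocs-osd-deployer:2.0.2-3
                Image:             quay.io/osd-addons/ocs-osd-deployer:2.0.2-3
E0712 10:12:22.251346   51028 v2.go:105] read /dev/stdin: bad file descriptor
"""

# Matches lines of the form "Image: <repository>:<tag>"; the tag follows the last colon.
IMAGE_LINE = re.compile(r"^\s*Image:\s+(\S+):(\S+)\s*$")


def image_tags(text):
    """Map each image repository to the set of tags seen on 'Image:' lines."""
    tags = {}
    for line in text.splitlines():
        match = IMAGE_LINE.match(line)
        if match:
            repo, tag = match.groups()
            tags.setdefault(repo, set()).add(tag)
    return tags


for repo, tags in sorted(image_tags(CAPTURED).items()):
    print(repo, sorted(tags))
```

Collecting tags into a set per repository makes it easy to spot a cluster where two containers of the same image report different tags.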

