Bug 1928509

Summary: OCS upgrade failed with noobaa-db-pg-0 in CrashLoopBackOff
Product: [Red Hat Storage] Red Hat OpenShift Container Storage
Component: Multi-Cloud Object Gateway
Version: 4.7
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
Reporter: Aviad Polak <apolak>
Assignee: Nimrod Becker <nbecker>
QA Contact: Aviad Polak <apolak>
Docs Contact:
CC: dzaken, etamir, nberry, ocs-bugs, sgatfane
Keywords: Automation, Upgrades
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Last Closed: 2021-02-15 09:23:58 UTC
Type: Bug

Description Aviad Polak 2021-02-14 14:41:32 UTC
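Description of problem (please be as detailed as possible and provide log
snippets):
The automated OCS upgrade test (upgrade-ocs-auto job, AWS, 3 workers) failed: the upgrade did not complete and noobaa-db-pg-0 is in Init:CrashLoopBackOff.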


Version of all relevant components (if applicable):
OCS: upgrade from ocs-operator.v4.6.2 --> 4.7.0-262.ci
OCP: 4.7.0-0.nightly-2021-02-13-071408


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
yes


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

If this is a regression, please provide more details to justify this:
Found by an automated regression test.

Steps to Reproduce:
1. Trigger a build with the same parameters: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/view/Upgrade-OCS/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-upgrade-ocs-auto/3/


Actual results:
The OCS operator failed to upgrade. The new CSV is stuck in the Installing phase:
NAME                         DISPLAY                       VERSION        REPLACES              PHASE
ocs-operator.v4.7.0-262.ci   OpenShift Container Storage   4.7.0-262.ci   ocs-operator.v4.6.2   Installing

Pod status (noobaa-db-pg-0 is in Init:CrashLoopBackOff and ocs-operator is 0/1 ready):
NAME                                                              READY   STATUS                  RESTARTS   AGE   IP             NODE                                         NOMINATED NODE   READINESS GATES
csi-cephfsplugin-6qd2j                                            3/3     Running                 0          45m   10.0.184.161   ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-nhqzs                                            3/3     Running                 0          44m   10.0.147.198   ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-provisioner-7d47bf8989-k9brh                     6/6     Running                 0          45m   10.131.0.29    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
csi-cephfsplugin-provisioner-7d47bf8989-mmk77                     6/6     Running                 0          45m   10.128.2.80    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
csi-cephfsplugin-qpzt4                                            3/3     Running                 0          44m   10.0.216.47    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
csi-rbdplugin-4sh2g                                               3/3     Running                 0          44m   10.0.147.198   ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-k6nqp                                               3/3     Running                 0          44m   10.0.184.161   ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-9b866fdbd-5kwfm                         6/6     Running                 0          45m   10.131.0.28    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-provisioner-9b866fdbd-5lkbd                         6/6     Running                 0          45m   10.129.2.25    ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
csi-rbdplugin-ptwlv                                               3/3     Running                 0          45m   10.0.216.47    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
ip-10-0-147-198us-east-2computeinternal-debug                     1/1     Running                 0          72s   10.0.147.198   ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
ip-10-0-184-161us-east-2computeinternal-debug                     1/1     Running                 0          72s   10.0.184.161   ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
ip-10-0-216-47us-east-2computeinternal-debug                      1/1     Running                 0          72s   10.0.216.47    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
must-gather-s95rv-helper                                          1/1     Running                 0          72s   10.131.0.36    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
noobaa-db-0                                                       1/1     Running                 0          44m   10.128.2.82    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
noobaa-db-pg-0                                                    0/1     Init:CrashLoopBackOff   6          45m   10.128.2.85    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
noobaa-operator-976b44c4f-gcq26                                   1/1     Running                 0          45m   10.128.2.76    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
ocs-metrics-exporter-6747d5b4cb-ffk4d                             1/1     Running                 0          45m   10.128.2.77    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
ocs-operator-694bb7ff4b-lqm8c                                     0/1     Running                 0          45m   10.128.2.78    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-crashcollector-ip-10-0-147-198-75766bc8b7-sxv7x         1/1     Running                 0          44m   10.129.2.26    ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
rook-ceph-crashcollector-ip-10-0-184-161-dd44f6d78-5cvdp          1/1     Running                 0          44m   10.131.0.30    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
rook-ceph-crashcollector-ip-10-0-216-47-758b59f68f-zqgr6          1/1     Running                 0          44m   10.128.2.84    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-79c6746djm9df   2/2     Running                 0          44m   10.128.2.83    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-dff7f9bfxbsxc   2/2     Running                 0          43m   10.131.0.32    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
rook-ceph-mgr-a-7bdd446c48-5g6nh                                  2/2     Running                 0          42m   10.129.2.28    ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
rook-ceph-mon-a-674ffd5b44-9xk4c                                  2/2     Running                 0          43m   10.128.2.86    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-mon-b-b66dfbdc4-kwmmj                                   2/2     Running                 0          44m   10.129.2.27    ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
rook-ceph-mon-c-57c5bb7454-rdrsg                                  2/2     Running                 0          43m   10.131.0.31    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
rook-ceph-operator-955b7554f-5b9t4                                1/1     Running                 0          45m   10.128.2.75    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-osd-0-55744c7585-mq2d5                                  2/2     Running                 0          42m   10.131.0.33    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
rook-ceph-osd-1-7585dc9767-jkksj                                  2/2     Running                 0          20m   10.129.2.36    ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
rook-ceph-osd-2-697454b779-4l9qh                                  2/2     Running                 0          31m   10.128.2.87    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-osd-prepare-ocs-deviceset-0-data-0-htzvr-4x9sz          0/1     Completed               0          76m   10.131.0.20    ip-10-0-184-161.us-east-2.compute.internal   <none>           <none>
rook-ceph-osd-prepare-ocs-deviceset-1-data-0-rzndh-6hjwm          0/1     Completed               0          76m   10.128.2.30    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
rook-ceph-osd-prepare-ocs-deviceset-2-data-0-mgrld-gwxsh          0/1     Completed               0          76m   10.129.2.20    ip-10-0-147-198.us-east-2.compute.internal   <none>           <none>
rook-ceph-tools-555b7b49c4-kwxg5                                  1/1     Running                 0          45m   10.0.216.47    ip-10-0-216-47.us-east-2.compute.internal    <none>           <none>
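
For a listing this long, the unhealthy pods are easy to miss. The following is a minimal sketch, not part of the original report, that filters pasted `oc get pods` output down to pods that are not fully ready or not Running. The function name `unhealthy_pods` and the `SAMPLE` excerpt are illustrative assumptions; the sample rows are copied from the listing above with the trailing columns dropped.

```python
def unhealthy_pods(listing):
    """Return (name, ready, status) for pods that look unhealthy.

    Pods in the Completed state are skipped, because a finished job pod
    legitimately reports 0/1 ready.
    """
    flagged = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        name, ready, status = fields[:3]
        ready_count, sep, total = ready.partition("/")
        if not sep:
            # Not a READY column of the form "x/y"; ignore the line.
            continue
        if status == "Completed":
            continue
        if ready_count != total or status != "Running":
            flagged.append((name, ready, status))
    return flagged


# Excerpt of the pod listing from this report (first five columns only).
SAMPLE = """\
NAME                                                      READY   STATUS                  RESTARTS   AGE
noobaa-db-0                                               1/1     Running                 0          44m
noobaa-db-pg-0                                            0/1     Init:CrashLoopBackOff   6          45m
noobaa-operator-976b44c4f-gcq26                           1/1     Running                 0          45m
ocs-operator-694bb7ff4b-lqm8c                             0/1     Running                 0          45m
rook-ceph-osd-prepare-ocs-deviceset-0-data-0-htzvr-4x9sz  0/1     Completed               0          76m
"""

if __name__ == "__main__":
    for name, ready, status in unhealthy_pods(SAMPLE):
        print(name, ready, status)
```

On the excerpt, this flags only noobaa-db-pg-0 (Init:CrashLoopBackOff) and ocs-operator-694bb7ff4b-lqm8c (Running but 0/1 ready), matching the two pods called out in this report.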


Expected results:
The upgrade completes successfully.

Additional info:
build info: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/view/Upgrade-OCS/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-upgrade-ocs-auto/3/
noobaa operator log: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j003ai3c33-ua/j003ai3c33-ua_20210214T084920/logs/failed_testcase_ocs_logs_1613295694/test_upgrade_ocs_logs/ocs_must_gather/quay-io-rhceph-dev-ocs-must-gather-sha256-0ffefc2dc74915ab29cf8d312aec957edf028e4e605c2a2ce8d74d7f295fa53f/noobaa/logs/openshift-storage/noobaa-operator-976b44c4f-gcq26.log
must gathers: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j003ai3c33-ua/j003ai3c33-ua_20210214T084920/logs/failed_testcase_ocs_logs_1613295694/test_upgrade_ocs_logs/