Description of problem (please be as detailed as possible and provide log snippets):
Upgrade from OCS 4.6 to 4.7 fails to complete; the CSV hangs in the Installing phase because the noobaa-core pod is in CrashLoopBackOff.

Version of all relevant components (if applicable):
OCS upgrade from v4.6.0-160.ci to 4.7.0-163.ci
OCP 4.7.0-0.nightly-2020-11-22-204912

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
Yes, cannot upgrade.

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Need to find out.

Can this issue be reproduced from the UI?
Haven't tried.

If this is a regression, please provide more details to justify this:
This worked before.

Steps to Reproduce:
1. Install OCS 4.6 on top of OCP 4.7
2. Upgrade OCS to a 4.7 build
3. The upgrade does not finish

Actual results:
The CSV stays in the Installing phase and the noobaa-core pod is in CrashLoopBackOff.

Expected results:
The upgrade succeeds.

Additional info:
Job: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/14920/consoleFull
Must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j010vu1cs33-uan/j010vu1cs33-uan_20201123T080920/logs/failed_testcase_ocs_logs_1606123450/test_upgrade_ocs_logs/
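For reference, step 2 of the reproducer amounts to switching the OLM subscription to the 4.7 channel and watching the CSV. A minimal sketch only; the subscription name ocs-operator and the channel stable-4.7 are assumptions and may differ for CI builds:

    # switch the OCS subscription to the 4.7 channel (names assumed)
    oc patch subscription ocs-operator -n openshift-storage \
      --type merge -p '{"spec":{"channel":"stable-4.7"}}'
    # watch the CSV phase; in this bug it stays in Installing
    oc get csv -n openshift-storage -w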
The issue was reproduced in this run as well: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/14923/
Must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j011vu1cs33-uan/j011vu1cs33-uan_20201123T091220/logs/failed_testcase_ocs_logs_1606127218/test_upgrade_ocs_logs/
This is a 4.7 issue (the problem is in the new version) and will be fixed in 4.7, so moving it to the 4.7 target.
The issue still persists: https://ocs4-jenkins-csb-ocsqe.cloud.paas.psi.redhat.com/job/qe-deploy-ocs-cluster/85/console
noobaa-core-0   0/1   CrashLoopBackOff   10   29m
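To pull the crash details out of the pod, something like the following works (a sketch, assuming the default openshift-storage namespace and the single-container noobaa-core-0 pod shown above):

    # events and restart reasons for the crash-looping pod
    oc describe pod noobaa-core-0 -n openshift-storage
    # logs from the previously crashed container instance
    oc logs noobaa-core-0 -n openshift-storage --previous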
Ran a verification job here: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-deploy-ocs-cluster/15583/ Upgrade from 4.6 RC 7 to 4.7.0-192.ci, which should contain the fix.
Running a new upgrade job here: https://ocs4-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-upgrade-ocs-auto-nightly/2/console
Running 2 verification jobs here:
vSphere, OCP 4.6: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/217/
AWS, OCP 4.7: https://ocs4-jenkins-csb-ocsqe.apps.ocp4.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/218/
The upgrade passed, but we then saw some failures in our infra because of https://bugzilla.redhat.com/show_bug.cgi?id=1919967, so I am marking this BZ as verified. Petr
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041