Bug 2234948 - [4.13 backport] Update client-go library to avoid crash on OCP 4.14
Summary: [4.13 backport] Update client-go library to avoid crash on OCP 4.14
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ODF 4.13.3
Assignee: Subham Rai
QA Contact: Shivam Durgbuns
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-08-25 18:49 UTC by Travis Nielsen
Modified: 2023-09-27 14:22 UTC
CC List: 4 users

Fixed In Version: 4.13.3-2
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-27 14:22:44 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage rook pull 514 0 None open Bug 2234948: build: upgrade client-go to v0.26.4 2023-08-29 05:54:33 UTC
Red Hat Product Errata RHSA-2023:5376 0 None None None 2023-09-27 14:22:57 UTC

Description Travis Nielsen 2023-08-25 18:49:49 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

On OCP 4.14, the k8s.io/client-go library must be at version v0.26.4 or higher to avoid pods entering the CrashLoopBackOff state when aggregated discovery is enabled on Kubernetes 1.27+ environments, as also seen for RDR in bug #2228319.

Rook saw this failure upstream a few months back as seen in this issue: 
https://github.com/rook/rook/issues/12114

The upstream fix for Rook v1.11 is here: https://github.com/rook/rook/pull/12161

The fix is already in Rook for ODF 4.14, but needs to be backported to 4.13.
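
For reference, a minimal sketch of what the dependency bump looks like in go.mod. Only the client-go line is confirmed by the linked PR title; the Go version and the companion k8s.io bumps are assumptions that normally accompany a client-go bump.

// go.mod (sketch): see the linked PR for the exact change.
module github.com/rook/rook

go 1.19

require (
	k8s.io/api v0.26.4          // assumed companion bump
	k8s.io/apimachinery v0.26.4 // assumed companion bump
	k8s.io/client-go v0.26.4    // version required to tolerate aggregated discovery
)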


Version of all relevant components (if applicable):

ODF 4.13

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

Without this fix, Rook will fail when ODF 4.13 is run on OCP 4.14.

Is there any workaround available to the best of your knowledge?

No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

1

Is this issue reproducible?

It will be 100% reproducible if we don't get a fix out before OCP 4.14 is released.


Can this issue be reproduced from the UI?

NA

If this is a regression, please provide more details to justify this:

NA

Steps to Reproduce:
1. Install OCP 4.14 
2. Install ODF 4.13


Actual results:

The operator would crash

Expected results:

The operator should not crash

Additional info:

Comment 3 Travis Nielsen 2023-08-25 18:53:15 UTC
Subham, please look at backporting https://github.com/rook/rook/pull/12161 to downstream release-4.13

Comment 4 Travis Nielsen 2023-08-25 18:56:51 UTC
Or, if there are merge conflicts, perhaps a more scoped fix is possible, similar to the fix the ocs-operator made: https://github.com/red-hat-storage/ocs-operator/commit/a35a4f970894170a9dadd525e1b590b40b63985a
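
If the full library bump cannot be cleanly backported, one possible shape of a scoped fix is a go.mod replace directive that pins only client-go. This is a sketch under that assumption; whether it matches the actual ocs-operator commit is not confirmed here.

// go.mod fragment (sketch): pin client-go without bumping the other k8s.io modules.
replace k8s.io/client-go => k8s.io/client-go v0.26.4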

Comment 11 Shivam Durgbuns 2023-09-11 10:35:33 UTC
Moving to Verified, as the deployment completed without any pods in CrashLoopBackOff.
Job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/29202/
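
For anyone re-verifying by hand, a hypothetical client-go check for the failure mode. The namespace, kubeconfig handling, and the check itself are illustrative assumptions and not part of the QE job above.

// crashcheck.go (sketch): list pods in the storage namespace and report any
// container stuck in CrashLoopBackOff.
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	// Build a client from the local kubeconfig (assumed at the default path).
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// "openshift-storage" is the usual ODF namespace; adjust if different.
	pods, err := clientset.CoreV1().Pods("openshift-storage").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	crashing := 0
	for _, pod := range pods.Items {
		for _, cs := range pod.Status.ContainerStatuses {
			if cs.State.Waiting != nil && cs.State.Waiting.Reason == "CrashLoopBackOff" {
				fmt.Printf("%s/%s is in CrashLoopBackOff\n", pod.Name, cs.Name)
				crashing++
			}
		}
	}
	if crashing == 0 {
		fmt.Println("no pods in CrashLoopBackOff")
	}
}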

Comment 17 errata-xmlrpc 2023-09-27 14:22:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.13.3 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:5376

