Bug 1993366

Summary: Hive Operator CrashLoopBackOff when deploying ACM with latest downstream 2.4
Product: Red Hat Advanced Cluster Management for Kubernetes Reporter: Chad Crum <ccrum>
Component: Installer Assignee: Nathan Weatherly <nweather>
Status: CLOSED ERRATA QA Contact: Chad Crum <ccrum>
Severity: high Docs Contact: Christopher Dawson <cdawson>
Priority: unspecified    
Version: rhacm-2.4 CC: bjacot, ccrum, daliu, sasha, vboulos
Target Milestone: --- Keywords: TestBlocker
Target Release: rhacm-2.4 Flags: ming: rhacm-2.4+
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-11 18:33:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Chad Crum 2021-08-12 20:37:31 UTC
Description of the problem:
Deploying RHACM from the latest 2.4 downstream snapshot sends the Hive operator into a crash loop, and the deployment does not finish.

Operator snapshot version:
2.4.0-DOWNSTREAM-2021-08-12-05-54-55

OCP version:
4.9.0-0.nightly-2021-08-07-175228

Environment: IPv4 connected baremetal (libvirt)

Browser Info:
N/A - Deploying via CRD

Steps to reproduce:
1. Create catalogsource on OCP hub cluster from ACM downstream snapshot
2. Create operatorgroup and subscription for ACM 2.4
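
The two steps above can be sketched as OLM manifests. The following Python sketch builds the three objects as plain dictionaries and prints them as a JSON list that could be piped to `oc apply -f -`. The catalog image, catalog name, target namespace, package name, and channel are placeholders assumed for illustration; none of them are taken from this report.

```python
import json

# Every value below is an illustrative assumption, not data from this bug.
CATALOG_IMAGE = "registry.example.com/acm-d/acm-custom-registry:SNAPSHOT_TAG"  # hypothetical
CATALOG_NAME = "acm-custom-registry"        # hypothetical catalog name
CATALOG_NAMESPACE = "openshift-marketplace"  # common location for catalog sources
TARGET_NAMESPACE = "open-cluster-management"  # assumed install namespace
PACKAGE_NAME = "advanced-cluster-management"  # assumed package name
CHANNEL = "release-2.4"                       # assumed channel


def build_manifests(image, namespace, channel):
    """Return CatalogSource, OperatorGroup, and Subscription as dicts."""
    catalog_source = {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "CatalogSource",
        "metadata": {"name": CATALOG_NAME, "namespace": CATALOG_NAMESPACE},
        "spec": {"sourceType": "grpc", "image": image},
    }
    operator_group = {
        "apiVersion": "operators.coreos.com/v1",
        "kind": "OperatorGroup",
        "metadata": {"name": "acm-operator-group", "namespace": namespace},
        "spec": {"targetNamespaces": [namespace]},
    }
    subscription = {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": "acm-operator-subscription", "namespace": namespace},
        "spec": {
            "name": PACKAGE_NAME,
            "channel": channel,
            # The subscription must point back at the catalog created in step 1.
            "source": catalog_source["metadata"]["name"],
            "sourceNamespace": catalog_source["metadata"]["namespace"],
        },
    }
    return [catalog_source, operator_group, subscription]


if __name__ == "__main__":
    items = build_manifests(CATALOG_IMAGE, TARGET_NAMESPACE, CHANNEL)
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```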

Actual results:
Hive operator pod is stuck in CrashLoopBackOff:
hive-operator-657f9f86b9-wjvnl                                    0/1     CrashLoopBackOff   60         4h59m

Expected results:
All pods deploy without error 
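
A check for this condition can be scripted. The sketch below assumes the default column layout of `oc get pods` (NAME, READY, STATUS, then further columns), as in the line quoted under Actual results, and returns every pod whose status is neither Running nor Completed. The function name is hypothetical, and the check looks only at the STATUS column, not at readiness.

```python
def unhealthy_pods(table_text):
    """Return (name, status) pairs for pods not in Running or Completed."""
    bad = []
    for line in table_text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        # Only the first three columns are needed; later columns are ignored.
        name, _ready, status = fields[:3]
        if status not in ("Running", "Completed"):
            bad.append((name, status))
    return bad


# The pod line quoted in this report, used as sample input.
sample = (
    "hive-operator-657f9f86b9-wjvnl"
    "                                    0/1     CrashLoopBackOff   60         4h59m"
)
print(unhealthy_pods(sample))
# prints [('hive-operator-657f9f86b9-wjvnl', 'CrashLoopBackOff')]
```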

Additional info:

Comment 2 Alexander Chuzhoy 2021-08-16 13:29:20 UTC
Reproducing with 2.4.0-DOWNSTREAM-2021-08-16-03-49-17

Comment 3 Chad Crum 2021-08-17 16:05:33 UTC
FYI, I tried downstream build 2.4.0-DOWNSTREAM-2021-08-17-07-07-34 today. More pods are being deployed (I think it is getting farther with the MultiClusterHub), but besides hive-operator, more pods are now in CrashLoopBackOff.

https://bugzilla.redhat.com/show_bug.cgi?id=1994652 for reference.

Comment 4 daliu 2021-08-18 04:38:33 UTC
ACM currently still uses Hive v1.1.8, whose CRD version is v1beta1. CRD v1beta1 is deprecated and no longer served in OCP 4.9,
so we will upgrade Hive to v1.1.14 to fix this issue.
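
As a rough illustration of the incompatibility described above, the following sketch scans already-parsed manifests for CustomResourceDefinitions declared with the v1beta1 API version. The comment does not name the API group; the sketch assumes it is `apiextensions.k8s.io/v1beta1`, which Kubernetes 1.22 stopped serving. The function name and the example CRD names are hypothetical.

```python
# Assumed API group/version; the comment above only says "v1beta1".
REMOVED_CRD_API = "apiextensions.k8s.io/v1beta1"


def crds_using_removed_api(manifests):
    """Return names of CRD manifests that still declare the v1beta1 API."""
    return [
        m.get("metadata", {}).get("name", "<unnamed>")
        for m in manifests
        if m.get("kind") == "CustomResourceDefinition"
        and m.get("apiVersion") == REMOVED_CRD_API
    ]


# Hypothetical manifests, for illustration only.
example = [
    {
        "apiVersion": "apiextensions.k8s.io/v1beta1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": "widgets.example.com"},
    },
    {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": "gadgets.example.com"},
    },
]
print(crds_using_removed_api(example))
# prints ['widgets.example.com']
```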

Comment 5 Chad Crum 2021-08-26 11:26:41 UTC
Verified working - All pods in a Running status on latest downstream snapshot

OCP Hub: 4.9.0-0.nightly-2021-08-25-185404
Snapshot: 2.4.0-DOWNSTREAM-2021-08-25-19-01-28

Comment 9 errata-xmlrpc 2021-11-11 18:33:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.4 images and security updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4618