Bug 2056701 - Non platform install fails agentclusterinstall CRD is outdated in rhacm2.5
Summary: Non platform install fails agentclusterinstall CRD is outdated in rhacm2.5
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Advanced Cluster Management for Kubernetes
Classification: Red Hat
Component: Infrastructure Operator
Version: rhacm-2.5
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: rhacm-2.5
Assignee: Ori Amizur
QA Contact:
Derek
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-02-21 22:05 UTC by bjacot
Modified: 2022-06-09 02:11 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-09 02:11:31 UTC
Target Upstream Version:
Embargoed:
bjacot: Blocker-
bjacot: rhacm-2.5+



Links
System ID Private Priority Status Summary Last Updated
Github stolostron backlog issues 20095 0 None None None 2022-02-22 01:58:44 UTC
Red Hat Issue Tracker MGMTBUGSM-106 0 None None None 2022-02-21 22:07:57 UTC
Red Hat Product Errata RHSA-2022:4956 0 None None None 2022-06-09 02:11:45 UTC

Description bjacot 2022-02-21 22:05:57 UTC
Description of the problem:
The non-platform deployment fails to start. The agentclusterinstall validations are failing. I notice that userManagedNetworking: true does not show up in the .spec when describing the agentclusterinstall (aci).

The cluster's validations are pending for the user: The cluster has hosts that are not ready to install.,The API virtual IP is undefined and must be provided.,The API virtual IP is undefined.,The Ingress virtual IP is undefined and must be provided
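
The missing field can also be checked programmatically instead of reading the describe output by eye. The following is a minimal Python sketch, not part of the original report. It assumes the object has been exported as JSON (for example with `oc get agentclusterinstall <name> -o json`) and that the field lives at `spec.networking.userManagedNetworking`, which is an assumption about the CRD layout. The helper name is hypothetical.

```python
import json


def user_managed_networking(aci: dict) -> bool:
    """Return True only if spec.networking.userManagedNetworking is true.

    The field path is an assumption about the AgentClusterInstall layout.
    """
    spec = aci.get("spec") or {}
    networking = spec.get("networking") or {}
    return networking.get("userManagedNetworking") is True


# Hypothetical object in which the field is absent, as described in this report.
stored = json.loads('{"spec": {"networking": {"clusterNetwork": []}}}')
print(user_managed_networking(stored))
```

If the helper reports that the field is absent even though the applied manifest set it, one plausible explanation (consistent with the outdated-CRD diagnosis in this bug) is that the API server dropped a field the installed CRD schema does not know about.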

Release version:
rhacm 2.5
Operator snapshot version:
2.5.0-DOWNSTREAM-2022-02-18-19-59-32

OCP version:
hub 4.9
rhacm: 2.5.0-DOWNSTREAM-2022-02-18-19-59-32
spoke deployment: 2.9

Browser Info:

Steps to reproduce:
1. Deploy hub and rhacm operator 2.5.0-DOWNSTREAM-2022-02-18-19-59-32
2. Prepare the non-platform manifest: remove apiVIP: and ingressVIP: from the agentclusterinstall manifest
3. Add DNS entries for api, api-int, and the wildcard that point to the gateway
4. Add a load balancer to round-robin ports 443, 6443, and 22623 between the masters
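
Step 2 above can be sketched in Python as a transformation on the parsed manifest. This is an illustration under assumptions, not the reporter's tooling: it assumes `apiVIP` and `ingressVIP` sit directly under `spec`, and that the non-platform manifest sets `spec.networking.userManagedNetworking` to true, as the description suggests. The function name is hypothetical.

```python
import copy


def to_none_platform(aci: dict) -> dict:
    """Return a copy of an AgentClusterInstall manifest prepared for non-platform use."""
    out = copy.deepcopy(aci)
    spec = out.setdefault("spec", {})
    # Step 2 of the reproduction: drop the virtual IPs.
    spec.pop("apiVIP", None)
    spec.pop("ingressVIP", None)
    # Assumption: the non-platform manifest enables user-managed networking.
    spec.setdefault("networking", {})["userManagedNetworking"] = True
    return out


# Hypothetical input using documentation-range addresses.
manifest = {"spec": {"apiVIP": "192.0.2.10", "ingressVIP": "192.0.2.11"}}
print(to_none_platform(manifest))
```

The input is deep-copied so the original manifest stays available for comparison.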

Actual results:
The installation fails to start. Describing the agentclusterinstall shows:

Message:                     The cluster's validations are pending for user: The cluster has hosts that are not ready to install.,The API virtual IP is undefined and must be provided.,The API virtual IP is undefined.,The Ingress virtual IP is undefined and must be provided.,The Ingress virtual IP is undefined.,The Machine Network CIDR is undefined; the Machine Network CIDR can be defined by setting either the API or Ingress virtual IPs.,The Machine Network CIDR, API virtual IP, or Ingress virtual IP is undefined.,At least one of the CIDRs (Machine Network, Cluster Network, Service Network) is undefined.
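
The message above packs several validation failures into one string, separated by a period followed directly by a comma. A small Python sketch for splitting it into individual failures follows. The separator convention is inferred only from the message shown in this report, and the function name is hypothetical.

```python
import re


def split_validation_message(message: str) -> list:
    """Split a combined validation message into one sentence per failure."""
    # Drop the introductory prefix ("... pending for user: ") when present.
    head, sep, tail = message.partition(": ")
    body = tail if sep else head
    # Failures are joined by ".," with no space; commas inside a sentence
    # are followed by a space, so they are left untouched.
    parts = re.split(r"\.,", body.strip())
    return [p.strip().rstrip(".") + "." for p in parts if p.strip()]


sample = (
    "The cluster's validations are pending for user: "
    "The API virtual IP is undefined.,"
    "The Machine Network CIDR, API virtual IP, or Ingress virtual IP is undefined."
)
for failure in split_validation_message(sample):
    print("-", failure)
```

Listing the failures one per line makes it easier to see that every one of them concerns virtual IPs or network CIDRs, apart from the hosts-not-ready message.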


Expected results:
The installation to start.
Additional info:

Work around:

Apply the updated CRDs from the assisted-service repo:
https://github.com/openshift/assisted-service/tree/master/config/crd/bases

Comment 1 bjacot 2022-02-21 22:06:55 UTC
This prevents us from testing day2 non platform deployments.

Comment 2 bjacot 2022-02-21 22:56:27 UTC
A comment was shared that the agent CRD is old.

[kni@provisionhost-0-0 ~]$ oc describe crd agents.agent-install.openshift.io | grep "Stage Start Time"
                  Stage Start Time:
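
The same kind of check can be done against the CRD's JSON instead of grepping the describe output. The Python sketch below is an illustration, not something taken from this report. It assumes the CRD was exported with something like `oc get crd agents.agent-install.openshift.io -o json` and that the schema property behind "Stage Start Time" is named `stageStartTime`. Both the property name and the helper name are assumptions.

```python
def schema_has_property(node, name: str) -> bool:
    """Recursively search parsed CRD JSON for a schema property called `name`."""
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict) and name in props:
            return True
        return any(schema_has_property(value, name) for value in node.values())
    if isinstance(node, list):
        return any(schema_has_property(item, name) for item in node)
    return False


# Hypothetical, heavily trimmed CRD used only to exercise the helper.
crd = {
    "spec": {
        "versions": [
            {
                "schema": {
                    "openAPIV3Schema": {
                        "properties": {
                            "status": {
                                "properties": {
                                    "progress": {
                                        "properties": {"stageStartTime": {"type": "string"}}
                                    }
                                }
                            }
                        }
                    }
                }
            }
        ]
    }
}
print(schema_has_property(crd, "stageStartTime"))
```

Running the helper against both the installed CRD and the copy in the assisted-service repo would show whether a given field exists in only one of them.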

Comment 3 Michael Filanov 2022-02-22 13:09:23 UTC
Day2 none platform is not implemented yet.
WIP in https://issues.redhat.com/browse/MGMT-8879
@oamizur correct me if I'm wrong or if it's a different issue.

Comment 4 Ori Amizur 2022-02-22 13:35:20 UTC
Day2 none platform ZTP is implemented, but no installation scripts exist for this scenario.  So BMHs need to be created manually.
Also, installation scripts for none platform day1 exist for libvirt environment only.

Comment 6 bjacot 2022-02-22 14:10:40 UTC
removed workaround as i have a workaround.

Comment 7 Yuanyuan He 2022-03-01 08:44:19 UTC
@bjacot is the workaround acceptable so we can close or do you still expect a fix? Thanks!

Comment 8 bjacot 2022-03-01 17:07:13 UTC
@yuhe Yes, we still need a fix. These outdated CRDs are still a problem. I'd prefer to close once that is addressed.

Comment 9 bjacot 2022-03-02 17:41:07 UTC
@yuhe Are you tracking the outdated CRDs on another defect? In this scenario, the updated CRDs are the fix for this issue, so we can move it to ON_QA instead of closing it.

Comment 10 Yuanyuan He 2022-03-10 02:49:41 UTC
@bjacot I believe the installer team has been working on an automation tool to sync CRDs every day. I will leave it to them to confirm. Thanks!

Comment 11 Jakob 2022-03-11 15:58:19 UTC
I've just merged in the latest CRD changes from assisted-service-operator v0.2.13 that were published 12 hours ago. Does that cover the CRD updates needed to close this issue?

PR: https://github.com/stolostron/backplane-operator/pull/134

Comment 12 Michael Filanov 2022-03-16 08:01:06 UTC
Yes, it should. Can you please check again?

Comment 13 bjacot 2022-03-17 16:42:08 UTC
I will move to verified.  I did a spoke deployment yesterday without the CRD workaround on 2.5.0-DOWNSTREAM-2022-03-14-18-18-07.

Comment 17 errata-xmlrpc 2022-06-09 02:11:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Advanced Cluster Management 2.5 security updates, images, and bug fixes), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4956

