Bug 1971308

Summary: [4.8.0] AI KubeAPI AgentClusterInstall confusing "Validated" condition about VIP not matching machine network
Product: OpenShift Container Platform Reporter: Ronnie Lazar <alazar>
Component: assisted-installer Assignee: Ori Amizur <oamizur>
assisted-installer sub component: Deployment Operator QA Contact: bjacot
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: urgent CC: alazar, aos-bugs, oamizur, otuchfel, trwest
Version: 4.8 Keywords: Triaged
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: KNI-EDGE-JUKE-4.8 AI-Team-Core
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1970134 Environment:
Last Closed: 2021-07-27 23:12:53 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1970134    
Bug Blocks:    

Description Ronnie Lazar 2021-06-13 14:16:01 UTC
+++ This bug was initially created as a clone of Bug #1970134 +++

Description of problem:
On a multinode cluster AI installation, when the VIPs are set in an AgentClusterInstall before the hosts boot up, the Machine Network CIDR is still undetermined. This leads to many confusing validation error messages:

api vip 192.168.111.202 does not belong to the Machine CIDR or is already in use.,ingress vip 192.168.111.203 does not belong to the Machine CIDR or is already in use.,The Machine Network CIDR is undefined; the Machine Network CIDR can be defined by setting either the API or Ingress virtual IPs.,The Cluster Machine CIDR  is different than the calculated CIDR .

In reality, the only "problem" is that the hosts have not booted up yet.

Version-Release number of selected component (if applicable):
assisted-service 481b03a775007b927dd0ed108f22f98b7f76db9d

How reproducible:
100%

Steps to Reproduce:
1. Multi node agentclusterinstall
2. Set VIPs as required
3. Error is shown

Actual results:
Many confusing validation messages

Expected results:
Should have better UX - not sure how, but it should be made clearer to the user that they just need to wait for the hosts to boot up

Additional info:

--- Additional comment from mfilanov on 20210613T12:06:46

We have all the logic and the validations in the backend; kube-api is just a translation layer that is not aware of which validations are failing.
Because it's a validation issue, I think it can be easily resolved in the validation logic.

`clusterValidator` handles a specific host, so it can probably store state. When running the validation, it could store a specific error and then use it in `printIsApiVipValid`.
In this case, the validation could check whether the cluster has registered hosts and give a better reply.

@oamizur @alazar what do you think? It will require some changes in the logic, but I think this is not the only case that will need different types of errors.

--- Additional comment from alazar on 20210613T13:22:17

@oamizur Maybe when we don't have hosts, these validation errors should not be displayed, or should show a "pending" status?

--- Additional comment from oamizur on 20210613T14:09:00

@alazar basically this is right. Validations that need inputs that are still pending should not fail; they should just be pending. All the above validations should be pending if there are no hosts with inventories.
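
A minimal sketch of the pending-instead-of-failing idea, written in Python purely for illustration (assisted-service itself is written in Go). The names `ValidationStatus`, `Host`, and `validate_api_vip` are hypothetical and are not taken from the assisted-service code base. The only assumption is that a VIP check cannot be evaluated until at least one host has reported an inventory.

```python
import ipaddress
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class ValidationStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    PENDING = "pending"


@dataclass
class Host:
    # Hypothetical host record; inventory stays None until the host boots and reports in.
    name: str
    inventory: Optional[dict] = None


def validate_api_vip(
    vip: str, machine_cidr: Optional[str], hosts: List[Host]
) -> Tuple[ValidationStatus, str]:
    """Return (status, message) for the API VIP check."""
    # No host has an inventory yet: the check cannot be evaluated, so it is pending.
    if not any(h.inventory is not None for h in hosts):
        return ValidationStatus.PENDING, "Hosts have not been discovered yet"
    # Hosts exist but the CIDR is still unknown: also pending, not a failure.
    if not machine_cidr:
        return ValidationStatus.PENDING, "The Machine Network CIDR is undefined"
    network = ipaddress.ip_network(machine_cidr, strict=False)
    if ipaddress.ip_address(vip) in network:
        return ValidationStatus.SUCCESS, f"api vip {vip} belongs to the Machine CIDR"
    return ValidationStatus.FAILURE, f"api vip {vip} does not belong to the Machine CIDR"
```

With no discovered hosts, `validate_api_vip("192.168.111.202", None, [])` reports a pending status rather than the misleading "does not belong to the Machine CIDR" failure quoted in the description.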

Comment 2 Trey West 2021-07-06 19:50:12 UTC
I am seeing this message now: 

The cluster's validations are pending for user: Clusters must have exactly 3 dedicated masters. Please either add hosts, or disable the worker host,Hosts have not been discovered yet,Hosts have not been discovered yet,Hosts have not been discovered yet,Hosts have not been discovered yet,At least one of the CIDRs (Machine Network, Cluster Network, Service Network) is undefined.

Not sure why the discovery message is being displayed 4 times. 

@oamizur does this look acceptable? Is there any way to cut back on how many times that message is displayed?
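
One possible way to collapse the repeated text, sketched in Python for illustration only. The helper `join_unique_messages` is hypothetical; it is not part of assisted-service and is not the change that was shipped. It assumes the condition message is built by joining per-validation messages with commas, as the output above suggests.

```python
from typing import Iterable


def join_unique_messages(messages: Iterable[str]) -> str:
    """Join validation messages with commas, keeping only the first copy of each."""
    # dict.fromkeys drops duplicates while preserving first-seen order.
    return ",".join(dict.fromkeys(messages))


pending = [
    "Hosts have not been discovered yet",
    "Hosts have not been discovered yet",
    "Hosts have not been discovered yet",
    "Hosts have not been discovered yet",
    "At least one of the CIDRs (Machine Network, Cluster Network, Service Network) is undefined.",
]
summary = join_unique_messages(pending)
```

Here `summary` contains the discovery message once, followed by the CIDR message.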

Comment 4 Ori Amizur 2021-07-15 13:31:26 UTC
Usually pending validations are not displayed (by UI).  In general it means that these validations cannot be evaluated until these issues are fixed.

Comment 7 errata-xmlrpc 2021-07-27 23:12:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438