+++ This bug was initially created as a clone of Bug #1970134 +++

Description of problem:
On a multi-node cluster AI installation, when the VIPs are set in an AgentClusterInstall before the hosts boot up, the machine network CIDR is still undetermined. This leads to a lot of confusing validation error messages:

api vip 192.168.111.202 does not belong to the Machine CIDR or is already in use.,ingress vip 192.168.111.203 does not belong to the Machine CIDR or is already in use.,The Machine Network CIDR is undefined; the Machine Network CIDR can be defined by setting either the API or Ingress virtual IPs.,The Cluster Machine CIDR is different than the calculated CIDR .

In reality, the only real "problem" is that the hosts simply haven't booted up yet.

Version-Release number of selected component (if applicable):
assisted-service 481b03a775007b927dd0ed108f22f98b7f76db9d

How reproducible:
100%

Steps to Reproduce:
1. Multi-node AgentClusterInstall
2. Set VIPs as required
3. Error is shown

Actual results:
Lots of confusing validations

Expected results:
Should have better UX - not sure how, but it should be made clearer to the user that they just need to wait for the hosts to boot up

Additional info:

--- Additional comment from mfilanov on 20210613T12:06:46

We have all the logic and the validations in the backend; kube-api is just a translation layer that is not aware of the validations that are failing. Because it's a validations issue, I think it can be easily resolved in the validation logic. `clusterValidator` handles a specific host, so it can probably store state. Maybe when running the validation it can store a specific error and then use it in `printIsApiVipValid`, so in this case the validation can check whether the cluster has registered hosts and give a better reply.
@oamizur @alazar what do you think? It will require some changes in the logic, but I think this is not the only case that will require different types of errors.

--- Additional comment from alazar on 20210613T13:22:17

@oamizur Maybe when we don't have hosts, these validation errors should not be displayed, or should show a "pending" status?

--- Additional comment from oamizur on 20210613T14:09:00

@alazar Basically this is right. Validations that need some pending inputs should not fail but just be pending. All the above validations should be pending if there are no hosts with inventories.
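
To illustrate the "pending instead of failing" idea, here is a minimal Go sketch. It is not the assisted-service implementation: `ValidationStatus`, `Host`, `hostsWithInventory`, and `validateVIP` are hypothetical names invented for this example, and the message strings are only modeled on the ones quoted in this bug. The sketch also assumes that an undefined machine CIDR should count as pending rather than as a failure.

```go
package main

import (
	"fmt"
	"net"
)

// ValidationStatus is a hypothetical three-state validation result.
type ValidationStatus string

const (
	StatusSuccess ValidationStatus = "success"
	StatusFailure ValidationStatus = "failure"
	StatusPending ValidationStatus = "pending"
)

// Host is a simplified stand-in for a registered host (assumption).
// Inventory stays empty until the host has booted and reported in.
type Host struct {
	Inventory string
}

// hostsWithInventory counts the hosts that have already reported an inventory.
func hostsWithInventory(hosts []Host) int {
	n := 0
	for _, h := range hosts {
		if h.Inventory != "" {
			n++
		}
	}
	return n
}

// validateVIP reports pending while required inputs are still missing,
// and only fails once the VIP can actually be checked against the CIDR.
func validateVIP(name, vip, machineCIDR string, hosts []Host) (ValidationStatus, string) {
	if hostsWithInventory(hosts) == 0 {
		return StatusPending, "Hosts have not been discovered yet"
	}
	if machineCIDR == "" {
		// Assumption: an undefined CIDR is a missing input, not an error.
		return StatusPending, "The Machine Network CIDR is undefined"
	}
	_, network, err := net.ParseCIDR(machineCIDR)
	if err != nil {
		return StatusFailure, fmt.Sprintf("invalid Machine Network CIDR %q", machineCIDR)
	}
	ip := net.ParseIP(vip)
	if ip == nil || !network.Contains(ip) {
		return StatusFailure, fmt.Sprintf("%s %s does not belong to the Machine CIDR %s", name, vip, machineCIDR)
	}
	return StatusSuccess, fmt.Sprintf("%s %s belongs to the Machine CIDR", name, vip)
}
```

With no discovered hosts, `validateVIP("api vip", "192.168.111.202", "", nil)` returns the pending status instead of the misleading "does not belong to the Machine CIDR" failure. The same call with a host that has an inventory and the CIDR `192.168.111.0/24` returns success.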
I am seeing this message now:

The cluster's validations are pending for user: Clusters must have exactly 3 dedicated masters. Please either add hosts, or disable the worker host,Hosts have not been discovered yet,Hosts have not been discovered yet,Hosts have not been discovered yet,Hosts have not been discovered yet,At least one of the CIDRs (Machine Network, Cluster Network, Service Network) is undefined.

Not sure why the discovery message is being displayed 4 times. @oamizur does this look acceptable? Is there any way to cut back on how many times that message is displayed?
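
One way the repeated text could be cut down is to collapse identical validation messages before joining them into the status string. This assumes the duplicates come from several validations reporting the same message. The Go sketch below is illustrative only: `dedupeMessages` and `summarize` are hypothetical helpers, not functions from assisted-service.

```go
package main

import "strings"

// dedupeMessages removes repeated messages while keeping first-seen order.
func dedupeMessages(msgs []string) []string {
	seen := make(map[string]struct{}, len(msgs))
	out := make([]string, 0, len(msgs))
	for _, m := range msgs {
		if _, ok := seen[m]; ok {
			continue // already reported once
		}
		seen[m] = struct{}{}
		out = append(out, m)
	}
	return out
}

// summarize joins the unique messages with commas, mirroring the
// comma-separated format of the status message quoted in this bug.
func summarize(msgs []string) string {
	return strings.Join(dedupeMessages(msgs), ",")
}
```

Applied to the messages above, the four copies of "Hosts have not been discovered yet" would collapse into one entry, and the other messages would keep their original order.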
Usually pending validations are not displayed (by UI). In general it means that these validations cannot be evaluated until these issues are fixed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438