Bug 1970134 - [master] AI KubeAPI AgentClusterInstall confusing "Validated" condition about VIP not matching machine network
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: assisted-installer
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Assignee: Ori Amizur
QA Contact: Omri Hochman
URL:
Whiteboard: KNI-EDGE-JUKE-4.8 AI-Team-Core
Depends On:
Blocks: 1971308
Reported: 2021-06-09 21:06 UTC by Omer Tuchfeld
Modified: 2022-08-28 08:45 UTC

Fixed In Version: OCP-Metal-v1.0.23.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1971308 (view as bug list)
Environment:
Last Closed: 2022-08-28 08:45:59 UTC
Target Upstream Version:
Embargoed:


Description Omer Tuchfeld 2021-06-09 21:06:34 UTC
Description of problem:
On a multi-node cluster AI installation, when the VIPs are set in an AgentClusterInstall before the hosts boot up, the machine network CIDR is still undetermined. This leads to several confusing validation error messages:

api vip 192.168.111.202 does not belong to the Machine CIDR or is already in use.,ingress vip 192.168.111.203 does not belong to the Machine CIDR or is already in use.,The Machine Network CIDR is undefined; the Machine Network CIDR can be defined by setting either the API or Ingress virtual IPs.,The Cluster Machine CIDR  is different than the calculated CIDR .

While in reality, the only real "problem" is that the hosts simply didn't boot up yet.

Version-Release number of selected component (if applicable):
assisted-service 481b03a775007b927dd0ed108f22f98b7f76db9d

How reproducible:
100%

Steps to Reproduce:
1. Multi node agentclusterinstall
2. Set VIPs as required
3. Error is shown

Actual results:
Lots of confusing validations

Expected results:
Should have better UX - not sure how, but it should be made clearer to the user that they just need to wait for the hosts to boot up

Additional info:

Comment 1 Michael Filanov 2021-06-13 12:06:46 UTC
We have all the logic and the validations in the backend; kube-api is just a translation layer that is not aware of which validations are failing.
Because it's a validations issue, I think it can be easily resolved in the validations logic.

`clusterValidator` handles a specific host, so it can probably store state. When running the validation, it could store a specific error and then use it in `printIsApiVipValid`.
In this case the validation could check whether the cluster has registered hosts and give a better reply.
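
A minimal Go sketch of this idea, for illustration only. The names `clusterValidator` and `printIsApiVipValid` come from the comment above, but their real fields and signatures in assisted-service are not shown in this bug, so everything below (the struct fields, the `vipFailure` type, the message for the no-hosts case) is a hypothetical stand-in:

```go
package main

import "fmt"

// vipFailure is a hypothetical enum recording why the API VIP check did not pass.
type vipFailure int

const (
	vipOK vipFailure = iota
	vipNoHosts
	vipOutsideCIDR
)

// clusterValidator is a stand-in for the real validator type; the fields are assumptions.
type clusterValidator struct {
	registeredHosts int        // number of hosts registered to the cluster
	vipInCIDR       bool       // result of the existing CIDR membership check
	lastVIPFailure  vipFailure // state stored by the check, read by the printer
}

// isApiVipValid runs the check and stores the specific reason for the outcome.
func (v *clusterValidator) isApiVipValid() bool {
	switch {
	case v.registeredHosts == 0:
		v.lastVIPFailure = vipNoHosts
	case !v.vipInCIDR:
		v.lastVIPFailure = vipOutsideCIDR
	default:
		v.lastVIPFailure = vipOK
	}
	return v.lastVIPFailure == vipOK
}

// printIsApiVipValid builds the user-facing message from the stored reason.
func (v *clusterValidator) printIsApiVipValid(vip string) string {
	switch v.lastVIPFailure {
	case vipNoHosts:
		// Hypothetical wording for the case reported in this bug.
		return fmt.Sprintf("api vip %s cannot be validated until hosts register", vip)
	case vipOutsideCIDR:
		return fmt.Sprintf("api vip %s does not belong to the Machine CIDR or is already in use", vip)
	default:
		return fmt.Sprintf("api vip %s belongs to the Machine CIDR and is not in use", vip)
	}
}

func main() {
	v := &clusterValidator{registeredHosts: 0}
	v.isApiVipValid()
	fmt.Println(v.printIsApiVipValid("192.168.111.202"))
}
```

The point of the sketch is only that the check and the printer share state, so the message can distinguish "no hosts yet" from a real mismatch.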

@oamizur @alazar what do you think? It will require some changes in the logic, but I think this is not the only case that will require different types of errors.

Comment 2 Ronnie Lazar 2021-06-13 13:22:17 UTC
@oamizur Maybe in case we don't have hosts, these validation errors should not be displayed, or show "pending" status?

Comment 3 Ori Amizur 2021-06-13 14:09:00 UTC
@alazar basically this is right. Validations that need inputs that are still pending should not fail but just be pending. All the above validations should be pending if there are no hosts with inventories.
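
As a rough illustration of the "pending" behaviour described here, the following Go sketch uses a three-state validation result. It is not the assisted-service implementation: the `Status`, `Host`, and `Cluster` types, the `validateAPIVip` function, and the message texts are all assumptions made for this example. Only the standard library `net` package is used for the CIDR check.

```go
package main

import (
	"fmt"
	"net"
)

// Status is a hypothetical three-state validation result.
type Status string

const (
	Success Status = "success"
	Failure Status = "failure"
	Pending Status = "pending"
)

// Host and Cluster are simplified stand-ins for the real models.
type Host struct {
	HasInventory bool
}

type Cluster struct {
	APIVip      string
	MachineCIDR string
	Hosts       []Host
}

// hasInventories reports whether at least one host has reported an inventory.
func hasInventories(c Cluster) bool {
	for _, h := range c.Hosts {
		if h.HasInventory {
			return true
		}
	}
	return false
}

// validateAPIVip returns Pending instead of Failure while the inputs it
// depends on (host inventories) are not yet available.
func validateAPIVip(c Cluster) (Status, string) {
	if !hasInventories(c) {
		return Pending, "waiting for hosts to boot and report their inventories"
	}
	_, network, err := net.ParseCIDR(c.MachineCIDR)
	if err != nil {
		return Failure, fmt.Sprintf("machine CIDR %q is not valid", c.MachineCIDR)
	}
	ip := net.ParseIP(c.APIVip)
	if ip == nil || !network.Contains(ip) {
		return Failure, fmt.Sprintf("api vip %s does not belong to the Machine CIDR", c.APIVip)
	}
	return Success, fmt.Sprintf("api vip %s belongs to the Machine CIDR", c.APIVip)
}

func main() {
	noHosts := Cluster{APIVip: "192.168.111.202"}
	status, msg := validateAPIVip(noHosts)
	fmt.Println(status, "-", msg)

	booted := Cluster{
		APIVip:      "192.168.111.202",
		MachineCIDR: "192.168.111.0/24",
		Hosts:       []Host{{HasInventory: true}},
	}
	status, msg = validateAPIVip(booted)
	fmt.Println(status, "-", msg)
}
```

With this shape, a cluster whose hosts have not booted yet reports a single pending message rather than a list of failures about the machine CIDR.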
