Version:
Release tag: stable
Assisted Installer UI version: quay.io/ocpmetal/ocp-metal-ui:5f73c3c37938163c99d5559a27accd027eba3e40
Assisted Installer UI library version: 1.5.35
Assisted Installer: quay.io/ocpmetal/assisted-installer:3673218609bec42b6cf64e2d81152e2cb25ced91
Assisted Installer Controller: quay.io/ocpmetal/assisted-installer-controller:3673218609bec42b6cf64e2d81152e2cb25ced91
assistedInstallerService: quay.io/ocpmetal/assisted-service:ae1fe9b323a2ba70e32cde08bde87aa93d707897
Discovery Agent: quay.io/ocpmetal/assisted-installer-agent:60ac74ef05e45fd612222f3bd17f0b148d346d98
OCP: 4.9.0-rc.0

Trying to deploy OCP on RHOS. The discovered instances have "insufficient" status.
Platform: Platform OpenStack Compute is allowed only for Single Node OpenShift or user-managed networking.
The network configuration step actually comes after this step and requires the nodes to not be in "insufficient" state.
cc: @jtomasek @tjelinek, can you please take a look? We enabled installing on OpenStack when users define the "none" platform, but it looks like this collides with the UI wizard steps.
@mfilanov, if the `valid-platform` host validation is in neither of these states: `disabled` or `success`, and it has not been explicitly marked in the UI as a `softValidation`, it fails the validation check and prevents the user from moving to the next wizard step. @sasha, do you have an environment where we can reproduce this? I would like to see what we receive from the BE when polling /v1/clusters/:cluster_id.
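The rule described in this comment can be sketched roughly as follows. This is only an illustration written in Python, not the actual UI code; the function name `blocking_validations` and the `soft_validation_ids` parameter are hypothetical.

```python
# Statuses that let the wizard proceed, as described in the comment above.
PASSING_STATUSES = {"success", "disabled"}


def blocking_validations(validations, soft_validation_ids=frozenset()):
    """Return the IDs of validations that would block the next wizard step.

    A validation blocks when its status is neither 'success' nor 'disabled'
    and its ID has not been marked as a soft validation (hypothetical set).
    """
    return [
        v["id"]
        for v in validations
        if v["status"] not in PASSING_STATUSES
        and v["id"] not in soft_validation_ids
    ]


# Abridged 'hardware' group, using IDs and statuses reported in this bug.
hardware = [
    {"id": "has-inventory", "status": "success"},
    {"id": "valid-platform", "status": "failure"},
]

print(blocking_validations(hardware))                      # ['valid-platform']
print(blocking_validations(hardware, {"valid-platform"}))  # []
```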
This is the cluster validation info:

{'configuration': [
   {'id': 'pull-secret-set', 'message': 'The pull secret is set.', 'status': 'success'}],
 'hosts-data': [
   {'id': 'all-hosts-are-ready-to-install', 'message': 'The cluster has hosts that are not ready to ' 'install.', 'status': 'failure'},
   {'id': 'sufficient-masters-count', 'message': 'The cluster has a sufficient number of master ' 'candidates.', 'status': 'success'}],
 'network': [
   {'id': 'api-vip-defined', 'message': 'The API virtual IP is undefined; IP allocation from ' 'the DHCP server timed out.', 'status': 'failure'},
   {'id': 'api-vip-valid', 'message': 'The API virtual IP is undefined.', 'status': 'pending'},
   {'id': 'cluster-cidr-defined', 'message': 'The Cluster Network CIDR is defined.', 'status': 'success'},
   {'id': 'dns-domain-defined', 'message': 'The base domain is defined.', 'status': 'success'},
   {'id': 'ingress-vip-defined', 'message': 'The Ingress virtual IP is undefined; IP allocation ' 'from the DHCP server timed out.', 'status': 'failure'},
   {'id': 'ingress-vip-valid', 'message': 'The Ingress virtual IP is undefined.', 'status': 'pending'},
   {'id': 'machine-cidr-defined', 'message': 'The Machine Network CIDR is defined.', 'status': 'success'},
   {'id': 'machine-cidr-equals-to-calculated-cidr', 'message': 'The Machine Network CIDR, API virtual IP, or Ingress ' 'virtual IP is undefined.', 'status': 'pending'},
   {'id': 'network-prefix-valid', 'message': 'The Cluster Network prefix is valid.', 'status': 'success'},
   {'id': 'network-type-valid', 'message': 'The cluster has a valid network type', 'status': 'success'},
   {'id': 'no-cidrs-overlapping', 'message': 'No CIDRS are overlapping.', 'status': 'success'},
   {'id': 'ntp-server-configured', 'message': 'No ntp problems found', 'status': 'success'},
   {'id': 'service-cidr-defined', 'message': 'The Service Network CIDR is defined.', 'status': 'success'}],
 'operators': [
   {'id': 'cnv-requirements-satisfied', 'message': 'cnv is disabled', 'status': 'success'},
   {'id': 'lso-requirements-satisfied', 'message': 'lso is disabled', 'status': 'success'},
   {'id': 'ocs-requirements-satisfied', 'message': 'ocs is disabled', 'status': 'success'}]}

And this is the host validation info:

{'hardware': [
   {'id': 'has-inventory', 'message': 'Valid inventory exists for the host', 'status': 'success'},
   {'id': 'has-min-cpu-cores', 'message': 'Sufficient CPU cores', 'status': 'success'},
   {'id': 'has-min-memory', 'message': 'Sufficient minimum RAM', 'status': 'success'},
   {'id': 'has-min-valid-disks', 'message': 'Sufficient disk capacity', 'status': 'success'},
   {'id': 'has-cpu-cores-for-role', 'message': 'Sufficient CPU cores for role master', 'status': 'success'},
   {'id': 'has-memory-for-role', 'message': 'Sufficient RAM for role master', 'status': 'success'},
   {'id': 'hostname-unique', 'message': 'Hostname ' 'ci-vm-10-0-97-34.hosted.upshift.rdu2.redhat.com is ' 'unique in cluster', 'status': 'success'},
   {'id': 'hostname-valid', 'message': 'Hostname ' 'ci-vm-10-0-97-34.hosted.upshift.rdu2.redhat.com is ' 'allowed', 'status': 'success'},
   {'id': 'valid-platform', 'message': 'Platform OpenStack Compute is allowed only for ' 'Single Node OpenShift or user-managed networking', 'status': 'failure'},
   {'id': 'sufficient-installation-disk-speed', 'message': 'Speed of installation disk has not yet been ' 'measured', 'status': 'success'},
   {'id': 'compatible-with-cluster-platform', 'message': 'Host is compatible with cluster platform baremetal', 'status': 'success'}],
 'network': [
   {'id': 'connected', 'message': 'Host is connected', 'status': 'success'},
   {'id': 'machine-cidr-defined', 'message': 'Machine Network CIDR is defined', 'status': 'success'},
   {'id': 'belongs-to-machine-cidr', 'message': 'Host belongs to all machine network CIDRs', 'status': 'success'},
   {'id': 'belongs-to-majority-group', 'message': 'Host has connectivity to the majority of hosts in ' 'the cluster', 'status': 'success'},
   {'id': 'ntp-synced', 'message': "Host couldn't synchronize with any NTP server", 'status': 'failure'},
   {'id': 'container-images-available', 'message': 'All required container images were either pulled ' 'successfully or no attempt was made to pull them', 'status': 'success'},
   {'id': 'sufficient-network-latency-requirement-for-role', 'message': 'Network latency requirement has been satisfied.', 'status': 'success'},
   {'id': 'sufficient-packet-loss-requirement-for-role', 'message': 'Packet loss requirement has been satisfied.', 'status': 'success'},
   {'id': 'has-default-route', 'message': 'Host has been configured with at least one default ' 'route.', 'status': 'success'},
   {'id': 'api-domain-name-resolved-correctly', 'message': 'Domain name resolution is not required (managed ' 'networking)', 'status': 'success'},
   {'id': 'api-int-domain-name-resolved-correctly', 'message': 'Domain name resolution is not required (managed ' 'networking)', 'status': 'success'},
   {'id': 'apps-domain-name-resolved-correctly', 'message': 'Domain name resolution is not required (managed ' 'networking)', 'status': 'success'},
   {'id': 'dns-wildcard-not-configured', 'message': 'DNS wildcard check was successful', 'status': 'success'}],
 'operators': [
   {'id': 'cnv-requirements-satisfied', 'message': 'cnv is disabled', 'status': 'success'},
   {'id': 'lso-requirements-satisfied', 'message': 'lso is disabled', 'status': 'success'},
   {'id': 'ocs-requirements-satisfied', 'message': 'ocs is disabled', 'status': 'success'}]}

@jkilzi, if we move the platform validation from "hardware" to "network" (technically we will split the logic and add a new validation), will the user be able to get to the network part in order to set "user managed networking"?
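To make the failing checks in a dump like the one above easier to spot, a small helper can group them by validation group. This is a minimal sketch that assumes the validation info has the shape shown in this comment (group name mapped to a list of checks with `id` and `status`); the function name `failing_validations` is hypothetical and the sample data is abridged from the host validation info.

```python
def failing_validations(validation_info):
    """Map each group name to the IDs of its checks with status 'failure'."""
    result = {}
    for group, checks in validation_info.items():
        failed = [check["id"] for check in checks if check["status"] == "failure"]
        if failed:
            result[group] = failed
    return result


# Abridged from the host validation info in this comment.
host_info = {
    "hardware": [
        {"id": "has-inventory", "status": "success"},
        {"id": "valid-platform", "status": "failure"},
    ],
    "network": [
        {"id": "connected", "status": "success"},
        {"id": "ntp-synced", "status": "failure"},
    ],
    "operators": [
        {"id": "cnv-requirements-satisfied", "status": "success"},
    ],
}

print(failing_validations(host_info))
# {'hardware': ['valid-platform'], 'network': ['ntp-synced']}
```

For this host, the only failing check in the 'hardware' group is `valid-platform`.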
(In reply to Eran Cohen from comment #4)
> @jkilzi if we move (technically we will split the logic and add a
> new validation) the platform validation from "hardware" to "network" will
> the user be able to get to the network part in order to set the "user managed
> networking"?

Yes. As I mentioned before, at the hosts-discovery step we only pay attention to the 'hardware' group. The 'network' group is evaluated at the networking step.
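A rough sketch of why moving the platform check between groups would unblock the hosts-discovery step. The step-to-group mapping below follows the answer above, but the names `STEP_GROUPS` and `can_leave_step` are hypothetical, and placing `valid-platform` itself in the 'network' group is only an illustration (the proposal mentions splitting the logic into a new validation).

```python
# Assumed mapping of wizard steps to the host validation groups they check.
STEP_GROUPS = {
    "hosts-discovery": ("hardware",),
    "networking": ("network",),
}
PASSING = {"success", "disabled"}


def can_leave_step(step, host_validation_info):
    """True when no check in the groups evaluated at this step is blocking."""
    return all(
        check["status"] in PASSING
        for group in STEP_GROUPS[step]
        for check in host_validation_info.get(group, [])
    )


# Today: the platform check lives in 'hardware' and fails.
current = {
    "hardware": [{"id": "valid-platform", "status": "failure"}],
    "network": [{"id": "connected", "status": "success"}],
}
# Proposal: the failing platform check is reported under 'network' instead.
proposed = {
    "hardware": [{"id": "has-inventory", "status": "success"}],
    "network": [{"id": "valid-platform", "status": "failure"}],
}

print(can_leave_step("hosts-discovery", current))   # False
print(can_leave_step("hosts-discovery", proposed))  # True
print(can_leave_step("networking", proposed))       # False
```

Under these assumptions the user can reach the networking step, where the check keeps blocking until user-managed networking is selected.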
FailedQA. Tried to verify with assisted-ui-lib version 1.5.36-2. The issue still persists.
I'm not sure assisted-ui-lib has anything to do with it. Did it fail QA on staging? Which assisted-service version did you test?
Re-tested on staging and it works as expected now. Assisted-ui-lib version: 1.5.37
Hi,

In our 3 (control) + 2 (worker) AI-based cluster deployment, the hosts are not in Ready state, so we cannot proceed with the installation. The host status is Insufficient even after NTP sync is successful.

Output from one of the cluster nodes:
==============================================================================
$ chronyc sources
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 192.168.10.4                  3  10   377   947   -292us[ -295us] +/-   92ms

$ timedatectl
               Local time: Tue 2022-05-10 15:51:51 UTC
           Universal time: Tue 2022-05-10 15:51:51 UTC
                 RTC time: Tue 2022-05-10 15:51:51
                Time zone: UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

Cluster events:
==============================================================================
5/10/2022, 7:21:04 PM Updated status of the cluster to insufficient
5/10/2022, 7:21:04 PM Cluster validation 'api-vip-defined' is now fixed
5/10/2022, 7:07:40 PM Cluster validation 'ntp-server-configured' is now fixed
5/10/2022, 7:07:38 PM Host sl12345.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:07:14 PM Host sl12346.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:06:58 PM Host sl12347.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:06:34 PM Host sl12348.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:05:38 PM Cluster validation 'sufficient-masters-count' is now fixed
5/10/2022, 7:05:38 PM warning Host sl12345.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)
5/10/2022, 7:05:38 PM Host sl12349.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:05:14 PM warning Host sl12346.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)
5/10/2022, 7:04:58 PM warning Host sl12347.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)
5/10/2022, 7:04:54 PM Host 67bb1e80-ccb6-2902-bebd-c722391c6b27: Successfully registered
5/10/2022, 7:04:39 PM warning Cluster validation 'ntp-server-configured' that used to succeed is now failing
5/10/2022, 7:04:34 PM warning Host sl12348.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server)
5/10/2022, 7:04:30 PM Host 90570f72-619d-2327-e948-7a1ac68387b6: Successfully registered
5/10/2022, 7:03:38 PM warning Cluster validation 'all-hosts-are-ready-to-install' that used to succeed is now failing
5/10/2022, 7:03:38 PM warning Host sl12349.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server)

Validation info from the cluster:
==============================================================================
"configuration": [
  { "id": "pull-secret-set", "status": "success", "message": "The pull secret is set." }
],
"hosts-data": [
  { "id": "all-hosts-are-ready-to-install", "status": "failure", "message": "The cluster has hosts that are not ready to install." },
  { "id": "sufficient-masters-count", "status": "success", "message": "The cluster has a sufficient number of master candidates." }
],
"network": [
  { "id": "api-vip-defined", "status": "success", "message": "The API virtual IP is defined." },
  { "id": "api-vip-valid", "status": "success", "message": "api vip 192.168.10.40 belongs to the Machine CIDR and is not in use." },
  { "id": "cluster-cidr-defined", "status": "success", "message": "The Cluster Network CIDR is defined." },
  { "id": "dns-domain-defined", "status": "success", "message": "The base domain is defined." },
  { "id": "ingress-vip-defined", "status": "success", "message": "The Ingress virtual IP is defined." },
  { "id": "ingress-vip-valid", "status": "success", "message": "ingress vip 192.168.10.41 belongs to the Machine CIDR and is not in use." },
  { "id": "machine-cidr-defined", "status": "success", "message": "The Machine Network CIDR is defined." },
  { "id": "machine-cidr-equals-to-calculated-cidr", "status": "success", "message": "The Cluster Machine CIDR is equivalent to the calculated CIDR." },
  { "id": "network-prefix-valid", "status": "success", "message": "The Cluster Network prefix is valid." },
  { "id": "network-type-valid", "status": "success", "message": "The cluster has a valid network type" },
  { "id": "networks-same-address-families", "status": "success", "message": "Same address families for all networks." },
  { "id": "no-cidrs-overlapping", "status": "success", "message": "No CIDRS are overlapping." },
  { "id": "ntp-server-configured", "status": "success", "message": "No ntp problems found" },
  { "id": "service-cidr-defined", "status": "success", "message": "The Service Network CIDR is defined." }
],
"operators": [
  { "id": "cnv-requirements-satisfied", "status": "success", "message": "cnv is disabled" },
  { "id": "lso-requirements-satisfied", "status": "success", "message": "lso is disabled" },
  { "id": "odf-requirements-satisfied", "status": "success", "message": "odf is disabled" }
]
}

Used the following images:
==============================================================================
quay.io/edge-infrastructure/postgresql-12-centos7:0.3.25
quay.io/edge-infrastructure/assisted-service:v2.3.1
quay.io/edge-infrastructure/assisted-installer-ui:v2.3.9
quay.io/edge-infrastructure/assisted-image-service:v2.3.1
quay.io/edge-infrastructure/assisted-installer-agent:v2.3.1
quay.io/edge-infrastructure/assisted-installer:v2.3.1
quay.io/edge-infrastructure/assisted-installer-controller:v2.3.1

With this we are unable to proceed with the installation. How should we proceed?
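One way to narrow this down from the event log alone is to track, per host, which failures were reported and never reported as fixed. The sketch below is a heuristic only: the event patterns are inferred from the lines pasted in this comment, the function name `unresolved_failures` is hypothetical, and the mapping of the "No connectivity to the majority of hosts" message to the `belongs-to-majority-group` validation ID is an assumption. The per-host validation info remains the authoritative source.

```python
import re
from collections import defaultdict

# Assumed mapping from failure message to validation ID. The first pair
# matches the host validation info earlier in this bug; the second is a guess.
MESSAGE_TO_ID = {
    "Host couldn't synchronize with any NTP server": "ntp-synced",
    "No connectivity to the majority of hosts in the cluster":
        "belongs-to-majority-group",
}

FAILED_RE = re.compile(
    r"Host (\S+): updated status from \S+ to insufficient "
    r"\(.*?failing validation\(s\): (.*)\)"
)
FIXED_RE = re.compile(r"Host (\S+): validation '([^']+)' is now fixed")


def unresolved_failures(events_newest_first):
    """Return {host: [validation IDs]} reported failing and never fixed."""
    pending = defaultdict(set)
    # The log above is newest first, so replay it in chronological order.
    for line in reversed(events_newest_first):
        failed = FAILED_RE.search(line)
        if failed:
            host, messages = failed.groups()
            for message in messages.split(";"):
                message = message.strip()
                pending[host].add(MESSAGE_TO_ID.get(message, message))
            continue
        fixed = FIXED_RE.search(line)
        if fixed:
            host, validation_id = fixed.groups()
            pending[host].discard(validation_id)
    return {host: sorted(ids) for host, ids in pending.items() if ids}


# Four of the events from this comment, newest first.
PREFIX = ("updated status from discovering to insufficient (Host cannot be "
          "installed due to following failing validation(s): ")
events = [
    "5/10/2022, 7:07:38 PM Host sl12345.net: validation 'ntp-synced' is now fixed",
    "5/10/2022, 7:06:34 PM Host sl12348.net: validation 'ntp-synced' is now fixed",
    "5/10/2022, 7:05:38 PM warning Host sl12345.net: " + PREFIX
    + "Host couldn't synchronize with any NTP server ; "
    + "No connectivity to the majority of hosts in the cluster)",
    "5/10/2022, 7:04:34 PM warning Host sl12348.net: " + PREFIX
    + "Host couldn't synchronize with any NTP server)",
]

print(unresolved_failures(events))
# {'sl12345.net': ['belongs-to-majority-group']}
```

In the pasted events, the NTP failures are reported as fixed, but no event mentions the connectivity failure on sl12345.net, sl12346.net and sl12347.net being fixed, so that check may be worth examining in the host validation info.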