Bug 2025788
| Summary: | [IPI on azure]Pre-check on IPI Azure, should check VM Size’s vCPUsAvailable instead of vCPUs for the sku. | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | MayXu <maxu> |
| Component: | Installer | Assignee: | Aditya Narayanaswamy <anarayan> |
| Installer sub component: | openshift-installer | QA Contact: | MayXu <maxu> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | low | ||
| Priority: | medium | CC: | anarayan, jialiu, maxu, mstaeble |
| Version: | 4.9 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.10.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
The installer was checking whether the total number of vCPUs (the vCPUs field) for a given instance type in a region met the minimum resource requirement to deploy the cluster, but it should have checked the number of vCPUs available (the vCPUsAvailable field) for that instance type in the region.
The check now uses the number of vCPUs available instead of the total number of vCPUs.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-03-10 16:30:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
With Standard_E8-2s_v4 as the control plane node:

FATAL failed to fetch Metadata: failed to load asset "Install Config": controlPlane.platform.azure.type: Invalid value: "Standard_E8-2s_v4": instance type does not meet minimum resource requirements of 4 vCPUsAvailable

version:
./openshift-install 4.10.0-0.nightly-2022-01-05-135407
built from commit 22d874c8d0751d5645de95121662e32d17d6eada
release image registry.ci.openshift.org/ocp/release@sha256:592eb8e80ff7d65ee57137b8fb50adc566df066aba532be7779ac009e36f6b59
release architecture amd64

@anarayan vCPUsAvailable is a property of the VM size (instance type), like vCPUs. For some VM sizes the two values are the same, but for others vCPUsAvailable is less than vCPUs; for example, standard_E8-4ds_v4 has vCPUs 8 and vCPUsAvailable 4. The installer will check vCPUsAvailable instead of vCPUs to determine whether the size meets our minimum requirement.

Yeah, I understand that. By total number of vCPUs I do mean the field vCPUs, and by vCPUs available I mean vCPUsAvailable. I thought that, from a docs perspective, explaining what they actually mean would be more useful.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:0056
|
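The distinction discussed in the comments above can be sketched in a few lines. This is a minimal illustration in Python, not the installer's actual code (the installer is a separate code base and its implementation is not shown in this report). The function names, the `MIN_VCPUS` table, and the fallback to vCPUs when a size does not report vCPUsAvailable are all assumptions made for the example; the minimums of 4 and 2 come from the reproduction steps in this report.

```python
from typing import Mapping, Optional

# Minimums quoted in this bug report: control plane (master) 4, compute (worker) 2.
MIN_VCPUS = {"controlPlane": 4, "compute": 2}


def effective_vcpus(capabilities: Mapping[str, str]) -> Optional[int]:
    """Return vCPUsAvailable when the size reports it, otherwise vCPUs.

    The fallback to vCPUs is an assumption for this sketch; the report only
    states that some sizes do not expose vCPUsAvailable.
    """
    for key in ("vCPUsAvailable", "vCPUs"):
        value = capabilities.get(key)
        if value is not None:
            return int(value)
    return None


def validate_instance_type(name: str, capabilities: Mapping[str, str], role: str) -> Optional[str]:
    """Return an error message if the size is too small, or None if it passes."""
    minimum = MIN_VCPUS[role]
    vcpus = effective_vcpus(capabilities)
    if vcpus is None:
        return f'Invalid value: "{name}": unable to determine vCPU count'
    if vcpus < minimum:
        return (
            f'Invalid value: "{name}": instance type does not meet minimum '
            f"resource requirements of {minimum} vCPUsAvailable"
        )
    return None


if __name__ == "__main__":
    # Values for Standard_E8-2s_v4 are assumed from the size name
    # (8 vCPUs constrained to 2); they are illustrative only.
    print(validate_instance_type(
        "Standard_E8-2s_v4", {"vCPUs": "8", "vCPUsAvailable": "2"}, "controlPlane"))
```

Checking only the vCPUs field would accept this size for the control plane, which matches the behavior originally reported.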
Version:
$ openshift-install version
4.9.0-0.nightly-2021-11-18-000209

Platform: Azure

Please specify:
* IPI (automated install with `openshift-install`. If you don't know, then it's IPI)

What happened?
Installed an Azure cluster with a special VM size where vCPUsAvailable < CPU minimum requirement and vCPUs >= CPU minimum requirement. The installer pre-check passed, but the cluster install completed with an ERROR message.

$ oc get po -n openshift-kube-apiserver
installer-8-maxusizeei-pkzw6-master-2 0/1 UnexpectedAdmissionError 0 17m

Check the pods that have issues, as follows:
oc get pods -n openshift-kube-apiserver
`oc get -oyaml -n "openshift-kube-apiserver" pods <error master pod>`
message: 'Pod Unexpected error while attempting to recover from admission failure: preemption: error finding a set of pods to preempt: no set of running pods found to reclaim resources: [(res: cpu, q: 105), ]'

What did you expect to happen?
$ openshift-install create cluster --dir <installFolder>
The installer pre-check should immediately report that the VM size does not meet the minimum vCPU requirement:
controlPlane.platform.azure.type: Invalid value: "Standard_E8-2s_v4": instance type does not meet minimum resource requirements of 4 vCPUs,

How to reproduce it (as minimally and precisely as possible)?
1. Create install-config.yaml
$ openshift-install create install-config --dir <installFolder>
2. Customize the VM size in install-config.yaml. Use a VM size whose vCPUs meets the limit (master is 4, worker is 2) but whose vCPUsAvailable < vCPUs, such as Standard_E8-2s_v4:
name: master
platform:
  azure:
    type: Standard_E8-2s_v4
3. $ openshift-install create cluster --dir <installFolder>

The install reports error messages like the following:
ERROR Cluster operator authentication Degraded is True with OAuthServerDeployment_UnavailablePod::OAuthServerRouteEndpointAccessibleController_SyncError::WellKnownReadyController_SyncError: OAuthServerDeploymentDegraded: 1 of 3 requested instances are unavailable for oauth-openshift.openshift-authentication (container is not ready in oauth-openshift-6b8db4f9bc-lcj2t pod)
ERROR OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.maxusizee4a.qe.azure.devcluster.openshift.com/healthz": dial tcp: lookup oauth-openshift.apps.maxusizee4a.qe.azure.devcluster.openshift.com on 172.30.0.10:53: no such host

Anything else we need to know?
https://docs.microsoft.com/en-us/azure/virtual-machines/vm-naming-conventions#example-4-m8-2ms_v2-constrained-vcpu

List the VM sizes whose vCPUsAvailable differs from vCPUs:
az vm list-skus -l centralus --query "[?resourceType=='virtualMachines'&&capabilities[?name=='vCPUs'].value!=capabilities[?name=='vCPUsAvailable'].value].{Name:name, PremiumIO:capabilities[?name=='PremiumIO'].value, vCPUsAvailable:capabilities[?name=='vCPUsAvailable'].value, vCPUs:capabilities[?name=='vCPUs'].value}"

Not all sizes have "vCPUsAvailable", for example "Standard_B1ls" and "Standard_M416s_v2".
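The same comparison as the `az vm list-skus` query above can be done offline on saved JSON output, which makes it easier to treat sizes that lack vCPUsAvailable separately. The following is a hedged sketch: the function names and the `SAMPLE` data are hypothetical, and the record shape (a `capabilities` list of `name`/`value` pairs) is assumed from the query itself rather than taken from Azure documentation.

```python
import json


def capability_map(sku):
    """Flatten a SKU's capabilities list into a {name: value} dict."""
    return {c["name"]: c["value"] for c in sku.get("capabilities") or []}


def classify_sizes(skus):
    """Split VM sizes into (constrained, missing).

    constrained: (name, vCPUs, vCPUsAvailable) where the two values differ.
    missing:     names of sizes that do not report vCPUsAvailable at all.
    """
    constrained, missing = [], []
    for sku in skus:
        if sku.get("resourceType") != "virtualMachines":
            continue
        caps = capability_map(sku)
        total = caps.get("vCPUs")
        available = caps.get("vCPUsAvailable")
        if total is None:
            continue  # nothing to compare against
        if available is None:
            missing.append(sku["name"])
        elif int(available) != int(total):
            constrained.append((sku["name"], int(total), int(available)))
    return constrained, missing


# Illustrative sample only; real data would come from saved JSON output of
# `az vm list-skus -l centralus -o json`.
SAMPLE = json.loads("""
[
  {"name": "Standard_E8-4ds_v4", "resourceType": "virtualMachines",
   "capabilities": [{"name": "vCPUs", "value": "8"},
                    {"name": "vCPUsAvailable", "value": "4"}]},
  {"name": "Standard_B1ls", "resourceType": "virtualMachines",
   "capabilities": [{"name": "vCPUs", "value": "1"}]},
  {"name": "Standard_D4s_v3", "resourceType": "virtualMachines",
   "capabilities": [{"name": "vCPUs", "value": "4"},
                    {"name": "vCPUsAvailable", "value": "4"}]}
]
""")

if __name__ == "__main__":
    constrained, missing = classify_sizes(SAMPLE)
    print("constrained:", constrained)
    print("missing vCPUsAvailable:", missing)
```

Reporting the sizes without vCPUsAvailable in their own list keeps them from being mixed in with genuinely constrained sizes.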