Bug 2025868 - [IPI on Azure] Installer pre-check should verify that the VM size’s HyperVGenerations includes ‘V1’ for the SKU
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.9
Hardware: Unspecified
OS: Unspecified
medium
low
Target Milestone: ---
Target Release: 4.10.z
Assignee: Aditya Narayanaswamy
QA Contact: MayXu
Reported: 2021-11-23 09:10 UTC by MayXu
Modified: 2022-03-21 12:40 UTC (History)
Last Closed: 2022-03-21 12:40:05 UTC


Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 5509 0 None Merged Bug 2025868: Check HyperVGenerations for instance type 2022-01-31 22:17:44 UTC
Red Hat Product Errata RHBA-2022:0928 0 None None None 2022-03-21 12:40:22 UTC

Description MayXu 2021-11-23 09:10:29 UTC
Version:

$ openshift-install version
./openshift-install 4.9.0-0.nightly-2021-11-18-000209
built from commit 1c538b8949f3a0e5b993e1ae33b9cd799806fa93
release image registry.ci.openshift.org/ocp/release@sha256:c2c8cd51afb5d02717881b2af4e8965f03a893c2f04511a3544b8477e3484e16
release architecture amd64


Platform:

Azure

Please specify:
* IPI 


What happened?

Installing an Azure cluster with a VM size whose HyperVGenerations capability is ‘V2’ only (it does not include ‘V1’) passes the installer pre-check, but the installation then fails with an ERROR message.

With the VM size specified for the master:
ERROR Error: creating Linux Virtual Machine "maxusizedd-t5bww-master-0" (Resource Group "maxusizedd-t5bww-rg"): compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="BadRequest" Message="The selected VM size 'Standard_DC4s_v3' cannot boot Hypervisor Generation '1'. If this was a Create operation please check that the Hypervisor Generation of the Image matches the Hypervisor Generation of the selected VM Size. If this was an Update operation please select a Hypervisor Generation '1' VM Size." 
ERROR                                              
ERROR   on ../../../tmp/openshift-install-cluster-736542714/master/master.tf line 84, in resource "azurerm_linux_virtual_machine" "master": 
ERROR   84: resource "azurerm_linux_virtual_machine" "master" { 
ERROR 

With the VM size specified for the worker:
$ oc get nodes
(no worker nodes are listed)
$ oc get event -n openshift-machine-api
…
57m         Warning   FailedCreate        machine/may-sg-rflz7-worker-centralus2-5b75c        InvalidConfiguration: failed to reconcile machine "may-sg-rflz7-worker-centralus2-5b75c": failed to create vm may-sg-rflz7-worker-centralus2-5b75c: failure sending request for machine may-sg-rflz7-worker-centralus2-5b75c: cannot create vm: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="The selected VM size 'Standard_DC2ds_v3' cannot boot Hypervisor Generation '1'. If this was a Create operation please check that the Hypervisor Generation of the Image matches the Hypervisor Generation of the selected VM Size. If this was an Update operation please select a Hypervisor Generation '1' VM Size."



What did you expect to happen?

The installer pre-check should report immediately that the VM size is invalid,
 e.g.: The selected VM size 'Standard_DC4s_v3' cannot boot Hypervisor Generation '1'.
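
The requested pre-check could look roughly like the sketch below. It is an illustration only, not the installer's actual code: the function name `check_hyperv_generation` and the shape of the `capabilities` argument are assumptions, and the HyperVGenerations value is assumed to be a comma-separated string such as "V1,V2".

```python
from typing import Dict, Optional


def check_hyperv_generation(vm_size: str, capabilities: Dict[str, str],
                            required: str = "V1") -> Optional[str]:
    """Return an error message if vm_size cannot boot the required generation.

    `capabilities` maps capability names to values (hypothetical shape),
    e.g. {"HyperVGenerations": "V1,V2"}. Returns None when the size is fine.
    """
    raw = capabilities.get("HyperVGenerations", "")
    # Split the comma-separated value and drop empty pieces.
    generations = {g.strip() for g in raw.split(",") if g.strip()}
    if required not in generations:
        supported = ", ".join(sorted(generations)) or "none reported"
        return (f"instance type {vm_size!r} does not support HyperVGeneration "
                f"{required} (supported: {supported})")
    return None


# A V2-only size is rejected; a size that lists V1 passes.
print(check_hyperv_generation("Standard_DC4s_v3", {"HyperVGenerations": "V2"}))
print(check_hyperv_generation("Standard_D4s_v3", {"HyperVGenerations": "V1,V2"}))
```

Returning a message instead of raising lets the caller collect this alongside other install-config validation errors before any Azure resources are created.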


How to reproduce it (as minimally and precisely as possible)?

1. Create install-config.yaml:
$ openshift-install create install-config --dir <installFolder>

2. Customize the VM size in the generated install-config.yaml, using a VM size whose HyperVGenerations value is only 'V2', such as Standard_DC2s_v3 or Standard_DC4s_v3:
controlPlane:
  name: master
  platform:
    azure:
      type: Standard_DC4s_v3
or
compute:
- name: worker
  platform:
    azure:
      type: Standard_DC2s_v3

3. $ openshift-install create cluster --dir <installFolder>


Anything else we need to know?

Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1954707 (Azure VHD fails to install on gen2 NDv2 instances)

Comment 2 Patrick Dillon 2021-12-17 21:27:20 UTC
Trying to establish some background on this. My take is: Azure supports gen1 & gen2 VMs. Typically you create a gen2 VM by selecting a gen2 compatible instance type (for example a Standard D4s v3 is both gen1 & gen2 compatible) AND a gen2 image. The gen2 image is what tells the instance to be gen2. In particular, it seems to be metadata on the image. 

For our use case, where we create images from VHDs, this is addressed in the FAQ here: https://docs.microsoft.com/en-us/azure/virtual-machines/generation-2#frequently-asked-questions

It looks like a managed disk is required.

Comment 3 MayXu 2021-12-19 14:18:54 UTC
$ az vm list-skus -l centralus --size Standard_DC4s_v3 --query "[].{HyperVGenerations:capabilities[?name=='HyperVGenerations'].value}"
[
  {
    "HyperVGenerations": [
      "V2"
    ]
  }
]

If the VM size's HyperVGenerations value does not include "V1", which we do not support at the moment, how about prompting the user early?
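
An early prompt like this could be driven by the same query output. The following is a minimal sketch, assuming the JSON shape printed by the `az vm list-skus` call above; the helper name `supports_generation` is hypothetical, and each value is split on commas in case several generations are reported in one string.

```python
import json


def supports_generation(az_output: str, generation: str = "V1") -> bool:
    """Check the JSON printed by the `az vm list-skus ... --query` call.

    Returns True only if every SKU entry returned lists `generation`.
    An empty result (for example an unknown size) counts as unsupported.
    """
    entries = json.loads(az_output)
    if not entries:
        return False
    for entry in entries:
        found = set()
        for value in entry.get("HyperVGenerations", []):
            # Assumption: a value may hold several generations, e.g. "V1,V2".
            found.update(part.strip() for part in value.split(","))
        if generation not in found:
            return False
    return True


# Sample copied from the query output in this comment.
sample = """
[
  {
    "HyperVGenerations": [
      "V2"
    ]
  }
]
"""
print(supports_generation(sample))        # False: the size is V2-only
print(supports_generation(sample, "V2"))  # True
```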

Comment 6 MayXu 2022-01-29 08:34:55 UTC
When selecting the Gen2 marketplace image RedHat:ocp-worker:ocp-worker-a:4.8.2021122100 with Standard_DC4s_v3, the installer still reports:
level=fatal msg=failed to fetch Master Machines: failed to load asset "Install Config": compute[0].platform.azure.type: Invalid value: "Standard_DC4s_v3": only disks with HyperVGeneration V1 are supported

Expected result: the installation succeeds without error using Standard_DC4s_v3 with the Gen2 marketplace image.

Comment 8 Patrick Dillon 2022-03-11 01:30:46 UTC
For 4.10, we expect V2-only instance types to be rejected when entered in the install config. Marketplace images are only supported through editing the manifests. I have updated the KCS article to reflect this: "If you choose to use an instance type which is only Gen2-compatible, the instance type must be specified when editing the manifests--it cannot be specified in the install config."

I am setting this back to ON_QA. Please let me know if there are more questions. Note, we are hoping to add Gen2 support in 4.11, which would make all of this more straightforward.

Comment 9 MayXu 2022-03-11 03:01:31 UTC
(In reply to Patrick Dillon from comment #8)

Thanks

Comment 12 errata-xmlrpc 2022-03-21 12:40:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.5 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0928

