Bug 1845610

Summary: Windows VM MachineSet error on Azure
Product: OpenShift Container Platform Reporter: Anand <anachand>
Component: Cloud Compute Assignee: Danil Grigorev <dgrigore>
Cloud Compute sub component: Other Providers QA Contact: sunzhaohua <zhsun>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: aarapov, aos-bugs, aravindh, dgrigore, sdodson
Version: unspecified   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Windows   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:06:02 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Anand 2020-06-09 16:03:41 UTC
I am trying to bring up a Windows VM using a MachineSet on Azure. I am getting the following error:
Error Message:  failed to reconcile machine "aravindh-k5mj2-win-worker-centralus1-vlbpp": compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidParameter" Message="The value of parameter linuxConfiguration is invalid." Target="linuxConfiguration"
Looking at the code, I see that osProfile.LinuxConfiguration is always set. Is there any way around this?

Please ping aravindh for more details on the issue.
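
One way to avoid this error would be to populate only the OS-specific configuration that matches osDisk.osType. The Python sketch below models that branching on the osProfile fragment of the VM create request. It is not the provider's actual Go code, nor necessarily the fix that was merged; `build_os_profile` and its parameters are hypothetical names, and the Windows settings are illustrative defaults.

```python
from typing import Any, Dict


def build_os_profile(computer_name: str, admin_username: str,
                     os_type: str, ssh_public_key: str = "") -> Dict[str, Any]:
    """Return an osProfile fragment holding only the configuration for os_type.

    Hypothetical helper: it models the branching, not the provider's real code.
    """
    profile: Dict[str, Any] = {
        "computerName": computer_name,
        "adminUsername": admin_username,
    }
    if os_type.lower() == "windows":
        # A Windows VM must not carry linuxConfiguration; Azure rejects the
        # request with InvalidParameter, as in the error reported here.
        profile["windowsConfiguration"] = {
            "provisionVMAgent": True,
            "enableAutomaticUpdates": False,  # illustrative default
        }
    else:
        profile["linuxConfiguration"] = {
            "disablePasswordAuthentication": True,
            "ssh": {
                "publicKeys": [{
                    "path": f"/home/{admin_username}/.ssh/authorized_keys",
                    "keyData": ssh_public_key,
                }]
            },
        }
    return profile
```

With `os_type="Windows"` the returned dictionary has a `windowsConfiguration` key and no `linuxConfiguration` key, which is the property the failing request lacked.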

AzureWindowsMachineSet.yaml 

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: aravindh-k5mj2
  name: aravindh-k5mj2-win-worker-centralus1
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: aravindh-k5mj2
      machine.openshift.io/cluster-api-machineset: aravindh-k5mj2-worker-centralus1
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: aravindh-k5mj2
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: aravindh-k5mj2-worker-centralus1
    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: azureproviderconfig.openshift.io/v1beta1
          credentialsSecret:
            name: azure-cloud-credentials
            namespace: openshift-machine-api
          image:
            offer: WindowsServer
            publisher: MicrosoftWindowsServer
            sku: 2019-Datacenter
            version: latest
          kind: AzureMachineProviderSpec
          location: centralus
          managedIdentity: aravindh-k5mj2-identity
          metadata:
            creationTimestamp: null
          networkResourceGroup: aravindh-k5mj2-rg
          osDisk:
            diskSizeGB: 128
            managedDisk:
              storageAccountType: Premium_LRS
            osType: Windows
          publicIP: false
          resourceGroup: aravindh-k5mj2-rg
          subnet: aravindh-k5mj2-worker-subnet
          vmSize: Standard_D2s_v3
          vnet: aravindh-k5mj2-vnet
          zone: "1"

Comment 3 sunzhaohua 2020-07-06 09:40:48 UTC
Failed to verify.
clusterversion: 4.6.0-0.nightly-2020-07-05-234845
Created a new MachineSet with the configuration below. The Machine is stuck in the Provisioned phase and could not join the cluster. No CSRs are pending, and the Machine has no node.

    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: azureproviderconfig.openshift.io/v1beta1
          credentialsSecret:
            name: azure-cloud-credentials
            namespace: openshift-machine-api
          image:
            offer: WindowsServer
            publisher: MicrosoftWindowsServer
            resourceID: ""
            sku: 2019-Datacenter
            version: latest
          kind: AzureMachineProviderSpec
          location: westus
          managedIdentity: zhsunazure76-7szfp-identity
          metadata:
            creationTimestamp: null
          networkResourceGroup: zhsunazure76-7szfp-rg
          osDisk:
            diskSizeGB: 128
            managedDisk:
              storageAccountType: Premium_LRS
            osType: Windows
          publicIP: false
          publicLoadBalancer: zhsunazure76-7szfp
          resourceGroup: zhsunazure76-7szfp-rg
          subnet: zhsunazure76-7szfp-worker-subnet
          userDataSecret:
            name: worker-user-data
          vmSize: Standard_D2s_v3
          vnet: zhsunazure76-7szfp-vnet
          zone: ""

$ oc get machine
NAME                                     PHASE         TYPE              REGION   ZONE   AGE
windows-ps6jf                            Provisioned   Standard_D2s_v3   westus          44m
zhsunazure76-7szfp-master-0              Running       Standard_D8s_v3   westus          160m
zhsunazure76-7szfp-master-1              Running       Standard_D8s_v3   westus          160m
zhsunazure76-7szfp-master-2              Running       Standard_D8s_v3   westus          160m
zhsunazure76-7szfp-worker-westus-p8zd8   Running       Standard_D2s_v3   westus          146m
zhsunazure76-7szfp-worker-westus-stskz   Running       Standard_D2s_v3   westus          146m
zhsunazure76-7szfp-worker-westus-ttk5w   Running       Standard_D2s_v3   westus          146m
$ oc get node
NAME                                     STATUS   ROLES    AGE    VERSION
zhsunazure76-7szfp-master-0              Ready    master   156m   v1.18.3+1a1d81c
zhsunazure76-7szfp-master-1              Ready    master   156m   v1.18.3+1a1d81c
zhsunazure76-7szfp-master-2              Ready    master   156m   v1.18.3+1a1d81c
zhsunazure76-7szfp-worker-westus-p8zd8   Ready    worker   138m   v1.18.3+1a1d81c
zhsunazure76-7szfp-worker-westus-stskz   Ready    worker   138m   v1.18.3+1a1d81c
zhsunazure76-7szfp-worker-westus-ttk5w   Ready    worker   137m   v1.18.3+1a1d81c

$ oc describe machine windows-ps6jf
Status:
  Addresses:
    Address:     windows-ps6jf
    Type:        Hostname
    Address:     windows-ps6jf
    Type:        InternalDNS
    Address:     windows-ps6jf.1vwu5y0vpbxube3bhbh0z0dxwa.dx.internal.cloudapp.net
    Type:        InternalDNS
    Address:     10.0.32.7
    Type:        InternalIP
  Last Updated:  2020-07-06T08:43:39Z
  Phase:         Provisioned
  Provider Status:
    Conditions:
      Last Probe Time:       2020-07-06T08:43:38Z
      Last Transition Time:  2020-07-06T08:42:11Z
      Message:               machine successfully created
      Reason:                MachineCreationSucceeded
      Status:                True
      Type:                  MachineCreated
    Metadata:
    Vm Id:     /subscriptions/53b8f551-f0fc-4bea-8cba-6d1fefd54c8a/resourceGroups/zhsunazure76-7szfp-rg/providers/Microsoft.Compute/virtualMachines/windows-ps6jf
    Vm State:  Running


I0706 09:28:23.864566       1 controller.go:172] windows-ps6jf: reconciling Machine
I0706 09:28:23.864602       1 actuator.go:201] windows-ps6jf: actuator checking if machine exists
I0706 09:28:24.029270       1 reconciler.go:376] Machine a9b9d8c6-dfa0-4f84-b69b-31b894e7892b is running
I0706 09:28:24.029290       1 reconciler.go:384] Found vm for machine windows-ps6jf
I0706 09:28:24.029300       1 controller.go:285] windows-ps6jf: reconciling machine triggers idempotent update
I0706 09:28:24.029305       1 actuator.go:168] Updating machine windows-ps6jf
I0706 09:28:24.425520       1 machine_scope.go:141] windows-ps6jf: status unchanged
I0706 09:28:24.425568       1 machine_scope.go:141] windows-ps6jf: status unchanged
I0706 09:28:24.425576       1 machine_scope.go:157] windows-ps6jf: patching machine
I0706 09:28:24.450648       1 controller.go:301] windows-ps6jf: has no node yet, requeuing
I0706 09:28:24.450722       1 recorder.go:52] controller-runtime/manager/events "msg"="Normal"  "message"="Updated machine \"windows-ps6jf\"" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"windows-ps6jf","uid":"2214d4e1-c626-4ad8-8ba5-648f46b7cbb3","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"56756"} "reason"="Updated"
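
The "has no node yet" state in the log can also be confirmed from the API objects. The Python sketch below takes the parsed output of `oc get machine -n openshift-machine-api -o json` and lists the Machines that have no `status.nodeRef`. `machines_without_node` is a hypothetical helper name, and the sample data is abbreviated from the output above rather than captured from a cluster.

```python
import json
from typing import Any, Dict, List, Tuple


def machines_without_node(machine_list: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (name, phase) for every Machine that has no status.nodeRef."""
    missing = []
    for machine in machine_list.get("items", []):
        status = machine.get("status", {})
        if not status.get("nodeRef"):
            missing.append((machine["metadata"]["name"],
                            status.get("phase", "Unknown")))
    return missing


# Abbreviated sample shaped like `oc get machine -o json` (illustrative only).
sample = json.loads("""
{"items": [
  {"metadata": {"name": "windows-ps6jf"},
   "status": {"phase": "Provisioned"}},
  {"metadata": {"name": "zhsunazure76-7szfp-worker-westus-p8zd8"},
   "status": {"phase": "Running",
              "nodeRef": {"kind": "Node",
                          "name": "zhsunazure76-7szfp-worker-westus-p8zd8"}}}
]}
""")

print(machines_without_node(sample))  # [('windows-ps6jf', 'Provisioned')]
```

On a real cluster the JSON could be piped in with `json.load(sys.stdin)` instead of the inline sample.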

Comment 4 Aravindh Puthiyaparambil 2020-07-06 15:34:29 UTC
The machine is a Windows VM and will not join the cluster unless you are running the Windows Machine Config Operator, which is a work in progress at the moment. The expected result is that the Machine reaches the Provisioned phase.

Comment 5 Danil Grigorev 2020-07-09 08:57:54 UTC
Please take into account the info Aravindh posted. For now, getting the machine into the Provisioned state is the desired outcome.

Comment 6 sunzhaohua 2020-07-09 09:30:18 UTC
@Aravindh Puthiyaparambil @Danil Grigorev Thank you. Moving this to VERIFIED.

Comment 8 errata-xmlrpc 2020-10-27 16:06:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196