Bug 1828704 - [Azure]Machine status should be "Failed" when creating a machineset with "publicIP: true" and name the machineset with a longer name
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Joel Speed
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-04-28 08:00 UTC by sunzhaohua
Modified: 2020-07-13 17:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Certain configuration errors were not interpreted as configuration errors by the Machine controller.
Consequence: The Machine controller did not mark the Machine as Failed as expected.
Fix: Make sure the configuration errors are raised as configuration errors that the Machine controller can interpret.
Result: These configuration errors now mark the Machine as Failed.
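
The fix described in the Doc Text can be illustrated with a small sketch. This is not the provider's actual Go code: the names `InvalidConfigurationError`, `RequeueAfterError`, `create_public_ip`, and `reconcile` are hypothetical, and only the 63-character limit, the 20-second requeue, and the phase names are taken from this report. The idea is that a terminal configuration error gets its own type, so the controller can tell it apart from a transient error and stop retrying.

```python
from dataclasses import dataclass

MAX_PUBLIC_IP_NAME_LEN = 63  # limit quoted in the error message in this report


class InvalidConfigurationError(Exception):
    """Hypothetical terminal error: retrying with the same spec cannot succeed."""


class RequeueAfterError(Exception):
    """Hypothetical transient error: the controller should retry later."""

    def __init__(self, seconds):
        super().__init__(f"requeue in: {seconds}s")
        self.seconds = seconds


@dataclass
class Machine:
    name: str
    phase: str = "Provisioning"


def create_public_ip(ip_name, cloud_available=True):
    # A name that is too long is a configuration problem, not a transient one.
    if len(ip_name) > MAX_PUBLIC_IP_NAME_LEN:
        raise InvalidConfigurationError(
            "machine public IP name is longer than 63 characters"
        )
    # Assumed stand-in for any temporary cloud-side failure.
    if not cloud_available:
        raise RequeueAfterError(20)
    return ip_name


def reconcile(machine, ip_name, cloud_available=True):
    """Return 'failed', 'requeue', or 'ok' and update the phase accordingly."""
    try:
        create_public_ip(ip_name, cloud_available)
    except InvalidConfigurationError:
        machine.phase = "Failed"  # terminal: do not reconcile again
        return "failed"
    except RequeueAfterError:
        return "requeue"  # phase is left unchanged, retry later
    return "ok"
```

Before the fix, the over-long name behaved like the `RequeueAfterError` branch, which is why the machine stayed in Provisioning and was retried every 20 seconds.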
Clone Of:
Environment:
Last Closed: 2020-07-13 17:32:07 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-api-provider-azure pull 126 | 0 | None | closed | BUG 1828704: Return invalid configuration errors when creating NICs with invalid config | 2020-07-16 18:00:00 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:32:18 UTC

Description sunzhaohua 2020-04-28 08:00:01 UTC
Description of problem:
Machine status should be "Failed" when a machineset is created with "publicIP: true" and a name long enough that the machine's public IP name exceeds 63 characters.
 
Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-04-27-204255

How reproducible:
Always

Steps to Reproduce:
1. Create a machineset with "publicIP: true" and give the machineset a longer name:
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: zhsunazure428-jvwvn
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
  name: zhsunazure428-jvwvn-worker-westus-invalid
  namespace: openshift-machine-api
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: zhsunazure428-jvwvn
      machine.openshift.io/cluster-api-machineset: zhsunazure428-jvwvn-worker-westus-invalid
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: zhsunazure428-jvwvn
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: zhsunazure428-jvwvn-worker-westus-invalid
    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: azureproviderconfig.openshift.io/v1beta1
          credentialsSecret:
            name: azure-cloud-credentials
            namespace: openshift-machine-api
          image:
            offer: ""
            publisher: ""
            resourceID: /resourceGroups/zhsunazure428-jvwvn-rg/providers/Microsoft.Compute/images/zhsunazure428-jvwvn
            sku: ""
            version: ""
          kind: AzureMachineProviderSpec
          location: westus
          managedIdentity: zhsunazure428-jvwvn-identity
          metadata:
            creationTimestamp: null
          networkResourceGroup: zhsunazure428-jvwvn-rg
          osDisk:
            diskSizeGB: 128
            managedDisk:
              storageAccountType: Premium_LRS
            osType: Linux
          publicIP: true
          resourceGroup: zhsunazure428-jvwvn-rg
          subnet: zhsunazure428-jvwvn-worker-subnet
          userDataSecret:
            name: worker-user-data
          vmSize: Standard_D2s_v3
          vnet: zhsunazure428-jvwvn-vnet
          zone: ""

2. Check machines and logs

Actual results:
The machine is stuck in the Provisioning phase:
$ oc get machine
NAME                                              PHASE          TYPE              REGION   ZONE   AGE
zhsunazure428-jvwvn-master-0                      Running        Standard_D8s_v3   westus          4h32m
zhsunazure428-jvwvn-master-1                      Running        Standard_D8s_v3   westus          4h32m
zhsunazure428-jvwvn-master-2                      Running        Standard_D8s_v3   westus          4h32m
zhsunazure428-jvwvn-worker-westus-invalid-thmlm   Provisioning                                     69s
zhsunazure428-jvwvn-worker-westus-wz74f           Running        Standard_D2s_v3   westus          4h22m
zhsunazure428-jvwvn-worker-westus-xxjdx           Running        Standard_D2s_v3   westus          4h22m

status:
  lastUpdated: "2020-04-28T07:15:21Z"
  phase: Provisioning
  providerStatus:
    conditions:
    - lastProbeTime: "2020-04-28T07:15:21Z"
      lastTransitionTime: "2020-04-28T07:15:21Z"
      message: 'failed to create nic zhsunazure428-jvwvn-worker-westus-invalid-thmlm-nic
        for machine zhsunazure428-jvwvn-worker-westus-invalid-thmlm: unable to create
        Public IP: machine public IP name is longer than 63 characters'
      reason: MachineCreationFailed
      status: "True"
      type: MachineCreated
    metadata: {}
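
The length arithmetic behind this error can be checked with a short sketch. The helper names below are hypothetical, and the report does not state how the public IP name is derived from the machine name, so the sketch only measures the machine name itself and the room left under the 63-character limit.

```python
MAX_PUBLIC_IP_NAME_LEN = 63  # limit quoted in the error message above


def exceeds_limit(name, limit=MAX_PUBLIC_IP_NAME_LEN):
    """Return True if the name is longer than the limit."""
    return len(name) > limit


def headroom(name, limit=MAX_PUBLIC_IP_NAME_LEN):
    """Number of characters that can still be added before the limit is passed."""
    return limit - len(name)


machine_name = "zhsunazure428-jvwvn-worker-westus-invalid-thmlm"
print(len(machine_name))            # 47
print(exceeds_limit(machine_name))  # False
print(headroom(machine_name))       # 16
```

The machine name alone is within the limit, so the public IP name must carry more than 16 additional characters on top of it for the check to fail. What those characters are is not shown in this report.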

I0428 07:16:22.474842       1 controller.go:402] Actuator returned requeue-after error: requeue in: 20s
I0428 07:16:22.474931       1 recorder.go:52] controller-runtime/manager/events "msg"="Warning"  "message"="CreateError: failed to reconcile machine \"zhsunazure428-jvwvn-worker-westus-invalid-thmlm\"s: failed to create nic zhsunazure428-jvwvn-worker-westus-invalid-thmlm-nic for machine zhsunazure428-jvwvn-worker-westus-invalid-thmlm: unable to create Public IP: machine public IP name is longer than 63 characters" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsunazure428-jvwvn-worker-westus-invalid-thmlm","uid":"066548e3-0097-402c-9913-23b600682cd8","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"119206"} "reason"="FailedCreate"
I0428 07:16:42.475055       1 controller.go:166] zhsunazure428-jvwvn-worker-westus-invalid-thmlm: reconciling Machine
I0428 07:16:42.475083       1 actuator.go:197] Checking if machine zhsunazure428-jvwvn-worker-westus-invalid-thmlm exists
I0428 07:16:42.810049       1 controller.go:310] zhsunazure428-jvwvn-worker-westus-invalid-thmlm: reconciling machine triggers idempotent create
I0428 07:16:42.810068       1 actuator.go:85] Creating machine zhsunazure428-jvwvn-worker-westus-invalid-thmlm
I0428 07:16:42.810806       1 machine_scope.go:169] zhsunazure428-jvwvn-worker-westus-invalid-thmlm: status unchanged
I0428 07:16:42.810847       1 machine_scope.go:169] zhsunazure428-jvwvn-worker-westus-invalid-thmlm: status unchanged
I0428 07:16:42.810854       1 machine_scope.go:185] zhsunazure428-jvwvn-worker-westus-invalid-thmlm: patching machine
E0428 07:16:42.833770       1 actuator.go:79] Machine error: failed to reconcile machine "zhsunazure428-jvwvn-worker-westus-invalid-thmlm"s: failed to create nic zhsunazure428-jvwvn-worker-westus-invalid-thmlm-nic for machine zhsunazure428-jvwvn-worker-westus-invalid-thmlm: unable to create Public IP: machine public IP name is longer than 63 characters
W0428 07:16:42.833791       1 controller.go:312] zhsunazure428-jvwvn-worker-westus-invalid-thmlm: failed to create machine: requeue in: 20s
I0428 07:16:42.833803       1 controller.go:402] Actuator returned requeue-after error: requeue in: 20s
I0428 07:16:42.833882       1 recorder.go:52] controller-runtime/manager/events "msg"="Warning"  "message"="CreateError: failed to reconcile machine \"zhsunazure428-jvwvn-worker-westus-invalid-thmlm\"s: failed to create nic zhsunazure428-jvwvn-worker-westus-invalid-thmlm-nic for machine zhsunazure428-jvwvn-worker-westus-invalid-thmlm: unable to create Public IP: machine public IP name is longer than 63 characters" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsunazure428-jvwvn-worker-westus-invalid-thmlm","uid":"066548e3-0097-402c-9913-23b600682cd8","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"119206"} "reason"="FailedCreate"


Expected results:
The machine phase is set to "Failed".

Additional info:

Comment 3 Milind Yadav 2020-05-04 06:37:46 UTC
Validated on:

oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-03-172622   True        False         70m     Cluster version is 4.5.0-0.nightly-2020-05-03-172622

Steps:
1. Create a machineset:
oc create -f newmachineset-publiciptrue.yml 
machineset.machine.openshift.io/miyadav-tg5rb-worker-westus-invalid-longername created

Reference YAML:
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  annotations:
    machine.openshift.io/GPU: "0"
    machine.openshift.io/memoryMb: "8192"
    machine.openshift.io/vCPU: "2"
  creationTimestamp: "2020-05-04T04:35:28Z"
  generation: 1
  labels:
    machine.openshift.io/cluster-api-cluster: miyadav-tg5rb
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
  name: miyadav-tg5rb-worker-westus-invalid-longername
  namespace: openshift-machine-api
  resourceVersion: "20965"
  selfLink: /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machinesets/miyadav-tg5rb-worker-westus
  uid: 6b4fdb36-103c-4f78-8e86-af051c086b0f
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: miyadav-tg5rb
      machine.openshift.io/cluster-api-machineset: miyadav-tg5rb-worker-westus
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: miyadav-tg5rb
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: miyadav-tg5rb-worker-westus
    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: azureproviderconfig.openshift.io/v1beta1
          credentialsSecret:
            name: azure-cloud-credentials
            namespace: openshift-machine-api
          image:
            offer: ""
            publisher: ""
            resourceID: /resourceGroups/miyadav-tg5rb-rg/providers/Microsoft.Compute/images/miyadav-tg5rb
            sku: ""
            version: ""
          kind: AzureMachineProviderSpec
          location: westus
          managedIdentity: miyadav-tg5rb-identity
          metadata:
            creationTimestamp: null
          networkResourceGroup: miyadav-tg5rb-rg
          osDisk:
            diskSizeGB: 128
            managedDisk:
              storageAccountType: Premium_LRS
            osType: Linux
          publicIP: true
          resourceGroup: miyadav-tg5rb-rg
          subnet: miyadav-tg5rb-worker-subnet
          userDataSecret:
            name: worker-user-data
          vmSize: Standard_D2s_v3
          vnet: miyadav-tg5rb-vnet
          zone: ""
                  
2. Check the machines (oc get machines -n openshift-machine-api) and the machine-controller logs (oc logs -f machine-api-controllers-6d878fc87f-dq7kr -c machine-controller).
Actual & Expected:

oc get machine
NAME                                                   PHASE     TYPE              REGION   ZONE   AGE
miyadav-tg5rb-master-0                                 Running   Standard_D8s_v3   westus          109m
miyadav-tg5rb-master-1                                 Running   Standard_D8s_v3   westus          109m
miyadav-tg5rb-master-2                                 Running   Standard_D8s_v3   westus          109m
miyadav-tg5rb-worker-westus-2g8rr                      Running   Standard_D2s_v3   westus          97m
miyadav-tg5rb-worker-westus-ggjbv                      Running   Standard_D2s_v3   westus          97m
miyadav-tg5rb-worker-westus-invalid-longername-g2gmh   Failed                                      89s
miyadav-tg5rb-worker-westus-mrsrf                      Running   Standard_D2s_v3   westus          97m
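
The check in step 2 can be scripted. The sketch below is only an illustration: `machines_in_phase` is a hypothetical helper that parses the plain-text table shown above, and it assumes NAME and PHASE are always the first two columns and are never empty. The sample rows are copied from this comment.

```python
def machines_in_phase(table_text, phase):
    """Return the names of machines whose PHASE column equals `phase`."""
    rows = [line for line in table_text.splitlines() if line.strip()]
    names = []
    for row in rows[1:]:  # skip the header row
        fields = row.split()
        # TYPE, REGION and ZONE may be blank, so only the first two
        # whitespace-separated fields are relied on.
        if len(fields) >= 2 and fields[1] == phase:
            names.append(fields[0])
    return names


SAMPLE = """\
NAME                                                   PHASE     TYPE              REGION   ZONE   AGE
miyadav-tg5rb-master-0                                 Running   Standard_D8s_v3   westus          109m
miyadav-tg5rb-worker-westus-invalid-longername-g2gmh   Failed                                      89s
miyadav-tg5rb-worker-westus-mrsrf                      Running   Standard_D2s_v3   westus          97m
"""

print(machines_in_phase(SAMPLE, "Failed"))
# ['miyadav-tg5rb-worker-westus-invalid-longername-g2gmh']
```

For anything beyond a quick check, structured output such as `oc get machine -o json` is likely a more robust input than the column layout.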

Error logs:
.
.
.
E0504 06:23:49.337310       1 actuator.go:78] Machine error: failed to reconcile machine "miyadav-tg5rb-worker-westus-invalid-longername-g2gmh": failed to create nic miyadav-tg5rb-worker-westus-invalid-longername-g2gmh-nic for machine miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: unable to create Public IP: machine public IP name is longer than 63 characters
W0504 06:23:49.337328       1 controller.go:312] miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: failed to create machine: failed to reconcile machine "miyadav-tg5rb-worker-westus-invalid-longername-g2gmh": failed to create nic miyadav-tg5rb-worker-westus-invalid-longername-g2gmh-nic for machine miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: unable to create Public IP: machine public IP name is longer than 63 characters
I0504 06:23:49.337336       1 controller.go:412] Actuator returned invalid configuration error: failed to reconcile machine "miyadav-tg5rb-worker-westus-invalid-longername-g2gmh": failed to create nic miyadav-tg5rb-worker-westus-invalid-longername-g2gmh-nic for machine miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: unable to create Public IP: machine public IP name is longer than 63 characters
I0504 06:23:49.337345       1 controller.go:421] miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: going into phase "Failed"
I0504 06:23:49.337464       1 recorder.go:52] controller-runtime/manager/events "msg"="Warning"  "message"="InvalidConfiguration: failed to reconcile machine \"miyadav-tg5rb-worker-westus-invalid-longername-g2gmh\": failed to create nic miyadav-tg5rb-worker-westus-invalid-longername-g2gmh-nic for machine miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: unable to create Public IP: machine public IP name is longer than 63 characters" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"miyadav-tg5rb-worker-westus-invalid-longername-g2gmh","uid":"18f9fac0-bf83-4ea8-92e4-b9846346c8a6","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"59757"} "reason"="FailedCreate"
I0504 06:23:49.349557       1 controller.go:282] controller-runtime/controller "msg"="Successfully Reconciled"  "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"miyadav-tg5rb-worker-westus-invalid-longername-g2gmh"}
I0504 06:23:49.349610       1 controller.go:166] miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: reconciling Machine
W0504 06:23:49.349626       1 controller.go:263] miyadav-tg5rb-worker-westus-invalid-longername-g2gmh: machine has gone "Failed" phase. It won't reconcile
I0504 06:23:49.349647       1 controller.go:282] controller-runtime/controller "msg"="Successfully Reconciled"  "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"miyadav-tg5rb-worker-westus-invalid-longername-g2gmh"}
.
.
.

Comment 4 Milind Yadav 2020-05-04 10:42:21 UTC
Moving to VERIFIED

Comment 5 errata-xmlrpc 2020-07-13 17:32:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

