Bug 1861974 - [vsphere] Machine stuck in provisioning phase while using default values to create machineset
Summary: [vsphere] Machine stuck in provisioning phase while using default values to...
Keywords:
Status: CLOSED DUPLICATE of bug 1876680
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.6.0
Assignee: Alexander Demicev
QA Contact: Milind Yadav
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-30 04:56 UTC by Milind Yadav
Modified: 2020-09-22 14:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-22 14:51:48 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-api-operator pull 665 0 None closed Bug 1861974: [vSphere] Improve vSphere webhook 2021-01-21 16:45:47 UTC

Description Milind Yadav 2020-07-30 04:56:47 UTC
[vsphere] Machine stuck in provisioning phase while using default values to create machineset

clusterVersion:4.6.0-0.nightly-2020-07-25-091217

Steps:
1. Use the YAML below to create a machineset:
http://pastebin.test.redhat.com/889005

2. The machineset is created successfully.

3. Check the machine created by the machineset:
[miyadav@miyadav debug]$ oc get machine
NAME                            PHASE          TYPE   REGION   ZONE   AGE
default-99f57                   Provisioning                          59m.
...
Expected: the machine should be in the Running phase after this much time.
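
The check in step 3 can be automated. The following is a minimal Python sketch, assuming the non-wide `oc get machine` layout shown above (NAME and PHASE first, AGE last). The helper names `age_to_seconds` and `stuck_machines` and the 30-minute threshold are illustrative assumptions, not part of `oc` or the Machine API.

```python
import re

# Hypothetical helpers for illustration; not part of oc or the Machine API.
_AGE_PART = re.compile(r"(\d+)([dhms])")
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_to_seconds(age):
    """Convert an age such as '59m' or '5m24s' to seconds.

    Only the d/h/m/s units are handled; anything else is ignored.
    """
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in _AGE_PART.findall(age))


def stuck_machines(listing, max_age_seconds=1800):
    """Return (name, phase, age) for machines not Running after max_age_seconds.

    Assumes the non-wide `oc get machine` layout: NAME and PHASE are the first
    two columns and AGE is the last one. Empty TYPE/REGION/ZONE cells are fine
    because only the first two and the last whitespace-separated fields are used.
    """
    stuck = []
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # blank lines or elided output such as "..."
        name, phase, age = fields[0], fields[1], fields[-1]
        if phase != "Running" and age_to_seconds(age) > max_age_seconds:
            stuck.append((name, phase, age))
    return stuck
```

Feeding it the captured output of `oc get machine` would report `default-99f57` as stuck, since it is still in Provisioning after 59 minutes.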

Additional Info:
I0730 04:00:18.001631 1 reconciler.go:672] Adding device: eth card type: vmxnet3, network spec: &{NetworkName:VM Network}, device info: &{VirtualDeviceDeviceBackingInfo:{VirtualDeviceBackingInfo:{DynamicData:{}} DeviceName:VM Network UseAutoDetect:<nil>} Network:<nil> InPassthroughMode:<nil>}
I0730 04:00:18.005962 1 reconciler.go:608] default-99f57: running task: task-253977
I0730 04:00:18.005982 1 reconciler.go:708] default-99f57: Updating provider status
I0730 04:00:18.006003 1 machine_scope.go:102] default-99f57: patching machine
I0730 04:00:18.027152 1 controller.go:325] default-99f57: created instance, requeuing
I0730 04:00:18.027212 1 controller.go:169] default-99f57: reconciling Machine
I0730 04:00:18.027219 1 actuator.go:83] default-99f57: actuator checking if machine exists
I0730 04:00:18.039836 1 session.go:113] Find template by instance uuid: 995a97e3-ebd6-4d81-8917-d9f38d424d6b
I0730 04:00:18.066788 1 reconciler.go:175] default-99f57: does not exist
I0730 04:00:18.066812 1 controller.go:313] default-99f57: reconciling machine triggers idempotent create
I0730 04:00:18.066817 1 actuator.go:60] default-99f57: actuator creating machine
I0730 04:00:18.076353 1 reconciler.go:692] task: task-253977, state: error, description-id: VirtualMachine.clone
I0730 04:00:18.076372 1 session.go:113] Find template by instance uuid: 995a97e3-ebd6-4d81-8917-d9f38d424d6b
I0730 04:00:18.105180 1 reconciler.go:94] default-99f57: cloning
I0730 04:00:18.105230 1 session.go:110] Invalid UUID for VM "wduan0729a-5zjt4-rhcos": , trying to find by name
I0730 04:00:18.125514 1 reconciler.go:469] default-99f57: no snapshot name provided, getting snapshot using template
I0730 04:00:18.147996 1 reconciler.go:557] Getting network devices
I0730 04:00:18.148033 1 reconciler.go:640] Adding device: VM Network
I0730 04:00:18.152565 1 reconciler.go:672] Adding device: eth card type: vmxnet3, network spec: &{NetworkName:VM Network}, device info: &{VirtualDeviceDeviceBackingInfo:{VirtualDeviceBackingInfo:{DynamicData:{}} DeviceName:VM Network UseAutoDetect:<nil>} Network:<nil> InPassthroughMode:<nil>}
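
The excerpt above shows a loop: the clone task ends in `state: error`, the machine is reported as not existing, and the controller triggers another idempotent create. A rough Python sketch for spotting this pattern in a saved controller log follows. It assumes the message wording seen in this excerpt, which may differ between versions; the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Patterns are based only on the log lines quoted in this report.
TASK_RE = re.compile(
    r"task: (?P<task>\S+), state: (?P<state>\w+), description-id: (?P<desc>\S+)"
)
RETRY_RE = re.compile(
    r"\] (?P<machine>\S+): reconciling machine triggers idempotent create"
)


def summarize(log_text):
    """Return (failed_tasks, retries) found in the given log text.

    failed_tasks is a list of (task id, description id) for tasks in state
    'error'; retries counts idempotent-create requeues per machine name.
    """
    failed_tasks = []
    retries = Counter()
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task and task.group("state") == "error":
            failed_tasks.append((task.group("task"), task.group("desc")))
        retry = RETRY_RE.search(line)
        if retry:
            retries[retry.group("machine")] += 1
    return failed_tasks, retries
```

A machine with a growing retry count alongside repeated `VirtualMachine.clone` errors matches the behaviour described in this report.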

Will add must-gather logs later

Comment 5 Milind Yadav 2020-08-21 06:48:56 UTC
VALIDATED ON - Cluster version is 4.6.0-0.nightly-2020-08-18-165040


Steps:
1. Use the YAML below to create a machineset:
http://pastebin.test.redhat.com/895459

2. The machineset is created successfully.

3. Check the machine created by the machineset:
[miyadav@miyadav rhv]$ oc get machines -o wide   --config vsp
NAME                                        PHASE         TYPE   REGION   ZONE   AGE     NODE                                PROVIDERID                                       STATE
miyadav-vs2108-52mxn-master-0               Running                              74m     miyadav-vs2108-52mxn-master-0       vsphere://422be7bf-9dd2-9bf0-6f44-18318394b2f2   poweredOn
miyadav-vs2108-52mxn-master-1               Running                              74m     miyadav-vs2108-52mxn-master-1       vsphere://422b98db-0569-8d4f-7424-f023096cb763   poweredOn
miyadav-vs2108-52mxn-master-2               Running                              74m     miyadav-vs2108-52mxn-master-2       vsphere://422b4689-a8b3-e993-4f8b-a72fc55709de   poweredOn
miyadav-vs2108-52mxn-worker-default-8wnbx   Provisioned                          27m                                         vsphere://422b67d5-4bd2-3b6c-1465-9b6a0442eb45   poweredOn
miyadav-vs2108-52mxn-worker-qbbsp           Running                              65m     miyadav-vs2108-52mxn-worker-qbbsp   vsphere://422ba0c6-7c3f-98e0-caf7-68e7efe87c55   poweredOn
miyadav-vs2108-52mxn-worker-v8xq6           Running                              5m24s   miyadav-vs2108-52mxn-worker-v8xq6   vsphere://422b4257-0416-4b8d-1d87-2f46a4c60b86   poweredOn
miyadav-vs2108-52mxn-worker-xm42s           Running                              65m     miyadav-vs2108-52mxn-worker-xm42s   vsphere://422b06c9-78b5-aefc-bfee-3b4e7e253719   poweredOn

...
Expected: the machine should be in the Running phase after this much time.

Additional Info:
The existing machineset scales as expected and its machines reach the Running status. Only a machineset created from the default YAML gets stuck; the VM console shows an invalid mount error ...
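
The symptom in the listing above is a machine whose VM is `poweredOn` but which has no NODE. Because the NODE cell is empty, splitting rows on whitespace would shift the later columns left. A small Python sketch that instead slices each row at the header's column offsets is shown below. It assumes the output is column-aligned as printed by `oc get machines -o wide` and that no header name contains a space; the function names are hypothetical.

```python
import re


def parse_columns(listing):
    """Parse fixed-width tabular output into a list of dicts keyed by header.

    Column boundaries are taken from where each header word starts, so empty
    cells (such as a missing NODE) stay empty instead of shifting left.
    """
    lines = [line for line in listing.splitlines() if line.strip()]
    header = lines[0]
    starts = [m.start() for m in re.finditer(r"\S+", header)]
    names = header.split()
    rows = []
    for line in lines[1:]:
        row = {}
        for i, name in enumerate(names):
            end = starts[i + 1] if i + 1 < len(starts) else None
            row[name] = line[starts[i]:end].strip()
        rows.append(row)
    return rows


def powered_on_without_node(listing):
    """Return names of machines whose VM is poweredOn but has no NODE."""
    return [
        row["NAME"]
        for row in parse_columns(listing)
        if row.get("STATE") == "poweredOn" and not row.get("NODE")
    ]
```

Applied to the wide listing in this comment, only the machine created from the default machineset would be expected to match, since every other row has a NODE value.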

Comment 6 Alexander Demicev 2020-09-16 13:37:20 UTC
There is a bug in the defaulting webhook for vSphere. We are considering removing the defaulting webhooks from this release.

Comment 7 Alberto 2020-09-16 13:41:33 UTC
Covered by https://github.com/openshift/machine-api-operator/pull/697

Comment 8 Alberto 2020-09-22 14:51:48 UTC

*** This bug has been marked as a duplicate of bug 1876680 ***

