Bug 1825323 - vSphere IPI Failure to clone VMs on vSAN storage
Summary: vSphere IPI Failure to clone VMs on vSAN storage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.5.0
Assignee: Patrick Dillon
QA Contact: jima
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-04-17 17:28 UTC by davis phillips
Modified: 2020-07-23 12:26 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The upstream vSphere Terraform provider produces an inconsistent Terraform plan when installing to a vCenter that forces thin provisioning.
Consequence: The Terraform plan is inconsistent and the install fails during provisioning.
Fix: The upstream provider was patched so that the disk type is not pre-set at plan time but is instead computed on apply.
Result: The disk type is set based on the storage policy (defaults to thick) and the install succeeds.
Clone Of:
Environment:
Last Closed: 2020-07-13 17:28:35 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 3603 | 0 | None | closed | Bug 1825323: replace terraform-provider-vsphere with OpenShift fork | 2021-02-10 21:22:31 UTC
Github | openshift terraform-provider-vsphere pull 1 | 0 | None | closed | carry patch for skipping diskDiffOperation when cloning, partial fix of Bug 1825323 | 2021-02-10 21:22:31 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:28:58 UTC

Description davis phillips 2020-04-17 17:28:17 UTC
Description of problem:
After running "./openshift-install create cluster", the master and bootstrap instances are cloned and the installer waits for the Kubernetes API to come up, but the task of cloning the master VMs is launched again and again without stopping.

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

Client Version: 4.5.0-0.nightly-2020-04-17-114625
Server Version: 4.5.0-0.nightly-2020-04-17-114625
Kubernetes Version: v1.18.0-rc.1
[root@rh8-tools ipi]# ./openshift-install version
./openshift-install 4.5.0-0.nightly-2020-04-17-114625
built from commit 19d22e67d372145f24bed3030aa049da9ebc398e
release image quay.io/openshift-release-dev/ocp-release-nightly@sha256:2357cbed27ba57912fc92392ef7cacb1a79546587946771ab6a71fbf397056a3

Default vSAN storage policy:
https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.virtualsan.doc/GUID-C228168F-6807-4C2A-9D74-E584CAF49A2A.html

How reproducible:


Steps to Reproduce:
1. Create an install-config.yaml that targets a vSAN-backed datastore.
2. Run the command "./openshift-install create cluster".
3. The clone fails because the default vSAN storage policy is thin provisioned.

Actual results:
VM clone fails

Expected results:
Clone is successful

Comment 1 davis phillips 2020-04-17 22:25:12 UTC
ERROR
ERROR Error: Provider produced inconsistent final plan
ERROR
ERROR When expanding the plan for module.master.vsphere_virtual_machine.vm[1] to
ERROR include new values learned so far during apply, provider
ERROR "registry.terraform.io/-/vsphere" produced an invalid new value for
ERROR .disk[0].thin_provisioned: was cty.False, but now cty.True.
ERROR
ERROR This is a bug in the provider, which should be reported in the provider's own
ERROR issue tracker.

Comment 2 Patrick Dillon 2020-05-06 20:45:33 UTC
Terraform plan is producing a plan that is inconsistent with the values found during apply. The plan shows the VM will be thick-provisioned, but the vSAN storage policy enforces thin provisioning.
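To illustrate why this surfaces as "Provider produced inconsistent final plan", here is a minimal sketch of the rule involved. It is a simplified model, not Terraform's actual code and not the vSphere provider's: the type PlannedBool and the function checkFinalPlan are hypothetical names invented for this example. The assumption is only that a value known at plan time must not change during apply, while a value left unknown may resolve to anything.

```go
package main

import "fmt"

// PlannedBool is a hypothetical model of a planned attribute value:
// either a known bool, or "unknown until apply" (a computed value).
type PlannedBool struct {
	Known bool
	Value bool
}

// checkFinalPlan is a simplified stand-in for the consistency rule:
// a value that was known at plan time must match what apply finds.
// Values that were unknown at plan time are allowed to become anything.
func checkFinalPlan(attr string, planned PlannedBool, actual bool) error {
	if planned.Known && planned.Value != actual {
		return fmt.Errorf("inconsistent final plan: %s was %t, but now %t",
			attr, planned.Value, actual)
	}
	return nil
}

func main() {
	// The plan predicted thick provisioning (thin_provisioned = false),
	// but the vSAN storage policy forced thin provisioning on apply.
	planned := PlannedBool{Known: true, Value: false}
	err := checkFinalPlan(".disk[0].thin_provisioned", planned, true)
	fmt.Println(err)
}
```

In this model, the failure in Comment 1 corresponds to the first branch: thin_provisioned was planned as a known false and came back true.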

These docs provide background on the general problem: https://www.terraform.io/docs/extend/terraform-0.12-compatibility.html#inaccurate-plans

The document points out that the solution is: "If you see either of these errors, the remedy is the same: implement CustomizeDiff for the resource type that is causing the problem, and write logic to more accurately predict the outcome of any changes to Computed attributes."

Interestingly, the vSphere provider recently merged a pull request which implements import OVA functionality similar to our private vSphere provider. In that pull request they skip CustomizeDiff in the case where they import the OVA: https://github.com/terraform-providers/terraform-provider-vsphere/blob/master/vsphere/resource_vsphere_virtual_machine.go#L904-L907

Skipping this CustomizeDiff allows a proof of concept to install successfully.
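The effect of skipping the diff step can be sketched with a small self-contained model. This is not the provider's code: planThinProvisioned, consistent, and the skipDiskDiff flag are hypothetical names, and the model assumes only that skipping the disk diff leaves the attribute unknown at plan time so that its value is learned on apply.

```go
package main

import "fmt"

// PlannedBool is a hypothetical model of a planned attribute value:
// either a known bool, or "unknown until apply".
type PlannedBool struct {
	Known bool
	Value bool
}

// planThinProvisioned is a hypothetical stand-in for the disk diff step.
// When skipDiskDiff is true, no prediction is made and the value is
// left unknown; otherwise the configured value is planned as known.
func planThinProvisioned(configured, skipDiskDiff bool) PlannedBool {
	if skipDiskDiff {
		return PlannedBool{} // unknown until apply
	}
	return PlannedBool{Known: true, Value: configured}
}

// consistent reports whether the value found on apply is compatible
// with the plan: unknown values accept anything, known values must match.
func consistent(planned PlannedBool, actual bool) bool {
	return !planned.Known || planned.Value == actual
}

func main() {
	actual := true // the vSAN storage policy forces thin provisioning

	predicted := planThinProvisioned(false, false)
	deferred := planThinProvisioned(false, true)

	fmt.Println("predicting at plan time:", consistent(predicted, actual))
	fmt.Println("deferring to apply:", consistent(deferred, actual))
}
```

With the prediction in place the plan and apply disagree; with the diff skipped there is nothing to contradict, which matches the proof-of-concept result described above.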

Comment 4 Patrick Dillon 2020-05-14 15:54:29 UTC
https://github.com/openshift/terraform-provider-vsphere/pull/1 is a partial fix. It will need to be vendored into the installer once merged.

Comment 6 Abhinav Dahiya 2020-05-26 16:44:03 UTC
The installer PR has merged.

Comment 7 jima 2020-06-01 06:58:41 UTC
@davis phillips, there is no vSAN storage in the QE vSphere environment now. Do you have a vSphere server with vSAN storage that I could use to verify the issue?

Comment 8 jima 2020-06-10 02:40:01 UTC
Verified the issue on nightly build 4.5.0-0.nightly-2020-06-09-030606; the task of cloning the master VM is launched only once.

Comment 10 errata-xmlrpc 2020-07-13 17:28:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

