Bug 1825323

Summary: vSphere IPI Failure to clone VMs on vSAN storage
Product: OpenShift Container Platform Reporter: davis phillips <dphillip>
Component: Installer Assignee: Patrick Dillon <padillon>
Installer sub component: openshift-installer QA Contact: jima
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: adahiya, bleanhar, jima
Version: 4.5   
Target Milestone: ---   
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The upstream vSphere Terraform provider produces an inconsistent Terraform plan when installing to a vCenter that forces thin provisioning.
Consequence: The plan is inconsistent with the values found during apply, and the install fails during provisioning.
Fix: The upstream provider was patched so that the disk type is not pre-set but is instead computed on apply.
Result: The disk type is set based on the storage policy (defaults to thick), and the install succeeds.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-07-13 17:28:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description davis phillips 2020-04-17 17:28:17 UTC
Description of problem:
After running "./openshift-install create cluster", the master and bootstrap instances are cloned and the installer waits for the Kubernetes API to come up, but the task of cloning the master VMs is launched again and again without stopping.

Version-Release number of the following components:

Client Version: 4.5.0-0.nightly-2020-04-17-114625
Server Version: 4.5.0-0.nightly-2020-04-17-114625
Kubernetes Version: v1.18.0-rc.1
[root@rh8-tools ipi]# ./openshift-install version
./openshift-install 4.5.0-0.nightly-2020-04-17-114625
built from commit 19d22e67d372145f24bed3030aa049da9ebc398e
release image quay.io/openshift-release-dev/ocp-release-nightly@sha256:2357cbed27ba57912fc92392ef7cacb1a79546587946771ab6a71fbf397056a3

Default vSAN storage policy:
https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.virtualsan.doc/GUID-C228168F-6807-4C2A-9D74-E584CAF49A2A.html

How reproducible:


Steps to Reproduce:
1. Create an install-config.yaml that targets a vSAN-backed datastore.
2. Run "./openshift-install create cluster".
3. The clone fails because the vSAN default storage policy is thin provisioned.

Actual results:
VM clone fails

Expected results:
Clone is successful

Comment 1 davis phillips 2020-04-17 22:25:12 UTC
ERROR
ERROR Error: Provider produced inconsistent final plan
ERROR
ERROR When expanding the plan for module.master.vsphere_virtual_machine.vm[1] to
ERROR include new values learned so far during apply, provider
ERROR "registry.terraform.io/-/vsphere" produced an invalid new value for
ERROR .disk[0].thin_provisioned: was cty.False, but now cty.True.
ERROR
ERROR This is a bug in the provider, which should be reported in the provider's own
ERROR issue tracker.
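
For triage, the fields in the error above can be pulled out mechanically. The following is a minimal sketch, not part of the installer or of Terraform: the function name parse_inconsistent_plan and the regular expression are illustrative assumptions based only on the wording of this one log excerpt, and other Terraform versions may word the message differently.

```python
import re

# Excerpt of the installer output from this comment, "ERROR" prefix included.
LOG = """\
ERROR Error: Provider produced inconsistent final plan
ERROR
ERROR When expanding the plan for module.master.vsphere_virtual_machine.vm[1] to
ERROR include new values learned so far during apply, provider
ERROR "registry.terraform.io/-/vsphere" produced an invalid new value for
ERROR .disk[0].thin_provisioned: was cty.False, but now cty.True.
"""

# Hypothetical pattern, written against the message wording shown above.
PATTERN = re.compile(
    r"expanding the plan for (?P<resource>\S+) to .*?"
    r'provider "(?P<provider>[^"]+)" produced an invalid new value for '
    r"(?P<attribute>\S+): was (?P<planned>[^,]+), but now (?P<applied>\S+)"
)


def parse_inconsistent_plan(log):
    """Return the resource, provider, attribute, planned and applied values.

    Returns None when the log does not contain the expected message.
    """
    # Strip the installer's "ERROR" prefix and rejoin the wrapped lines.
    text = " ".join(
        re.sub(r"^ERROR ?", "", line).strip() for line in log.splitlines()
    )
    match = PATTERN.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # The sentence ends with a period that is not part of the value.
    fields["applied"] = fields["applied"].rstrip(".")
    return fields


result = parse_inconsistent_plan(LOG)
print(result)
```

The extracted attribute and the planned and applied values show the mismatch directly: the plan recorded thin_provisioned as false, while the value found during apply was true.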

Comment 2 Patrick Dillon 2020-05-06 20:45:33 UTC
Terraform plan is producing a plan inconsistent with the values found during apply. The plan shows the VM will be thick-provisioned, but the vSAN storage policy enforces thin provisioning.

These docs provide background on the general problem: https://www.terraform.io/docs/extend/terraform-0.12-compatibility.html#inaccurate-plans

The document points out that the solution is: "If you see either of these errors, the remedy is the same: implement CustomizeDiff for the resource type that is causing the problem, and write logic to more accurately predict the outcome of any changes to Computed attributes."

Interestingly, the vSphere provider recently merged a pull request that implements OVA import functionality similar to our private vSphere provider. In that pull request, they skip CustomizeDiff when importing the OVA: https://github.com/terraform-providers/terraform-provider-vsphere/blob/master/vsphere/resource_vsphere_virtual_machine.go#L904-L907

Skipping this CustomizeDiff allows a proof of concept to install successfully.
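
The inconsistency described in this comment can be modeled roughly as follows. This is a simplified sketch of the idea only, not Terraform's or the provider's actual code: the UNKNOWN marker, the inconsistencies function, and the flat attribute key "disk.0.thin_provisioned" are all illustrative assumptions. The rule being modeled is that a value the plan states as known must match what apply finds, while a value left as computed ("known after apply") may resolve to anything.

```python
# Stand-in for a planned value that is only known after apply (assumption:
# a plain sentinel object, not how Terraform represents unknown values).
UNKNOWN = object()


def inconsistencies(planned, applied):
    """List attributes whose known planned value differs from the applied one."""
    problems = []
    for attr, planned_value in planned.items():
        if planned_value is UNKNOWN:
            # Computed on apply: any final value is consistent with the plan.
            continue
        actual = applied.get(attr)
        if actual != planned_value:
            problems.append(f"{attr}: was {planned_value!r}, but now {actual!r}")
    return problems


# The vSAN storage policy enforces thin provisioning during apply.
applied = {"disk.0.thin_provisioned": True}

# Before the fix: the plan pre-sets a thick-provisioned disk.
before_fix = {"disk.0.thin_provisioned": False}

# After the fix: the disk type is left to be computed on apply.
after_fix = {"disk.0.thin_provisioned": UNKNOWN}

print(inconsistencies(before_fix, applied))
print(inconsistencies(after_fix, applied))
```

In this model the pre-set plan reports one mismatch and the computed plan reports none, which matches the Doc Text for this bug: the disk type is no longer pre-set and is instead determined by the storage policy on apply.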

Comment 4 Patrick Dillon 2020-05-14 15:54:29 UTC
https://github.com/openshift/terraform-provider-vsphere/pull/1 is a partial fix. It will need to be vendored into the installer once merged.

Comment 6 Abhinav Dahiya 2020-05-26 16:44:03 UTC
The installer PR has merged.

Comment 7 jima 2020-06-01 06:58:41 UTC
@davis phillips, there is no vSAN storage in the QE vSphere environment now. Do you have a vSphere server with vSAN storage that I could use to verify the issue?

Comment 8 jima 2020-06-10 02:40:01 UTC
Verified the issue on nightly build 4.5.0-0.nightly-2020-06-09-030606; the task of cloning the master VM is launched only once.

Comment 10 errata-xmlrpc 2020-07-13 17:28:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409