Bug 2001008 - [MachineSets] CloneMode defaults to linkedClone, but I don't have snapshot and should be fullClone
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: dmoiseev
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-09-03 14:24 UTC by Felipe Campos
Modified: 2022-03-10 16:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, if the VM template had snapshots, an incorrect disk size was picked because linkedClone was used by default in that case. The default clone mode has been changed to fullClone in all situations; linkedClone must now be specified explicitly in the provider spec by the user.
Clone Of:
Environment:
Last Closed: 2022-03-10 16:07:08 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-api-operator pull 959 | 0 | None | open | Bug 2001008: Change default cloneMode to fullClone. | 2021-11-23 14:10:35 UTC
Red Hat Product Errata | RHSA-2022:0056 | 0 | None | None | None | 2022-03-10 16:07:24 UTC

Description Felipe Campos 2021-09-03 14:24:39 UTC
Description of problem:

After updating a worker MachineSet's providerSpec -> value -> diskGiB (e.g. to 128) and then scaling up, or deleting a machine so that it is recreated, the new VM's disk is not created with the customized size. It keeps the template disk size (16 GB).

Version-Release number of selected component (if applicable):
OCP 4.8.9 / kubernetes 1.21

How reproducible:

1 - Create or update a cluster to 4.8.9
2 - oc get machinesets -n openshift-machine-api ocp-idcluster-worker -oyaml

spec:
...
  template:
  ...
    spec:
    ...
      providerSpec:
        value:
          apiVersion: vsphereprovider.openshift.io/v1beta1
          diskGiB: 120

3 - Update diskGiB: oc edit machinesets -n openshift-machine-api ocp-idcluster-worker

spec:
...
  template:
  ...
    spec:
    ...
      providerSpec:
        value:
          apiVersion: vsphereprovider.openshift.io/v1beta1
          diskGiB: 200

4 - Scale up the MachineSet, or delete a machine, so that a new machine is created with the new disk size.

5 - The new machine (VM) keeps the default 16 GB disk from the template.


Actual results:

The VM/Machine is not created with the 200 GB disk; it keeps the 16 GB disk from the default template.

Expected results:

The VM/Machine should be created with the disk size specified in diskGiB.

Additional info:

A workaround was to add the cloneMode value to the providerSpec, set to fullClone:

spec:
...
  template:
  ...
    spec:
    ...
      providerSpec:
        value:
          apiVersion: vsphereprovider.openshift.io/v1beta1
          cloneMode: fullClone
          diskGiB: 200

Andrew Sullivan, in the Kubernetes Slack channel #openshift-dev, suggested that I open a BZ to verify whether this is a bug.
I had a look at: https://github.com/openshift/machine-api-operator/blob/master/pkg/apis/vsphereprovider/v1beta1/vsphereproviderconfig_types.go

> // CloneMode specifies the type of clone operation.
>	// The LinkedClone mode is only support for templates that have at least
>	// one snapshot. If the template has no snapshots, then CloneMode defaults
>	// to FullClone.
>	// When LinkedClone mode is enabled the DiskGiB field is ignored as it is
>	// not possible to expand disks of linked clones.
>	// Defaults to LinkedClone, but fails gracefully to FullClone if the source
>	// of the clone operation has no snapshots.
>	// +optional
>	CloneMode CloneMode `json:"cloneMode,omitempty"`

Also at https://www.vmware.com/support/ws5/doc/ws_clone_typeofclone.html
Also at https://sourcegraph.com/github.com/kubernetes-sigs/cluster-api-provider-vsphere@556d6cc465b44e7f288c423d37ac9cfc7424d276/-/blob/api/v1alpha3/types.go?L43:25

const (
	// FullClone indicates a VM will have no relationship to the source of the
	// clone operation once the operation is complete. This is the safest clone
	// mode, but it is not the fastest.
	FullClone CloneMode = "fullClone"

	// LinkedClone means resulting VMs will be dependent upon the snapshot of
	// the source VM/template from which the VM was cloned. This is the fastest
	// clone mode, but it also prevents expanding a VMs disk beyond the size of
	// the source VM/template.
	LinkedClone CloneMode = "linkedClone"
)
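
Taken together, the two quoted comments explain the symptom. The following is a minimal sketch of the documented pre-fix behavior, not code from machine-api-operator: the helpers resolveCloneMode and resultingDiskGiB are hypothetical names introduced for illustration, and the rule that a full clone never ends up smaller than the template disk is an assumption.

```go
package main

import "fmt"

// CloneMode mirrors the string type quoted above.
type CloneMode string

const (
	FullClone   CloneMode = "fullClone"
	LinkedClone CloneMode = "linkedClone"
)

// resolveCloneMode is a hypothetical helper that models the quoted field
// comment: an unset mode defaults to linkedClone, and linkedClone falls
// back to fullClone when the template has no snapshots.
func resolveCloneMode(requested CloneMode, templateSnapshots int) CloneMode {
	mode := requested
	if mode == "" {
		mode = LinkedClone
	}
	if mode == LinkedClone && templateSnapshots == 0 {
		return FullClone
	}
	return mode
}

// resultingDiskGiB is a hypothetical helper. With linkedClone the requested
// diskGiB is ignored, as the quoted comment states. With fullClone the disk
// is grown to the requested size; keeping the template size when the request
// is smaller is an assumption of this sketch.
func resultingDiskGiB(mode CloneMode, templateGiB, requestedGiB int32) int32 {
	if mode == LinkedClone {
		return templateGiB
	}
	if requestedGiB > templateGiB {
		return requestedGiB
	}
	return templateGiB
}

func main() {
	// Template with one snapshot, cloneMode unset, diskGiB: 200.
	mode := resolveCloneMode("", 1)
	fmt.Println(mode, resultingDiskGiB(mode, 16, 200)) // linkedClone 16

	// Workaround from this report: cloneMode: fullClone.
	mode = resolveCloneMode(FullClone, 1)
	fmt.Println(mode, resultingDiskGiB(mode, 16, 200)) // fullClone 200
}
```

Under these assumptions, an unset cloneMode combined with a template snapshot silently discards diskGiB, which matches the 16 GB disks reported here.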

Comment 1 dmoiseev 2021-11-23 14:09:08 UTC
I reproduced this.
linkedClone is used if the template has snapshots or if an existing Snapshot is defined in the providerSpec.

To reproduce:
- install ocp cluster
- make a snapshot of vm template
- delete machine
The new machine is created with a 16 GiB disk; the value from the provider spec is ignored.

Comment 4 sunzhaohua 2022-01-10 09:34:29 UTC
Verified
Reproduced in 4.8.9: the new machine is created with a 16 GiB disk, and the value from the provider spec is ignored.
Verified in clusterversion: 4.10.0-0.nightly-2022-01-09-195852

Steps:
- install ocp cluster
- update diskGiB: oc edit machineset zhsun410-gwvp9-worker

      providerSpec:
        value:
          apiVersion: vsphereprovider.openshift.io/v1beta1
          diskGiB: 200
- make a snapshot of vm template
- delete machine

The new machine is created with the updated diskGiB.
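
For comparison, the defaulting rule described in the Doc Text field of this bug can be sketched as follows. This is an illustration only and not the code from the linked pull request; defaultCloneMode is a hypothetical name.

```go
package main

import "fmt"

// CloneMode mirrors the string type quoted in the description.
type CloneMode string

const (
	FullClone   CloneMode = "fullClone"
	LinkedClone CloneMode = "linkedClone"
)

// defaultCloneMode is a hypothetical helper modeling the post-fix rule:
// an unset cloneMode becomes fullClone regardless of template snapshots,
// and linkedClone is used only when the user sets it explicitly.
func defaultCloneMode(requested CloneMode) CloneMode {
	if requested == "" {
		return FullClone
	}
	return requested
}

func main() {
	fmt.Println(defaultCloneMode(""))          // fullClone
	fmt.Println(defaultCloneMode(LinkedClone)) // linkedClone
}
```

With this rule, a template snapshot no longer changes the outcome when cloneMode is unset, which is consistent with the verification result above.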

Comment 7 errata-xmlrpc 2022-03-10 16:07:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

