Bug 1908171 - GCP: Installation fails when installing cluster with custom machine type (n1-custom-4-16384)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Patrick Dillon
QA Contact: To Hung Sze
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-16 01:40 UTC by To Hung Sze
Modified: 2021-02-24 15:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The installer runs the "terraform apply" command a second time in order to destroy the bootstrap resources.
Consequence: Installing a GCP cluster with a custom machine type fails because the Terraform implementation falsely interprets the second apply as a change to the machine type of the existing VM.
Fix: Terraform is told to ignore lifecycle changes to the machine type.
Result: Installation with custom machine types succeeds.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:44:54 UTC
Target Upstream Version:
Embargoed:


Attachments
.openshift-install.log (128.16 KB, text/plain)
2020-12-16 01:40 UTC, To Hung Sze


Links
Github openshift installer pull 4496 (status: closed; last updated 2021-01-11 21:14:53 UTC): Bug 1908171: fix Terraform issue with GCP custom machine types
Red Hat Product Errata RHSA-2020:5633 (last updated 2021-02-24 15:45:16 UTC)

Description To Hung Sze 2020-12-16 01:40:25 UTC
Created attachment 1739499 [details]
.openshift-install.log

(This defect was originally reported as an install failure with a long cluster name combined with a custom machine type. It turns out that the custom type n1-custom-4-16384 alone triggers it.)

Version:
openshift-install-linux-4.7.0-0.nightly-2020-12-14-080124


Platform: GCP


Install-config used:
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform: 
    gcp:
      type: n1-custom-4-16384
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform: 
    gcp:
      type: n1-custom-4-16384
  replicas: 3
metadata:
  creationTimestamp: null
  name: tszegcp121520d-1234567890


Install fails with:
time="2020-12-15T20:32:26-05:00" level=error msg="Error: Changing the machine_type, min_cpu_platform, service_account, or enable display on a started instance requires stopping it. To acknowledge this, please set allow_stopping_for_update = true in your config. You can also stop it by setting desired_status = \"TERMINATED\", but the instance will not be restarted after the update."
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error msg="  on ../../../../../tmp/openshift-install-672567699/master/main.tf line 31, in resource \"google_compute_instance\" \"master\":"
time="2020-12-15T20:32:26-05:00" level=error msg="  31: resource \"google_compute_instance\" \"master\" {"
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error msg="Error: Changing the machine_type, min_cpu_platform, service_account, or enable display on a started instance requires stopping it. To acknowledge this, please set allow_stopping_for_update = true in your config. You can also stop it by setting desired_status = \"TERMINATED\", but the instance will not be restarted after the update."
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error msg="  on ../../../../../tmp/openshift-install-672567699/master/main.tf line 31, in resource \"google_compute_instance\" \"master\":"
time="2020-12-15T20:32:26-05:00" level=error msg="  31: resource \"google_compute_instance\" \"master\" {"
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error msg="Error: Changing the machine_type, min_cpu_platform, service_account, or enable display on a started instance requires stopping it. To acknowledge this, please set allow_stopping_for_update = true in your config. You can also stop it by setting desired_status = \"TERMINATED\", but the instance will not be restarted after the update."
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error msg="  on ../../../../../tmp/openshift-install-672567699/master/main.tf line 31, in resource \"google_compute_instance\" \"master\":"
time="2020-12-15T20:32:26-05:00" level=error msg="  31: resource \"google_compute_instance\" \"master\" {"
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=error
time="2020-12-15T20:32:26-05:00" level=fatal msg="failed disabling bootstrap load balancing: failed to apply Terraform: failed to complete the change"


Note:
Installations with the long name and custom type separately both succeed.

Comment 1 To Hung Sze 2020-12-16 01:46:11 UTC
I have the must-gather.
Please let me know if it can help.

Comment 2 Patrick Dillon 2020-12-16 18:11:28 UTC
Note, this is happening during bootstrap destroy. 

Similar PR: https://github.com/openshift/installer/pull/2325 fixes BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1746119

Comment 3 Patrick Dillon 2020-12-18 22:48:03 UTC
I reproduced this with both a 3-character cluster name and a 25-character cluster name, so I think the cluster name is unrelated. The fix should be similar to the one in https://bugzilla.redhat.com/show_bug.cgi?id=1908171#c2, but for machine_type. A PR should be up soon.
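
For illustration only, here is a minimal sketch of what ignoring machine_type changes could look like on the failing resource. It assumes the standard Terraform lifecycle meta-argument ignore_changes; the variable name var.machine_type and the elided arguments are hypothetical, and the actual change in the installer PR may differ.

```hcl
# Hypothetical excerpt modeled on the resource named in the error output
# (master/main.tf). Only the parts relevant to this bug are shown, so this
# fragment is not applicable on its own.
resource "google_compute_instance" "master" {
  # Assumed variable name; for a custom type this would be a value such as
  # "n1-custom-4-16384".
  machine_type = var.machine_type

  # ... other required arguments (name, zone, boot_disk, network_interface) omitted ...

  lifecycle {
    # Ignore any diff that Terraform computes for machine_type, so that a
    # second "terraform apply" (for example during bootstrap destroy) does not
    # attempt to change the machine type of an already running instance.
    ignore_changes = [machine_type]
  }
}
```

The error message suggests allow_stopping_for_update = true as another way to acknowledge the change, but that would let Terraform stop a running control plane instance, which is presumably why ignoring the diff is the preferable direction here.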

Comment 5 To Hung Sze 2021-01-07 17:45:57 UTC
Verified with 4.7 fc1
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    gcp:
      type: n1-custom-4-16384
  replicas: 3
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
     gcp:
       type: n1-custom-4-16384
  replicas: 3
metadata:
  creationTimestamp: null
  name: tszegcp010720c-1234567890

Comment 8 errata-xmlrpc 2021-02-24 15:44:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

