Bug 1807125 - Unable to add a new master gcp
Summary: Unable to add a new master gcp
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Alberto
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks: 1812815
Reported: 2020-02-25 16:28 UTC by Alay Patel
Modified: 2023-09-14 05:53 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1812815 (view as bug list)
Environment:
Last Closed: 2020-08-27 22:34:57 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-api-operator pull 513 | 0 | None | closed | BUG 1807125: Add GCP roles/compute.loadBalancerAdmin role | 2020-09-16 06:31:02 UTC

Description Alay Patel 2020-02-25 16:28:17 UTC
Description of problem:

To support the disaster recovery scenario of replacing one failed master (where the VM disappears from the underlying infrastructure), we need to be able to provision a new master node on IPI clusters using the `machine` resource. I tried to create a machine resource on GCP, but it fails with an error like:


-----
I0220 16:36:10.612023       1 actuator.go:80] alpate-hqg7k-m-3: Checking if machine exists
I0220 16:36:11.043501       1 controller.go:260] alpate-hqg7k-m-3: reconciling machine triggers idempotent update
I0220 16:36:11.043706       1 actuator.go:98] alpate-hqg7k-m-3: Updating machine
I0220 16:36:11.225755       1 reconciler.go:372] alpate-hqg7k-m-3: reconciling instance for targetpool with cloud provider; desired state: true
I0220 16:36:11.751938       1 machine_scope.go:159] alpate-hqg7k-m-3: status unchanged
E0220 16:36:11.759837       1 controller.go:262] alpate-hqg7k-m-3: error updating machine: failed to add instance alpate-hqg7k-m-3 to target pool alpate-hqg7k-api: googleapi: Error 403: Required 'compute.targetPools.addInstance' permission for 'projects/openshift-gce-devel/regions/us-east1/targetPools/alpate-hqg7k-api', forbidden


Note: A similar step works for AWS
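
For triage, the missing permission and the affected resource can be pulled out of the 403 message with a small script. The following is only a sketch, not part of the machine-api-operator: the function name `missing_permission` and the regular expression are illustrative assumptions, and it assumes the wrapped log line has been joined back into a single line.

```python
import re

# Matches the "Required '<permission>' permission for '<resource>'" fragment
# seen in the googleapi 403 message quoted in this report.
# The pattern is an assumption based on that single message.
PERMISSION_RE = re.compile(r"Required '([^']+)' permission for '([^']+)'")


def missing_permission(log_line):
    """Return (permission, resource) from a 403 message, or None if absent."""
    match = PERMISSION_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The error line from this report, joined into one string.
line = (
    "E0220 16:36:11.759837       1 controller.go:262] alpate-hqg7k-m-3: "
    "error updating machine: failed to add instance alpate-hqg7k-m-3 to "
    "target pool alpate-hqg7k-api: googleapi: Error 403: Required "
    "'compute.targetPools.addInstance' permission for "
    "'projects/openshift-gce-devel/regions/us-east1/targetPools/alpate-hqg7k-api', "
    "forbidden"
)

print(missing_permission(line))
```

Here the extracted permission is `compute.targetPools.addInstance`, which is consistent with the linked pull request adding the `roles/compute.loadBalancerAdmin` role.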

Comment 5 sunzhaohua 2020-03-27 03:07:54 UTC
Verified
clusterversion: 4.5.0-0.nightly-2020-03-26-211208

Created a new master machine; it was created successfully and joined the cluster.
$ oc get node
NAME                                             STATUS   ROLES    AGE     VERSION
zhsun5-6f4rk-m-0.c.openshift-qe.internal         Ready    master   45m     v1.17.1
zhsun5-6f4rk-m-00.c.openshift-qe.internal        Ready    master   9m48s   v1.17.1
zhsun5-6f4rk-m-1.c.openshift-qe.internal         Ready    master   45m     v1.17.1
zhsun5-6f4rk-m-2.c.openshift-qe.internal         Ready    master   45m     v1.17.1
zhsun5-6f4rk-w-a-92p5n.c.openshift-qe.internal   Ready    worker   33m     v1.17.1
zhsun5-6f4rk-w-b-txrc5.c.openshift-qe.internal   Ready    worker   33m     v1.17.1
zhsun5-6f4rk-w-c-zphwl.c.openshift-qe.internal   Ready    worker   33m     v1.17.1

I0327 02:52:01.131989       1 controller.go:163] zhsun5-6f4rk-m-00: reconciling Machine
I0327 02:52:01.143928       1 controller.go:282] controller-runtime/controller "msg"="Successfully Reconciled"  "controller"="machine_controller" "request"={"Namespace":"openshift-machine-api","Name":"zhsun5-6f4rk-m-00"}
I0327 02:52:01.143986       1 controller.go:163] zhsun5-6f4rk-m-00: reconciling Machine
I0327 02:52:01.143997       1 actuator.go:75] zhsun5-6f4rk-m-00: Checking if machine exists
I0327 02:52:01.550094       1 reconciler.go:302] zhsun5-6f4rk-m-00: Machine does not exist
I0327 02:52:01.550125       1 controller.go:419] zhsun5-6f4rk-m-00: going into phase "Provisioning"
I0327 02:52:01.558883       1 controller.go:307] zhsun5-6f4rk-m-00: reconciling machine triggers idempotent create
I0327 02:52:01.560790       1 actuator.go:57] zhsun5-6f4rk-m-00: Creating machine
I0327 02:52:03.426919       1 reconciler.go:168] zhsun5-6f4rk-m-00: Reconciling machine object with cloud state
I0327 02:52:03.632213       1 reconciler.go:216] zhsun5-6f4rk-m-00: machine status is "PROVISIONING", requeuing...
I0327 02:52:03.632329       1 machine_scope.go:161] "zhsun5-6f4rk-m-00": patching machine
W0327 02:52:03.650669       1 controller.go:309] zhsun5-6f4rk-m-00: failed to create machine: requeue in: 20s
I0327 02:52:03.650697       1 controller.go:400] Actuator returned requeue-after error: requeue in: 20s
I0327 02:52:03.650738       1 controller.go:163] zhsun5-6f4rk-m-00: reconciling Machine
I0327 02:52:03.650744       1 actuator.go:75] zhsun5-6f4rk-m-00: Checking if machine exists
I0327 02:52:03.655245       1 recorder.go:52] controller-runtime/manager/events "msg"="Warning"  "message"="requeue in: 20s" "object"={"kind":"Machine","namespace":"openshift-machine-api","name":"zhsun5-6f4rk-m-00","uid":"825a4ec6-a0d4-4701-9572-0f66180b2854","apiVersion":"machine.openshift.io/v1beta1","resourceVersion":"26443"} "reason"="FailedCreate"
I0327 02:52:03.963629       1 controller.go:271] zhsun5-6f4rk-m-00: reconciling machine triggers idempotent update
I0327 02:52:03.963660       1 actuator.go:92] zhsun5-6f4rk-m-00: Updating machine
I0327 02:52:04.188030       1 reconciler.go:372] zhsun5-6f4rk-m-00: reconciling instance for targetpool with cloud provider; desired state: true
I0327 02:52:04.826296       1 reconciler.go:168] zhsun5-6f4rk-m-00: Reconciling machine object with cloud state
I0327 02:52:04.948397       1 reconciler.go:216] zhsun5-6f4rk-m-00: machine status is "PROVISIONING", requeuing...
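
When reading controller logs like the ones above, it can help to separate warnings and errors from informational lines. The sketch below assumes the usual klog header layout (severity letter, month and day, time, thread id, `file:line]`, then the message); the names `parse_klog` and `non_info` are hypothetical helpers, not part of any OpenShift tooling.

```python
import re

# klog header: severity letter, mmdd, time, thread id, file:line, then message.
KLOG_RE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})"
    r"\s+(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_klog(line):
    """Split one klog line into its header fields, or return None."""
    match = KLOG_RE.match(line)
    return match.groupdict() if match else None


def non_info(lines):
    """Keep only parsed entries whose severity is not I (info)."""
    parsed = (parse_klog(entry) for entry in lines)
    return [p for p in parsed if p and p["severity"] != "I"]


# Three consecutive lines taken from the verification log above.
sample = [
    'I0327 02:52:03.632213       1 reconciler.go:216] zhsun5-6f4rk-m-00: '
    'machine status is "PROVISIONING", requeuing...',
    "W0327 02:52:03.650669       1 controller.go:309] zhsun5-6f4rk-m-00: "
    "failed to create machine: requeue in: 20s",
    "I0327 02:52:03.650697       1 controller.go:400] Actuator returned "
    "requeue-after error: requeue in: 20s",
]

for entry in non_info(sample):
    print(entry["severity"], entry["file"], entry["line"], entry["message"])
```

In this log the only non-info line is the `failed to create machine: requeue in: 20s` warning, which appears while the instance is still in the "PROVISIONING" state and is followed by a requeue rather than a permission error.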

Comment 6 Luke Meyer 2020-08-27 22:34:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

Comment 7 Red Hat Bugzilla 2023-09-14 05:53:21 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days