Bug 2041882 - cloud-network-config operator does not work properly on a GCP workload identity cluster
Summary: cloud-network-config operator does not work properly on a GCP workload identity cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Casey Callendrello
QA Contact: wang lin
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-18 13:03 UTC by wang lin
Modified: 2022-03-10 16:40 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:40:33 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cloud-network-config-controller pull 18 0 None open Bug 2041882: gcp: retrieve project from resources, not from the credentials 2022-01-18 14:36:00 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:40:44 UTC

Description wang lin 2022-01-18 13:03:20 UTC
Description of problem:
The CNCC (cloud-network-config-controller) logs the following error on a workload identity cluster:
E0118 11:36:06.891863       1 controller.go:165] error syncing 'lwanstsg0118f-tvpsk-master-2.c.openshift-qe.internal': error retrieving the private IP configuration for node: lwanstsg0118f-tvpsk-master-2.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 400: Invalid resource field value in the request.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadata": {
      "method": "compute.v1.SubnetworksService.Get",
      "service": "compute.googleapis.com"
    },
    "reason": "RESOURCE_PROJECT_INVALID"
  }
]
, invalidParameter, requeuing in node workqueue
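
The reason RESOURCE_PROJECT_INVALID suggests that the project ID sent with the compute.v1.SubnetworksService.Get request was empty or invalid. The linked pull request title ("retrieve project from resources, not from the credentials") points at reading the project from a GCP resource reference instead. The following Go snippet is only a minimal sketch of that idea, not the controller's actual code: the helper name projectFromResourceURL, the example URL, and the assumption that the resource URL contains a "projects/<project>" path segment are all illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// projectFromResourceURL is a hypothetical helper (not taken from the
// controller source). It returns the path segment that follows "projects"
// in a GCP resource URL such as:
//   https://www.googleapis.com/compute/v1/projects/<project>/regions/<region>/subnetworks/<name>
func projectFromResourceURL(resourceURL string) (string, error) {
	parts := strings.Split(resourceURL, "/")
	for i, p := range parts {
		if p == "projects" && i+1 < len(parts) && parts[i+1] != "" {
			return parts[i+1], nil
		}
	}
	return "", fmt.Errorf("no project found in resource URL %q", resourceURL)
}

func main() {
	// Illustrative URL only; the region and subnet name are made up.
	url := "https://www.googleapis.com/compute/v1/projects/openshift-qe/regions/us-central1/subnetworks/example-subnet"
	project, err := projectFromResourceURL(url)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(project)
}
```

Deriving the project from the resource itself avoids depending on a project ID field in the credentials, which may not be populated on a workload identity cluster.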


Version-Release number of selected component (if applicable):
4.10

How reproducible:
always

Steps to Reproduce:
1. Install a workload identity cluster by following https://github.com/openshift/cloud-credential-operator/blob/master/docs/gcp_workload_identity.md

Actual results:
The CNCC pod logs the following error:
E0118 11:36:06.891863       1 controller.go:165] error syncing 'lwanstsg0118f-tvpsk-master-2.c.openshift-qe.internal': error retrieving the private IP configuration for node: lwanstsg0118f-tvpsk-master-2.c.openshift-qe.internal, err: error retrieving the network interface subnets, err: googleapi: Error 400: Invalid resource field value in the request.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadata": {
      "method": "compute.v1.SubnetworksService.Get",
      "service": "compute.googleapis.com"
    },
    "reason": "RESOURCE_PROJECT_INVALID"
  }
]
, invalidParameter, requeuing in node workqueue

Expected results:
The CNCC should sync the nodes without errors.

Additional info:

Comment 4 wang lin 2022-01-20 09:05:35 UTC
The above issue has been fixed; see the log below:

###
$ oc logs cloud-network-config-controller-9d4b9fcdb-5tblr
W0120 08:16:10.845142       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0120 08:16:10.855402       1 leaderelection.go:248] attempting to acquire leader lease openshift-cloud-network-config-controller/cloud-network-config-controller-lock...
E0120 08:16:40.856734       1 leaderelection.go:330] error retrieving resource lock openshift-cloud-network-config-controller/cloud-network-config-controller-lock: Get "https://api-int.lwangcp-sts-0120.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces/openshift-cloud-network-config-controller/configmaps/cloud-network-config-controller-lock": dial tcp: i/o timeout
E0120 08:17:40.765575       1 leaderelection.go:330] error retrieving resource lock openshift-cloud-network-config-controller/cloud-network-config-controller-lock: Get "https://api-int.lwangcp-sts-0120.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces/openshift-cloud-network-config-controller/configmaps/cloud-network-config-controller-lock": dial tcp: lookup api-int.lwangcp-sts-0120.qe.gcp.devcluster.openshift.com on 172.30.0.10:53: read udp 10.130.0.7:53589->172.30.0.10:53: i/o timeout
E0120 08:18:36.117027       1 leaderelection.go:330] error retrieving resource lock openshift-cloud-network-config-controller/cloud-network-config-controller-lock: Get "https://api-int.lwangcp-sts-0120.qe.gcp.devcluster.openshift.com:6443/api/v1/namespaces/openshift-cloud-network-config-controller/configmaps/cloud-network-config-controller-lock": dial tcp 10.0.0.2:6443: connect: connection refused
I0120 08:19:22.885594       1 leaderelection.go:258] successfully acquired lease openshift-cloud-network-config-controller/cloud-network-config-controller-lock
I0120 08:19:22.886712       1 controller.go:88] Starting node controller
I0120 08:19:22.886800       1 controller.go:91] Waiting for informer caches to sync for node workqueue
I0120 08:19:22.886857       1 controller.go:88] Starting secret controller
I0120 08:19:22.886896       1 controller.go:91] Waiting for informer caches to sync for secret workqueue
I0120 08:19:22.886941       1 controller.go:88] Starting cloud-private-ip-config controller
I0120 08:19:22.886969       1 controller.go:91] Waiting for informer caches to sync for cloud-private-ip-config workqueue
I0120 08:19:22.901687       1 controller.go:182] Assigning key: lwangcp-sts-0120-z7b69-master-0.c.openshift-qe.internal to node workqueue
I0120 08:19:22.901717       1 controller.go:182] Assigning key: lwangcp-sts-0120-z7b69-master-1.c.openshift-qe.internal to node workqueue
I0120 08:19:22.901726       1 controller.go:182] Assigning key: lwangcp-sts-0120-z7b69-master-2.c.openshift-qe.internal to node workqueue
I0120 08:19:22.987638       1 controller.go:96] Starting cloud-private-ip-config workers
I0120 08:19:22.987822       1 controller.go:102] Started cloud-private-ip-config workers
I0120 08:19:22.987734       1 controller.go:96] Starting node workers
I0120 08:19:22.987771       1 controller.go:96] Starting secret workers
I0120 08:19:22.987961       1 controller.go:102] Started secret workers
I0120 08:19:22.987904       1 controller.go:102] Started node workers
I0120 08:19:23.443729       1 node_controller.go:106] Setting annotation: 'cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.0.0/17"},"capacity":{"ip":10}}]' on node: lwangcp-sts-0120-z7b69-master-0.c.openshift-qe.internal
I0120 08:19:23.500482       1 controller.go:160] Dropping key 'lwangcp-sts-0120-z7b69-master-0.c.openshift-qe.internal' from the node workqueue
I0120 08:19:23.508257       1 node_controller.go:106] Setting annotation: 'cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.0.0/17"},"capacity":{"ip":10}}]' on node: lwangcp-sts-0120-z7b69-master-1.c.openshift-qe.internal
I0120 08:19:23.518226       1 node_controller.go:106] Setting annotation: 'cloud.network.openshift.io/egress-ipconfig: [{"interface":"nic0","ifaddr":{"ipv4":"10.0.0.0/17"},"capacity":{"ip":10}}]' on node: lwangcp-sts-0120-z7b69-master-2.c.openshift-qe.internal
I0120 08:19:23.574159       1 controller.go:160] Dropping key 'lwangcp-sts-0120-z7b69-master-1.c.openshift-qe.internal' from the node workqueue
I0120 08:19:23.576568       1 controller.go:160] Dropping key 'lwangcp-sts-0120-z7b69-master-2.c.openshift-qe.internal' from the node workqueue
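
The annotation value written in the log above is a JSON array, so a verification script can decode it and check the subnet and capacity per interface. The following Go snippet is a minimal sketch under stated assumptions: the struct and function names are hypothetical, and the field layout is inferred only from the annotation value shown in this log, not from the controller's type definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The types below are assumptions inferred from the logged annotation value.
type ifAddr struct {
	IPv4 string `json:"ipv4,omitempty"`
}

type capacity struct {
	IP int `json:"ip,omitempty"`
}

type egressIPConfig struct {
	Interface string   `json:"interface"`
	IFAddr    ifAddr   `json:"ifaddr"`
	Capacity  capacity `json:"capacity"`
}

// parseEgressIPConfig (hypothetical name) decodes the value of the
// cloud.network.openshift.io/egress-ipconfig annotation.
func parseEgressIPConfig(annotation string) ([]egressIPConfig, error) {
	var configs []egressIPConfig
	if err := json.Unmarshal([]byte(annotation), &configs); err != nil {
		return nil, fmt.Errorf("decoding egress-ipconfig annotation: %w", err)
	}
	return configs, nil
}

func main() {
	// Annotation value copied from the log above.
	value := `[{"interface":"nic0","ifaddr":{"ipv4":"10.0.0.0/17"},"capacity":{"ip":10}}]`
	configs, err := parseEgressIPConfig(value)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range configs {
		fmt.Printf("interface=%s ipv4=%s capacity=%d\n", c.Interface, c.IFAddr.IPv4, c.Capacity.IP)
	}
}
```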

Comment 7 errata-xmlrpc 2022-03-10 16:40:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

