Bug 2014622 - Special Resource Operator(SRO) - Fail to deploy simple-kmod on GCP
Summary: Special Resource Operator(SRO) - Fail to deploy simple-kmod on GCP
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Special Resource Operator
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 4.9.z
Assignee: Pablo Acevedo
QA Contact: liqcui
URL:
Whiteboard:
Duplicates: 2014275 (view as bug list)
Depends On: 1996916
Blocks:
 
Reported: 2021-10-15 16:12 UTC by OpenShift BugZilla Robot
Modified: 2021-10-26 20:12 UTC
CC: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Failed to connect to metadata.google.internal port 80: Connection refused
Consequence: No simple-kmod pod created
Fix: Fixed in BZ 2014622
Result: simple-kmod pod created
Clone Of:
Environment:
Last Closed: 2021-10-26 17:23:10 UTC
Target Upstream Version:
Embargoed:




Links:
- GitHub: openshift/special-resource-operator pull 61 (open) — [release-4.9] Bug 2014622: Special Resource Operator(SRO) - Fail to deploy simple-kmod on GCP (last updated 2021-10-15 23:55:49 UTC)
- Red Hat Product Errata: RHBA-2021:3935 (last updated 2021-10-26 17:23:33 UTC)

Description OpenShift BugZilla Robot 2021-10-15 16:12:15 UTC
+++ This bug was initially created as a clone of Bug #1996916 +++

Description of problem:


Version-Release number of selected component (if applicable):
4.9.0-fc.0

How reproducible:


Steps to Reproduce:
1. Setup OCP4.9 on GCP
2. Deploy SRO from OperatorHub and create a simple-kmod instance


Actual results:
No Simple-kmod POD created

Expected results:
simple-kmod pod should be created normally

Additional info:
[ocpadmin@ec2-18-217-45-133 nto]$ oc logs -f special-resource-controller-manager-76758fbfd4-2fhrs -n openshift-operators
error: a container name must be specified for pod special-resource-controller-manager-76758fbfd4-2fhrs, choose one of: [kube-rbac-proxy manager]
[ocpadmin@ec2-18-217-45-133 nto]$ oc logs -f special-resource-controller-manager-76758fbfd4-2fhrs -n openshift-operators -c manager
I0823 15:00:02.375489       1 request.go:668] Waited for 1.047736909s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/nfd.openshift.io/v1?timeout=32s
2021-08-23T15:00:03.725Z        INFO    controller-runtime.metrics      metrics server is starting to listen   {"addr": "127.0.0.1:8080"}
2021-08-23T15:00:03.780Z        INFO    setup   starting manager
I0823 15:00:03.780477       1 leaderelection.go:243] attempting to acquire leader lease openshift-operators/b6ae617b.openshift.io...
2021-08-23T15:00:03.780Z        INFO    controller-runtime.manager      starting metrics server        {"path": "/metrics"}
I0823 15:00:21.780840       1 leaderelection.go:253] successfully acquired lease openshift-operators/b6ae617b.openshift.io
2021-08-23T15:00:21.780Z        DEBUG   controller-runtime.manager.events       Normal{"object": {"kind":"ConfigMap","namespace":"openshift-operators","name":"b6ae617b.openshift.io","uid":"a4b4ea4a-c8a8-44e3-96c3-b3262b2ce7ad","apiVersion":"v1","resourceVersion":"204878"}, "reason": "LeaderElection", "message": "special-resource-controller-manager-76758fbfd4-2fhrs_0214bbac-9868-4be4-b314-20427b8be7d4 became leader"}
2021-08-23T15:00:21.781Z        DEBUG   controller-runtime.manager.events       Normal{"object": {"kind":"Lease","namespace":"openshift-operators","name":"b6ae617b.openshift.io","uid":"c6750a42-a7ed-498a-a10b-475413a715e7","apiVersion":"coordination.k8s.io/v1","resourceVersion":"204879"}, "reason": "LeaderElection", "message": "special-resource-controller-manager-76758fbfd4-2fhrs_0214bbac-9868-4be4-b314-20427b8be7d4 became leader"}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting EventSource    {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "source": "kind source: /, Kind="}
2021-08-23T15:00:21.781Z        INFO    controller-runtime.manager.controller.specialresource  Starting Controller     {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource"}
2021-08-23T15:00:21.982Z        INFO    filter          CREATE IsSpecialResource      {"GenerationChanged": "simple-kmod"}
2021-08-23T15:00:22.380Z        INFO    filter          CREATE IsSpecialResource      {"GenerationChanged": "special-resource-controller-manager"}
2021-08-23T15:00:24.278Z        INFO    filter          CREATE IsSpecialResource      {"GenerationChanged": "special-resource-controller-manager-76758fbfd4-2fhrs"}
2021-08-23T15:00:25.373Z        INFO    controller-runtime.manager.controller.specialresource  Starting workers        {"reconciler group": "sro.openshift.io", "reconciler kind": "SpecialResource", "worker count": 1}
2021-08-23T15:00:25.374Z        INFO    preamble        Controller Request      {"Name": "simple-kmod", "Namespace": ""}
2021-08-23T15:00:25.577Z        INFO    cache   Nodes cached    {"name": "liqcui-ocgcp01-h25hk-worker-a-whrpr.c.openshift-qe.internal"}
2021-08-23T15:00:25.577Z        INFO    cache   Nodes cached    {"name": "liqcui-ocgcp01-h25hk-worker-b-4sjbz.c.openshift-qe.internal"}
2021-08-23T15:00:25.577Z        INFO    cache   Nodes cached    {"name": "liqcui-ocgcp01-h25hk-worker-c-lml7k.c.openshift-qe.internal"}
2021-08-23T15:00:25.577Z        INFO    cache   Node list:      {"length": 3}
2021-08-23T15:00:25.577Z        INFO    cache   Nodes   {"num": 3}
2021-08-23T15:00:25.776Z        INFO    upgrade         History {"entry": "registry.ci.openshift.org/ocp/release@sha256:7e7fdee3d2a4ea95db5e356fed533fc80e350544633f175d438072e51ae8f0ff"}
E0823 15:00:32.427535       1 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors
2021-08-23T15:04:41.798Z        INFO    filter          DELETE IsSpecialResource      {"GenerationChanged": "simple-kmod"}
2021-08-23T15:05:02.150Z        INFO    filter          CREATE IsSpecialResource      {"GenerationChanged": "simple-kmod"}

--- Additional comment from pacevedo on 2021-08-27 09:51:45 UTC ---

After reproducing the issue, here is the sequence of events leading to this situation:
- After installing NFD and SRO, the simple-kmod CR is created.

- While trying to reconcile we see the following in the logs:
2021-08-23T15:00:25.776Z        INFO    upgrade         History {"entry": "registry.ci.openshift.org/ocp/release@sha256:7e7fdee3d2a4ea95db5e356fed533fc80e350544633f175d438072e51ae8f0ff"}
E0823 15:00:32.427535       1 aws_credentials.go:77] while getting AWS credentials NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors

- When enabling the verbose errors, we see the following:
E0827 08:02:02.065439       1 aws_credentials.go:81] while getting AWS credentials NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, .
EC2RoleRequestError: no EC2 instance role found
caused by: RequestError: send request failed
caused by: Get "http://169.254.169.254/latest/meta-data/iam/security-credentials/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
While this can be misleading, it is unrelated to the original issue, since we are not on AWS. The failure itself is expected, but logging it as an error is confusing.

- After creating a simple-kmod CR, nothing happens. We delete and recreate it, and it is still not reconciled.

- Looking at the log around the AWS error, it seems there is an infinite loop in the driver-toolkit version logic. Investigating further, we find the following:
I0827 08:01:55.732875       1 plugins.go:50] Creating DockerKeyring
I0827 08:01:55.732917       1 plugins.go:64] Trying out provider .dockercfg
I0827 08:01:55.732928       1 plugins.go:66] Registering credential provider
I0827 08:01:55.732937       1 plugins.go:64] Trying out provider amazon-ecr
E0827 08:02:02.065439       1 aws_credentials.go:81] while getting AWS credentials NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, .
EC2RoleRequestError: no EC2 instance role found
caused by: RequestError: send request failed
caused by: Get "http://169.254.169.254/latest/meta-data/iam/security-credentials/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
I0827 08:02:02.065524       1 plugins.go:64] Trying out provider azure
I0827 08:02:02.065538       1 plugins.go:64] Trying out provider google-container-registry
I0827 08:02:02.065706       1 metadata.go:195] Reading URL with backoff. http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/
I0827 08:02:03.123781       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused

So now we see some trouble with the metadata server in GCE. This is happening within an endless loop, as can be seen here: https://github.com/vdemeester/k8s-pkg-credentialprovider/blob/master/gcp/metadata.go#L193
This issue (https://github.com/vdemeester/k8s-pkg-credentialprovider/issues/14) looks similar, but the situation is not the same, because here the name does resolve to an IP. There is no DNS problem; it's firewalling/routing rules in the network.
Since this runs in an endless backoff loop, the reconcile logic never gets past this function and no CR is ever handled, hence the apparent inactivity we see.
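To make the blocking behavior concrete, here is a minimal Go sketch of that pattern (a simplification for illustration only, not the actual metadata.go code): an unbounded retry loop around the metadata read means a permanently refused connection never returns control to the caller, so the reconciler stalls.

package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// readURLWithBackoff retries with growing delays but no attempt cap,
// so a permanently refused connection blocks the caller forever.
func readURLWithBackoff(url string) ([]byte, error) {
	backoff := time.Second
	for {
		resp, err := http.Get(url)
		if err == nil {
			defer resp.Body.Close()
			return io.ReadAll(resp.Body)
		}
		fmt.Printf("Unable to read URL %s. Error=%v\n", url, err)
		time.Sleep(backoff)
		if backoff < 32*time.Second {
			backoff *= 2
		}
	}
}

func main() {
	// On a pod behind the SDN metadata block, this call never returns.
	_, _ = readURLWithBackoff("http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/")
}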

- If we exec into the pod we can see the same situation:
❯ kubectl exec -it -n openshift-special-resource-operator special-resource-controller-manager-5f8b4449b5-wc7qw -c manager -- /bin/bash
bash-4.4$ cat /sys/class/dmi/id/product_name
Google Compute Engine
bash-4.4$ curl http://metadata.google.internal. -v
* Rebuilt URL to: http://metadata.google.internal./
*   Trying 169.254.169.254...
* TCP_NODELAY set
* connect to 169.254.169.254 port 80 failed: Connection refused
* Failed to connect to metadata.google.internal port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to metadata.google.internal port 80: Connection refused

- If we do the same thing from one of the nodes, the curl succeeds, so access is only allowed from the VMs themselves.

This might be due to a faulty authorization config in GCP when setting up the cluster. Can you please check it?

--- Additional comment from pacevedo on 2021-08-27 09:53:40 UTC ---

Some reference: https://cloud.google.com/kubernetes-engine/docs/how-to/protecting-cluster-metadata

--- Additional comment from liqcui on 2021-09-01 03:42:57 UTC ---

https://groups.google.com/g/openshift/c/MnESJ0MUw1Y — to my understanding, the only way is to mount the cluster's Google Cloud network inside the pod.
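If "mount the cluster's Google Cloud network" is read as running the pod with host networking so it shares the node's network namespace (my reading of the comment above, not something confirmed in this bug), a minimal client-go sketch would look like the following; the pod name, namespace, and image are placeholders:

package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// hostNetwork: true puts the pod in the node's network namespace,
	// so the metadata IP is reachable just as it is from the VM itself.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "metadata-debug"}, // hypothetical name
		Spec: corev1.PodSpec{
			HostNetwork: true,
			Containers: []corev1.Container{{
				Name:    "debug",
				Image:   "registry.access.redhat.com/ubi8/ubi", // placeholder image
				Command: []string{"sleep", "infinity"},
			}},
		},
	}
	if _, err := clientset.CoreV1().Pods("default").Create(context.TODO(), pod, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
}

A pod created this way reaches 169.254.169.254 exactly as the node does, since the per-pod SDN restriction described below no longer applies.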

--- Additional comment from pacevedo on 2021-09-02 11:53:36 UTC ---

After some investigation we found the cause.

Starting from a fresh 4.9 install on GCP, we have the initial situation, which is the infinite loop:
I0902 07:21:04.441264       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused
I0902 07:21:05.593287       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused
I0902 07:21:06.809316       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused
I0902 07:21:08.217253       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused
I0902 07:21:10.073291       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused
I0902 07:21:12.697274       1 metadata.go:199] Unable to read URL http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/. Error=Get "http://metadata.google.internal./computeMetadata/v1/instance/service-accounts/": dial tcp 169.254.169.254:80: connect: connection refused

Then we check that the connection is allowed from the node:
[root@ci-ln-jbfhhcb-f76d1-gd5qt-worker-b-6kxcs /]# curl http://metadata.google.internal. -v -i
* Rebuilt URL to: http://metadata.google.internal./
*   Trying 169.254.169.254...
* TCP_NODELAY set
* Connected to metadata.google.internal (169.254.169.254) port 80 (#0)
> GET / HTTP/1.1
> Host: metadata.google.internal
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
[...]

Then we go to the pod and we perform the same operation:
bash-4.4$ curl http://metadata.google.internal. -v -i
* Rebuilt URL to: http://metadata.google.internal./
*   Trying 169.254.169.254...
* TCP_NODELAY set
* connect to 169.254.169.254 port 80 failed: Connection refused
* Failed to connect to metadata.google.internal port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to metadata.google.internal port 80: Connection refused

We spin up a debug pod, install tcpdump, and find that running curl while tcpdump is capturing shows only the DNS traffic resolving the metadata server IP:
09:23:14.955898 IP debug.50005 > dns-default.openshift-dns.svc.cluster.local.53: 27786+ A? metadata.google.internal.default.svc.cluster.local. (68)
09:23:14.956165 IP debug.50005 > dns-default.openshift-dns.svc.cluster.local.53: 28181+ AAAA? metadata.google.internal.default.svc.cluster.local. (68)
09:23:14.957172 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.50005: 28181 NXDomain*- 0/1/1 (172)
09:23:14.957196 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.50005: 27786 NXDomain*- 0/1/1 (172)
09:23:14.957292 IP debug.57561 > dns-default.openshift-dns.svc.cluster.local.53: 49982+ A? metadata.google.internal.svc.cluster.local. (60)
09:23:14.957504 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.57561: 49982 NXDomain*- 0/1/1 (164)
09:23:14.957792 IP debug.57561 > dns-default.openshift-dns.svc.cluster.local.53: 50413+ AAAA? metadata.google.internal.svc.cluster.local. (60)
09:23:14.958003 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.57561: 50413 NXDomain*- 0/1/1 (164)
09:23:14.958108 IP debug.46008 > dns-default.openshift-dns.svc.cluster.local.53: 19830+ A? metadata.google.internal.cluster.local. (56)
09:23:14.958183 IP debug.46008 > dns-default.openshift-dns.svc.cluster.local.53: 20167+ AAAA? metadata.google.internal.cluster.local. (56)
09:23:14.958950 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.46008: 20167 NXDomain*- 0/1/1 (160)
09:23:14.959097 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.46008: 19830 NXDomain*- 0/1/1 (160)
09:23:14.959214 IP debug.55770 > dns-default.openshift-dns.svc.cluster.local.53: 12236+ A? metadata.google.internal.c.openshift-gce-devel-ci.internal. (76)
09:23:14.959633 IP debug.55770 > dns-default.openshift-dns.svc.cluster.local.53: 12651+ AAAA? metadata.google.internal.c.openshift-gce-devel-ci.internal. (76)
09:23:14.963824 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.55770: 12651 NXDomain 0/1/1 (192)
09:23:14.963880 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.55770: 12236 NXDomain 0/1/1 (192)
09:23:14.964490 IP debug.37963 > dns-default.openshift-dns.svc.cluster.local.53: 41502+ A? metadata.google.internal.google.internal. (58)
09:23:14.964778 IP debug.37963 > dns-default.openshift-dns.svc.cluster.local.53: 41842+ AAAA? metadata.google.internal.google.internal. (58)
09:23:14.967874 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.37963: 41502 NXDomain 0/1/1 (174)
09:23:14.968422 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.37963: 41842 NXDomain 0/1/1 (174)
09:23:14.969681 IP debug.46076 > dns-default.openshift-dns.svc.cluster.local.53: 55746+ A? metadata.google.internal.c.openshift-gce-devel-c. (66)
09:23:14.969762 IP debug.46076 > dns-default.openshift-dns.svc.cluster.local.53: 56121+ AAAA? metadata.google.internal.c.openshift-gce-devel-c. (66)
09:23:15.011148 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.46076: 55746 NXDomain 0/1/1 (152)
09:23:15.026148 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.46076: 56121 NXDomain 0/1/1 (152)
09:23:15.026728 IP debug.45646 > dns-default.openshift-dns.svc.cluster.local.53: 61897+ A? metadata.google.internal. (42)
09:23:15.026954 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.45646: 61897* 1/0/1 A 169.254.169.254 (93)
09:23:15.027205 IP debug.45646 > dns-default.openshift-dns.svc.cluster.local.53: 62240+ AAAA? metadata.google.internal. (42)
09:23:15.028764 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.45646: 62240 0/0/1 (53)
09:23:15.061121 IP debug.58929 > dns-default.openshift-dns.svc.cluster.local.53: 3387+ PTR? 10.0.30.172.in-addr.arpa. (42)
09:23:15.061538 IP dns-default.openshift-dns.svc.cluster.local.53 > debug.58929: 3387*- 1/0/1 PTR dns-default.openshift-dns.svc.cluster.local. (134)

And nothing else: the name ultimately resolves to 169.254.169.254, yet no TCP traffic toward it ever hits the wire, so something is rejecting the connection locally in the pod. Turning to iptables rules, we find this in SDN:
https://github.com/openshift/sdn/blob/e17993d1116cce1372dfc680752e792e8e51e28a/pkg/cmd/openshift-sdn-cni/openshift-sdn.go#L126-L129

After asking the SDN team, they said it is mainly for security reasons.
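For illustration, the linked lines install a per-pod rule of roughly this shape (an approximation; the exact flags are in the linked openshift-sdn file): egress from the pod to the link-local metadata IP is rejected, which matches the "Connection refused" seen above.

package main

import (
	"log"
	"os/exec"
)

func main() {
	// Hypothetical equivalent of the SDN rule, applied inside the pod's
	// network namespace: reject egress to the cloud metadata IP.
	out, err := exec.Command("iptables",
		"-A", "OUTPUT",
		"-p", "tcp",
		"-d", "169.254.169.254",
		"-j", "REJECT").CombinedOutput()
	if err != nil {
		log.Fatalf("iptables failed: %v: %s", err, out)
	}
}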

--- Additional comment from bthurber on 2021-09-09 03:03:47 UTC ---

Is there anything to take action on for this BZ or is this trending as NOT A BUG?

--- Additional comment from pacevedo on 2021-09-09 07:29:57 UTC ---

We are looking into different alternatives for extracting the information we need without using the metadata server, but this is a bug when using the SDN CNI on GCP.
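One metadata-server-free direction, sketched here as an assumption (the actual fix is in the linked PR and may differ): read the platform information from the cluster-scoped Infrastructure resource, the same object queried during verification in comment 2, instead of the cloud metadata endpoint.

package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/rest"

	configclient "github.com/openshift/client-go/config/clientset/versioned"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := configclient.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	infra, err := client.ConfigV1().Infrastructures().Get(context.TODO(), "cluster", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	if infra.Status.PlatformStatus != nil {
		// Prints "GCP" on the cluster shown in the verification below.
		fmt.Println("platform:", infra.Status.PlatformStatus.Type)
	}
}

This approach needs no access to 169.254.169.254 at all, so the SDN rule is irrelevant to it.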

--- Additional comment from pacevedo on 2021-10-07 08:15:49 UTC ---

Working on a solution already.

Comment 2 liqcui 2021-10-19 03:31:36 UTC
Verified Result:
[mirroradmin@ec2-18-217-45-133 ~]$ oc get infrastructure cluster -o yaml
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  creationTimestamp: "2021-10-19T02:26:20Z"
  generation: 1
  name: cluster
  resourceVersion: "612"
  uid: 4f26bb2c-7ff0-4d29-baa0-f0c2d9ba4ee2
spec:
  cloudConfig:
    key: config
    name: cloud-provider-config
  platformSpec:
    type: GCP
status:
  apiServerInternalURI: https://api-int.liqcui-gcp01.qe.gcp.devcluster.openshift.com:6443
  apiServerURL: https://api.liqcui-gcp01.qe.gcp.devcluster.openshift.com:6443
  controlPlaneTopology: HighlyAvailable
  etcdDiscoveryDomain: ""
  infrastructureName: liqcui-gcp01-5c8g9
  infrastructureTopology: HighlyAvailable
  platform: GCP
  platformStatus:
    gcp:
      projectID: openshift-qe
      region: us-central1
    type: GCP
[mirroradmin@ec2-18-217-45-133 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-10-18-182325   True        False         39m     Cluster version is 4.9.0-0.nightly-2021-10-18-182325
[mirroradmin@ec2-18-217-45-133 ~]$ oc get pods -n simple-kmod
NAME                                                  READY   STATUS      RESTARTS   AGE
simple-kmod-driver-build-bfb16b50984f16f0-1-build     0/1     Completed   0          8m8s
simple-kmod-driver-container-bfb16b50984f16f0-9pkrq   1/1     Running     0          9m
simple-kmod-driver-container-bfb16b50984f16f0-bfz78   1/1     Running     0          9m
simple-kmod-driver-container-bfb16b50984f16f0-h275w   1/1     Running     0          9m

liqcui-gcp01-5c8g9-worker-a-jbzvd.c.openshift-qe.internal
Starting pod/liqcui-gcp01-5c8g9-worker-a-jbzvdcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
simple_procfs_kmod     16384  0
simple_kmod            16384  0

Removing debug pod ...
liqcui-gcp01-5c8g9-worker-b-6k87l.c.openshift-qe.internal
Starting pod/liqcui-gcp01-5c8g9-worker-b-6k87lcopenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
simple_procfs_kmod     16384  0
simple_kmod            16384  0

Removing debug pod ...
liqcui-gcp01-5c8g9-worker-c-r2997.c.openshift-qe.internal
Starting pod/liqcui-gcp01-5c8g9-worker-c-r2997copenshift-qeinternal-debug ...
To use host binaries, run `chroot /host`
simple_procfs_kmod     16384  0
simple_kmod            16384  0

Removing debug pod ...

Comment 5 Pablo Acevedo 2021-10-26 14:15:40 UTC
*** Bug 2014275 has been marked as a duplicate of this bug. ***

Comment 6 errata-xmlrpc 2021-10-26 17:23:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.4 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3935

