Bug 1958861

Summary: [CCO] pod-identity-webhook certificate request failed
Product: OpenShift Container Platform Reporter: wang lin <lwan>
Component: Cloud Credential Operator Assignee: Joel Diaz <jdiaz>
Status: CLOSED ERRATA QA Contact: wang lin <lwan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.8 CC: jdiaz, lwan, prubenda, xxia
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 2028615 (view as bug list) Environment:
Last Closed: 2021-07-27 23:07:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2028615    
Attachments:
Description Flags
pod-identity-webhook-5d75f69b47-vk8n5.log none

Description wang lin 2021-05-10 10:38:34 UTC
Created attachment 1781638
pod-identity-webhook-5d75f69b47-vk8n5.log

Description of problem:
Many CSRs created by pod-identity-webhook are stuck in Pending status; the logs from the pod-identity-webhook pod are shown below. The cluster installed successfully and no operators are degraded, so the issue is not visible unless you check the pod log.
$ oc logs -f pod-identity-webhook-5d75f69b47-vk8n5
W0510 02:37:14.251986       1 client_config.go:615] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0510 02:37:14.267119       1 store.go:63] Fetched secret: openshift-cloud-credential-operator/pod-identity-webhook
I0510 02:37:14.267351       1 main.go:191] Creating server
I0510 02:37:14.267858       1 main.go:211] Listening on :9999 for metrics and healthz
I0510 02:37:14.268270       1 main.go:205] Listening on :6443
E0510 02:52:14.287539       1 certificate_manager.go:454] certificate request was not signed: timed out waiting for the condition
E0510 03:07:16.437722       1 certificate_manager.go:454] certificate request was not signed: timed out waiting for the condition
E0510 03:22:20.600397       1 certificate_manager.go:454] certificate request was not signed: timed out waiting for the condition
E0510 03:37:28.772817       1 certificate_manager.go:454] certificate request was not signed: timed out waiting for the condition
E0510 03:52:45.984399       1 certificate_manager.go:454] certificate request was not signed: timed out waiting for the condition
E0510 03:52:45.984427       1 certificate_manager.go:318] Reached backoff limit, still unable to rotate certs: timed out waiting for the condition
E0510 04:08:17.993632       1 certificate_manager.go:454] certificate request was not signed: timed out waiting for the condition
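
The errors in the log above recur about every 15 minutes. A minimal sketch for measuring that spacing from a saved log follows; it assumes klog-formatted lines like the ones above, with all entries on the same day. The function name `sign_failure_gaps` is hypothetical and not part of `oc`.

```shell
# sign_failure_gaps: hypothetical helper, not part of oc.
# Reads klog-formatted pod logs on stdin and prints, for every
# "certificate request was not signed" error after the first, the number
# of whole seconds elapsed since the previous such error.
# Assumption: all matching lines carry timestamps from the same day.
sign_failure_gaps() {
  awk '/certificate request was not signed/ {
         split($2, t, ":")                  # t[1]=HH, t[2]=MM, t[3]=SS.micro
         now = t[1] * 3600 + t[2] * 60 + int(t[3])
         if (seen) print now - prev
         prev = now
         seen = 1
       }'
}

# Usage against a live pod (requires a logged-in oc client):
#   oc logs pod-identity-webhook-5d75f69b47-vk8n5 | sign_failure_gaps
```

Gaps that stay close to 900 seconds suggest that each request waits for a signature until it times out and is then retried, which matches the steadily growing list of pending CSRs.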

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-05-09-105430

How reproducible:
Always

Steps to Reproduce:
1. Install a 4.8 OCP cluster on AWS
2. Check the CSRs and the pod-identity-webhook log

Actual result:
# Many CSRs are in Pending status
$ oc get csr
NAME        AGE     SIGNERNAME                      REQUESTOR                                                                        CONDITION
csr-2mh2b   172m    kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-2t7st   3h39m   kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-4sm7x   3h8m    kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-5bqxk   4h10m   kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-5rq9c   5h42m   kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-6hq4c   5h11m   kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-7qcm9   5h57m   kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending
csr-9rvhh   110m    kubernetes.io/kubelet-serving   system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook   Pending

# YAML of one of the pending CSRs
$ oc get csr csr-2mh2b -o yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  creationTimestamp: "2021-05-10T05:57:01Z"
  generateName: csr-
  name: csr-2mh2b
  resourceVersion: "97360"
  uid: 209b9297-7f25-4afa-b210-7573711a6bb4
spec:
  groups:
  - system:serviceaccounts
  - system:serviceaccounts:openshift-cloud-credential-operator
  - system:authenticated
  request: LS0tLS1CRUdJTiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
  signerName: kubernetes.io/kubelet-serving
  uid: 880cf0b6-38f9-4310-a580-05d4f4527443
  usages:
  - digital signature
  - key encipherment
  - server auth
  username: system:serviceaccount:openshift-cloud-credential-operator:pod-identity-webhook
status: {}
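
The pending requests listed above can be tallied per requestor with a small filter over the `oc get csr` table. This is a minimal sketch, assuming the column layout shown in this report (REQUESTOR in the fourth column, CONDITION in the last); the function name `count_pending_csrs` is hypothetical and not part of `oc`.

```shell
# count_pending_csrs: hypothetical helper, not part of oc.
# Reads `oc get csr` table output on stdin and prints "<count> <requestor>"
# for each requestor with at least one CSR whose condition is Pending,
# highest count first.
# Assumption: column order is NAME AGE SIGNERNAME REQUESTOR ... CONDITION,
# so the requestor is field 4 and the condition is the last field.
count_pending_csrs() {
  awk '$NF == "Pending" { n[$4]++ }
       END { for (r in n) print n[r], r }' | sort -rn
}

# Usage on a live cluster (requires a logged-in oc client):
#   oc get csr | count_pending_csrs
```

On an affected cluster this makes the pile-up visible without reading the pod log: the pod-identity-webhook service account appears with a count that keeps growing.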

Expected result:
The certificate request should succeed.

Comment 1 Xingxing Xia 2021-05-11 06:26:09 UTC
Hit the same issue (searched Bugzilla and found this bug).

Comment 2 Joel Diaz 2021-05-12 18:01:11 UTC
*** Bug 1959954 has been marked as a duplicate of this bug. ***

Comment 4 wang lin 2021-05-18 02:22:33 UTC
The CSR pile-up issue is fixed in 4.8.0-0.nightly-2021-05-15-141455.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-15-141455   True        False         24h     Cluster version is 4.8.0-0.nightly-2021-05-15-141455
$ oc get csr
No resources found

Comment 7 errata-xmlrpc 2021-07-27 23:07:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438