Bug 1846264

Summary: OCPRHV-174: After successful cluster deployment, there are still pending CSRs
Product: OpenShift Container Platform Reporter: Jan Zmeskal <jzmeskal>
Component: Installer Assignee: Roy Golan <rgolan>
Installer sub component: OpenShift on RHV QA Contact: Jan Zmeskal <jzmeskal>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: dougsland, gzaidman, ocprhvteam
Version: 4.5   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
URL: https://issues.redhat.com/browse/OCPRHV-174
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 16:06:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jan Zmeskal 2020-06-11 08:56:01 UTC
Description of problem:
After successfully deploying an OCP 4.5 cluster on RHV 4.3.10, several CSRs are still in the Pending state. See here:

[root@ocp-qe-1 secondary]# oc get csr
NAME        AGE    SIGNERNAME                                    REQUESTOR                                                                   CONDITION
csr-26xx8   11m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-worker-0-rhhdw                                  Pending
csr-2c8bl   24m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-worker-0-qmjqp                                  Pending
csr-2vq57   39m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-master-2                                        Approved,Issued
csr-58lfs   22m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-worker-0-rdxmn                                  Pending
csr-dbzw7   26m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-worker-0-rhhdw                                  Pending
csr-dmndl   26m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-fcwvz   22m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-gpk8t   39m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-master-1                                        Approved,Issued
csr-gpnct   40m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-jhdrt   24m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-k6shf   40m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-vbbkm   40m    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-wb2c6   7m5s   kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-worker-0-rdxmn                                  Pending
csr-x4gwm   39m    kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-master-0                                        Approved,Issued
csr-zwj6k   9m1s   kubernetes.io/kubelet-serving                 system:node:secondary-tgqwn-worker-0-qmjqp                                  Pending

They all seem to be requested by worker nodes. This can lead to issues such as described here: https://bugzilla.redhat.com/show_bug.cgi?id=1733331

Version-Release number of the following components:
4.5.0-0.nightly-2020-06-10-224736

How reproducible:
Seen once; I did not attempt to reproduce it again.

Steps to Reproduce:
1. openshift-install create cluster
2. Wait for the cluster deployment to finish successfully
3. oc get csr
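
To make step 3 easier to read on a cluster with many CSRs, the output can be narrowed to the pending requests only. The following is a minimal sketch, not part of the original report: the function name `pending_csrs` is hypothetical, and it assumes the column layout shown above (NAME first, REQUESTOR fourth, CONDITION last).

```shell
# Hypothetical helper: reads the tabular output of `oc get csr` on stdin and
# prints the name and requestor of every CSR whose last column is "Pending".
# Assumes NAME is column 1, REQUESTOR is column 4, and CONDITION is the last column.
pending_csrs() {
  # NR > 1 skips the header row; $NF is the CONDITION column.
  awk 'NR > 1 && $NF == "Pending" { print $1, $4 }'
}

# Usage against a live cluster (assumes `oc` is already logged in):
#   oc get csr | pending_csrs
```

An empty result means no CSR is waiting for approval.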

Comment 2 Sandro Bonazzola 2020-06-18 06:45:41 UTC
Due to capacity constraints, we will revisit this bug in the upcoming sprint.

Comment 3 Douglas Schilling Landgraf 2020-07-09 12:22:03 UTC
Due to capacity constraints, we will revisit this bug in the upcoming sprint.

Comment 4 Gal Zaidman 2020-09-07 11:58:19 UTC
I was not able to reproduce this on my cluster.
Is this still relevant or can we close this?

Comment 5 Jan Zmeskal 2020-09-07 12:08:22 UTC
I'm not 100% sure, but it's quite possible that this has been fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1854787

Comment 6 Gal Zaidman 2020-09-07 12:31:19 UTC
Can we close it?

Comment 7 Jan Zmeskal 2020-09-07 12:44:31 UTC
I think it would be better to move this to ON_QA, deploy an OCP 4.5 cluster, and confirm that the issue does not reproduce.

Comment 8 Gal Zaidman 2020-09-07 12:58:10 UTC
OK, moving to ON_QA.

Comment 10 Jan Zmeskal 2020-09-22 15:37:49 UTC
I verified with OCP 4.5.11. My deployment wasn't 100% successful, as the authentication operator was still updating, but I don't think that matters in the scope of this bug.

# oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.11    True        False         False      109s
cloud-credential                           4.5.11    True        False         False      39m
cluster-autoscaler                         4.5.11    True        False         False      22m
config-operator                            4.5.11    True        False         False      22m
console                                    4.5.11    True        False         False      7m48s
csi-snapshot-controller                    4.5.11    True        False         False      19m
dns                                        4.5.11    True        False         False      31m
etcd                                       4.5.11    True        False         False      32m
image-registry                             4.5.11    True        False         False      27m
ingress                                    4.5.11    True        False         False      19m
insights                                   4.5.11    True        False         False      29m
kube-apiserver                             4.5.11    True        True          False      31m
kube-controller-manager                    4.5.11    True        False         False      31m
kube-scheduler                             4.5.11    True        False         False      31m
kube-storage-version-migrator              4.5.11    True        False         False      19m
machine-api                                4.5.11    True        False         False      25m
machine-approver                           4.5.11    True        False         False      30m
machine-config                             4.5.11    True        False         False      31m
marketplace                                4.5.11    True        False         False      9m35s
monitoring                                 4.5.11    True        False         False      12m
network                                    4.5.11    True        False         False      34m
node-tuning                                4.5.11    True        False         False      34m
openshift-apiserver                        4.5.11    True        False         False      14m
openshift-controller-manager               4.5.11    True        False         False      28m
openshift-samples                          4.5.11    True        False         False      14m
operator-lifecycle-manager                 4.5.11    True        False         False      32m
operator-lifecycle-manager-catalog         4.5.11    True        False         False      33m
operator-lifecycle-manager-packageserver   4.5.11    True        False         False      11m
service-ca                                 4.5.11    True        False         False      33m
storage                                    4.5.11    True        False         False      27m

# oc get csr
NAME        AGE   SIGNERNAME                                    REQUESTOR                                                                   CONDITION
csr-4vpfs   36m   kubernetes.io/kubelet-serving                 system:node:primary-rv827-master-1                                          Approved,Issued
csr-9kq2v   37m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-b9ln5   37m   kubernetes.io/kubelet-serving                 system:node:primary-rv827-master-0                                          Approved,Issued
csr-gjprw   37m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-gwf9z   22m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-hdqp5   37m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-mw9cw   19m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-n9gtt   22m   kubernetes.io/kubelet-serving                 system:node:primary-rv827-worker-0-n7xqw                                    Approved,Issued
csr-r4sgp   20m   kubernetes.io/kubelet-serving                 system:node:primary-rv827-worker-0-jh4dx                                    Approved,Issued
csr-vwplf   19m   kubernetes.io/kubelet-serving                 system:node:primary-rv827-worker-0-s8gc6                                    Approved,Issued
csr-xmzpg   21m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-zwcgv   37m   kubernetes.io/kubelet-serving                 system:node:primary-rv827-master-2                                          Approved,Issued

# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             False       True          40m     Unable to apply 4.5.11: the cluster operator authentication has not yet successfully rolled out

# oc project openshift-machine-api
Now using project "openshift-machine-api" on server "https://api.primary.ocp.rhev.lab.eng.brq.redhat.com:6443".

# oc get machine
NAME                           PHASE     TYPE   REGION   ZONE   AGE
primary-rv827-master-0         Running                          41m
primary-rv827-master-1         Running                          41m
primary-rv827-master-2         Running                          41m
primary-rv827-worker-0-jh4dx   Running                          30m
primary-rv827-worker-0-n7xqw   Running                          30m
primary-rv827-worker-0-s8gc6   Running                          30m

# oc get node
NAME                           STATUS   ROLES    AGE   VERSION
primary-rv827-master-0         Ready    master   38m   v1.18.3+47c0e71
primary-rv827-master-1         Ready    master   38m   v1.18.3+47c0e71
primary-rv827-master-2         Ready    master   38m   v1.18.3+47c0e71
primary-rv827-worker-0-jh4dx   Ready    worker   22m   v1.18.3+47c0e71
primary-rv827-worker-0-n7xqw   Ready    worker   24m   v1.18.3+47c0e71
primary-rv827-worker-0-s8gc6   Ready    worker   21m   v1.18.3+47c0e71

Comment 12 errata-xmlrpc 2020-10-27 16:06:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196