Description of problem:

After deploying OCP 4.11 via IPI on Power, some CSRs remain Pending:

# oc get csr | grep Pending
csr-8vv9t   51m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb   <none>   Pending
csr-f7g26   36m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb   <none>   Pending
csr-g89ws   51m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng   <none>   Pending
csr-gf8qt   6m26s   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng   <none>   Pending
csr-j4h6b   21m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng   <none>   Pending
csr-mhz9p   36m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8   <none>   Pending
csr-p5m77   6m20s   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8   <none>   Pending
csr-p5qmk   6m30s   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb   <none>   Pending
csr-qlmb8   21m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb   <none>   Pending
csr-sn8ms   51m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8   <none>   Pending
csr-t5cbh   36m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng   <none>   Pending
csr-ww2rj   21m   kubernetes.io/kubelet-serving   system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8   <none>   Pending

Cluster status:

# oc get clusterversion
NAME   VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   18m   Cluster version is 4.11.0-0.nightly-ppc64le-2022-06-16-003709

# oc get nodes
NAME   STATUS   ROLES   AGE   VERSION
rdr-ocp-j17-pravind-i-hqvzv-master-0   Ready   master   58m   v1.24.0+cb71478
rdr-ocp-j17-pravind-i-hqvzv-master-1   Ready   master   58m   v1.24.0+cb71478
rdr-ocp-j17-pravind-i-hqvzv-master-2   Ready   master   58m   v1.24.0+cb71478
rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb   Ready   worker   27m   v1.24.0+cb71478
rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng   Ready   worker   27m   v1.24.0+cb71478
rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8   Ready   worker   27m   v1.24.0+cb71478

# oc get co
NAME   VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   13m
baremetal   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
cloud-controller-manager   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   54m
cloud-credential   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   71m
cluster-autoscaler   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
config-operator   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   52m
console   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   19m
csi-snapshot-controller   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
dns   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
etcd   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
image-registry   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   22m
ingress   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   22m
insights   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   24m
kube-apiserver   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   40m
kube-controller-manager   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   48m
kube-scheduler   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   48m
kube-storage-version-migrator   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   51m
machine-api   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   46m
machine-approver   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
machine-config   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   49m
marketplace   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
monitoring   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   19m
network   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   52m
node-tuning   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   50m
openshift-apiserver   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   46m
openshift-controller-manager   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   47m
openshift-samples   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   45m
operator-lifecycle-manager   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   51m
operator-lifecycle-manager-catalog   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   51m
operator-lifecycle-manager-packageserver   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   46m
service-ca   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   52m
storage   4.11.0-0.nightly-ppc64le-2022-06-16-003709   True   False   False   52m

# oc get pods -A | grep -v Running| grep -v Completed
NAMESPACE   NAME   READY   STATUS   RESTARTS   AGE
openshift-kube-apiserver   installer-6-rdr-ocp-j17-pravind-i-hqvzv-master-1   0/1   Error   0   36m
openshift-kube-controller-manager   installer-7-rdr-ocp-j17-pravind-i-hqvzv-master-0   0/1   Error   0   44m
openshift-kube-scheduler   installer-5-rdr-ocp-j17-pravind-i-hqvzv-master-1   0/1   Error   0   50m
openshift-operator-lifecycle-manager   collect-profiles-27591150-8rsbg   0/1   Error   0   7m1s

How reproducible: Always

Steps to Reproduce:
1. Deploy the cluster via IPI.
2. Check the CSRs via oc get csr.

Actual results: Worker-related CSRs remain Pending.

Expected results: CSRs must be auto-approved.
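The Pending listing in the description is easier to read once it is grouped by signer and requestor. The following is a minimal sketch, not part of the original report: it assumes the six-column `oc get csr` layout shown above (NAME, AGE, SIGNERNAME, REQUESTOR, REQUESTEDDURATION, CONDITION), and the helper name `pending_by_requestor` is made up for illustration. The input rows are copied from the report.

```python
from collections import Counter

# Rows copied from the "oc get csr | grep Pending" output in the description.
PENDING_OUTPUT = """\
csr-8vv9t 51m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb <none> Pending
csr-f7g26 36m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb <none> Pending
csr-g89ws 51m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng <none> Pending
csr-gf8qt 6m26s kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng <none> Pending
csr-j4h6b 21m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng <none> Pending
csr-mhz9p 36m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8 <none> Pending
csr-p5m77 6m20s kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8 <none> Pending
csr-p5qmk 6m30s kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb <none> Pending
csr-qlmb8 21m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-5v5pb <none> Pending
csr-sn8ms 51m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8 <none> Pending
csr-t5cbh 36m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-dwdng <none> Pending
csr-ww2rj 21m kubernetes.io/kubelet-serving system:node:rdr-ocp-j17-pravind-i-hqvzv-worker-xcjd8 <none> Pending
"""


def pending_by_requestor(text):
    """Count Pending CSRs per (signer, requestor) pair.

    Hypothetical helper: assumes whitespace-separated columns
    NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION.
    """
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not Pending.
        if len(fields) == 6 and fields[5] == "Pending":
            counts[(fields[2], fields[3])] += 1
    return counts


if __name__ == "__main__":
    for (signer, requestor), n in sorted(pending_by_requestor(PENDING_OUTPUT).items()):
        print(f"{n} pending  {signer}  {requestor}")
```

Grouped this way, every Pending request in the report is a kubernetes.io/kubelet-serving CSR from one of the three worker nodes, and each worker has several outstanding requests of different ages.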
Serving certs not being approved likely means that the IPs the kubelet is reporting do not match the IPs that the Machine API provider is reporting. I would suggest looking at the machine-approver logs to be certain.
Yeah, on debugging this I made the following observations.

cluster-machine-approver logs:

I0621 06:49:29.065796 1 controller.go:121] Reconciling CSR: csr-zwsfb
I0621 06:49:29.105553 1 csr_check.go:157] csr-zwsfb: CSR does not appear to be client csr
E0621 06:49:29.110604 1 csr_check.go:420] csr-zwsfb: IP address '192.168.0.81' not in machine addresses:
I0621 06:49:29.113715 1 controller.go:233] csr-zwsfb: CSR not authorized

1. It's a server CSR request. For the approver to approve it, a few conditions must be met (https://github.com/openshift/cluster-machine-approver#node-server-csr-approval-workflow).
2. One of these is that the machine's InternalIP must match the IP in the CSR request.
3. Currently the machine does not have the InternalIP set:

karthikkn@Karthiks-MacBook-Pro .ssh % oc -n openshift-machine-api describe machine rdr-kn24-f9jtx-master-0
Status:
  Addresses:
    Address: rdr-kn24-f9jtx7mkm5-ks54l
    Type: InternalDNS

4. But the CSR expects this:

karthikkn@Karthiks-MacBook-Pro karthik-openshift-workspace % oc describe csr csr-zwsfb
Name: csr-zwsfb
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 20 Jun 2022 15:07:31 +0530
Requesting User: system:node:rdr-kn24-f9jtx7mkm5-ks54l
Signer: kubernetes.io/kubelet-serving
Status: Pending
Subject:
  Common Name: system:node:rdr-kn24-f9jtx7mkm5-ks54l
  Serial Number:
  Organization: system:nodes
Subject Alternative Names:
  DNS Names: rdr-kn24-f9jtx7mkm5-ks54l
  IP Addresses: 192.168.0.81

So I will be making the necessary changes in machine-api-provider Power VS to add the required fields.
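The mismatch behind the "not in machine addresses" log line can be illustrated with a small sketch. This is a simplified model of the SAN-versus-Machine-address comparison, not the cluster-machine-approver's actual code: the function name `unmatched_sans`, the dictionary layout, and the sets of accepted address types are assumptions made for illustration. The "before" data is taken from the `oc describe` output in this comment; the "after" data assumes the provider fix reports an InternalIP equal to the IP in the CSR.

```python
import ipaddress

# Assumed address types that may satisfy a DNS or IP SAN (illustrative only).
DNS_TYPES = {"InternalDNS", "ExternalDNS", "Hostname"}
IP_TYPES = {"InternalIP", "ExternalIP"}


def unmatched_sans(dns_names, ip_addresses, machine_addresses):
    """Return the CSR SAN entries that have no matching Machine address.

    Hypothetical helper: machine_addresses is a list of
    {"type": ..., "address": ...} dicts mirroring Machine status.addresses.
    """
    machine_dns = {a["address"] for a in machine_addresses if a["type"] in DNS_TYPES}
    machine_ips = {
        ipaddress.ip_address(a["address"])
        for a in machine_addresses
        if a["type"] in IP_TYPES
    }
    missing = [name for name in dns_names if name not in machine_dns]
    missing += [ip for ip in ip_addresses if ipaddress.ip_address(ip) not in machine_ips]
    return missing


# SANs from "oc describe csr csr-zwsfb".
csr_dns = ["rdr-kn24-f9jtx7mkm5-ks54l"]
csr_ips = ["192.168.0.81"]

# Machine status as reported before the fix: only InternalDNS is set.
before_fix = [{"type": "InternalDNS", "address": "rdr-kn24-f9jtx7mkm5-ks54l"}]

# Assumed status after the provider also reports the InternalIP.
after_fix = before_fix + [{"type": "InternalIP", "address": "192.168.0.81"}]

if __name__ == "__main__":
    print("before fix, unmatched:", unmatched_sans(csr_dns, csr_ips, before_fix))
    print("after fix, unmatched:", unmatched_sans(csr_dns, csr_ips, after_fix))
```

With only the InternalDNS entry, the DNS name matches but the IP SAN is left unmatched, which is consistent with the CSR staying Pending; once an InternalIP is present, nothing is left unmatched.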
Verified with OCP 4.11.0-rc.1
No Pending csr seen post deployment.

# oc get clusterversion
NAME   VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-rc.1   True   False   6m2s   Cluster version is 4.11.0-rc.1

# oc get csr
NAME   AGE   SIGNERNAME   REQUESTOR   REQUESTEDDURATION   CONDITION
csr-2ml94   37m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>   Approved,Issued
csr-9lwd8   36m   kubernetes.io/kubelet-serving   system:node:rdr-ipi-jl12-pravin-s-psmq6-master-1   <none>   Approved,Issued
csr-9srxt   13m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>   Approved,Issued
csr-crnmx   36m   kubernetes.io/kubelet-serving   system:node:rdr-ipi-jl12-pravin-s-psmq6-master-2   <none>   Approved,Issued
csr-dhq98   13m   kubernetes.io/kubelet-serving   system:node:rdr-ipi-jl12-pravin-s-psmq6-worker-9n9sk   <none>   Approved,Issued
csr-dk5zd   37m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>   Approved,Issued
csr-gfwdg   14m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>   Approved,Issued
csr-nhp6p   13m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>   Approved,Issued
csr-nzb7d   13m   kubernetes.io/kubelet-serving   system:node:rdr-ipi-jl12-pravin-s-psmq6-worker-dkkvk   <none>   Approved,Issued
csr-sjwct   36m   kubernetes.io/kubelet-serving   system:node:rdr-ipi-jl12-pravin-s-psmq6-master-0   <none>   Approved,Issued
csr-w5jp7   14m   kubernetes.io/kubelet-serving   system:node:rdr-ipi-jl12-pravin-s-psmq6-worker-q75v6   <none>   Approved,Issued
csr-w899t   37m   kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   <none>   Approved,Issued
system:openshift:openshift-authenticator-2dhbp   34m   kubernetes.io/kube-apiserver-client   system:serviceaccount:openshift-authentication-operator:authentication-operator   <none>   Approved,Issued
system:openshift:openshift-monitoring-7hnqf   33m   kubernetes.io/kube-apiserver-client   system:serviceaccount:openshift-monitoring:cluster-monitoring-operator   <none>   Approved,Issued
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069