Bug 2036870

Summary: Certificate-related errors can be seen in kube-apiserver and kube-scheduler logs after a long-term shutdown, even though the secrets that contain the certificate info are all in place
Product: OpenShift Container Platform
Component: kube-apiserver
Version: 4.8
Reporter: yhe
Assignee: Abu Kashem <akashem>
QA Contact: Ke Wang <kewang>
CC: akashem, aos-bugs, mfojtik, oarribas, xxia
Status: CLOSED WONTFIX
Severity: medium
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2023-01-16 11:57:46 UTC

Description yhe 2022-01-04 09:42:02 UTC
Description of problem:
The customer had shut down their cluster for about 1 month, and after restarting it, several pods are stuck in Pending status.

The following error can be seen in the logs of the kube-scheduler pod:

2021-12-23T01:35:45.459672477Z E1223 01:35:45.459639       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.ReplicationController: failed to list *v1.ReplicationController: Unauthorized
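
Given that the cluster was powered off for roughly a month, one plausible thing to check is whether the client certificate the scheduler presents to the apiserver is still within its validity window. Below is a minimal Go sketch, not an official diagnostic; the client.crt path is a placeholder for a certificate extracted from the relevant secret. It simply decodes a PEM certificate and prints its validity dates:

// check_cert_expiry.go - decode a PEM certificate and print its validity window.
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"log"
	"os"
	"time"
)

func main() {
	// Placeholder path: point this at a certificate extracted from the
	// kube-scheduler (or kube-apiserver) client-cert secret.
	data, err := os.ReadFile("client.crt")
	if err != nil {
		log.Fatalf("read certificate: %v", err)
	}

	block, _ := pem.Decode(data)
	if block == nil || block.Type != "CERTIFICATE" {
		log.Fatal("no PEM CERTIFICATE block found")
	}

	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		log.Fatalf("parse certificate: %v", err)
	}

	now := time.Now()
	fmt.Printf("subject:   %s\n", cert.Subject)
	fmt.Printf("issuer:    %s\n", cert.Issuer)
	fmt.Printf("notBefore: %s\n", cert.NotBefore)
	fmt.Printf("notAfter:  %s\n", cert.NotAfter)
	fmt.Printf("expired:   %v\n", now.After(cert.NotAfter))
}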

After checking the logs of the kube-apiserver pod, I found a large number of the following error messages:

2021-12-23T01:35:26.664312153Z E1223 01:35:26.664214      20 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate signed by unknown authority, verifying certificate SN=6101806303185982722, SKID=, AKID=EC:24:00:41:35:C8:A1:50:6D:2D:34:50:D6:66:59:F1:81:6F:EA:18 failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"openshift-kube-apiserver-operator_kube-control-plane-signer@1640165082\")]"

It appears that something is wrong with the certificates, but everything looks fine when checking the secrets that contain the certificate data.
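
For context on the authentication failure above: the check that fails is a standard x509 chain verification of the presented client certificate against the CA bundle the apiserver trusts for client auth. The following is a minimal Go sketch of that kind of verification under stated assumptions; the ca-bundle.crt and client.crt file names are placeholders, not actual cluster paths. If the bundle only contains a signer that has since been regenerated (e.g. a newer kube-control-plane-signer), Verify fails with "x509: certificate signed by unknown authority", matching the log message:

// verify_against_ca.go - mimic the apiserver-side check that produced the
// "certificate signed by unknown authority" error: verify a client
// certificate against a CA bundle.
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"log"
	"os"
)

func loadCert(path string) *x509.Certificate {
	data, err := os.ReadFile(path)
	if err != nil {
		log.Fatalf("read %s: %v", path, err)
	}
	block, _ := pem.Decode(data)
	if block == nil {
		log.Fatalf("no PEM block in %s", path)
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		log.Fatalf("parse %s: %v", path, err)
	}
	return cert
}

func main() {
	// Placeholder file names: ca-bundle.crt holds the CA(s) trusted for
	// client auth, client.crt the certificate presented by the client.
	caPEM, err := os.ReadFile("ca-bundle.crt")
	if err != nil {
		log.Fatalf("read CA bundle: %v", err)
	}
	roots := x509.NewCertPool()
	if !roots.AppendCertsFromPEM(caPEM) {
		log.Fatal("no usable CA certificates in bundle")
	}

	client := loadCert("client.crt")
	_, err = client.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	if err != nil {
		// With a rotated/regenerated signer in the bundle, this fails with
		// "x509: certificate signed by unknown authority", as in the logs.
		fmt.Printf("verification failed: %v\n", err)
		return
	}
	fmt.Println("certificate verifies against the bundle")
}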

Version-Release number of selected component (if applicable):

How reproducible:
Unsure

Steps to Reproduce:
Unsure

Actual results:
Errors can be seen in the kube-apiserver and kube-scheduler logs, and several pods are stuck in Pending status.

Expected results:
No errors in the kube-apiserver and kube-scheduler logs, and pods are scheduled correctly.

Additional info:

Comment 5 Michal Fojtik 2023-01-16 11:57:46 UTC
Dear reporter, we greatly appreciate the bug you have reported here. Unfortunately, due to the migration to a new issue-tracking system (https://issues.redhat.com/), we cannot continue triaging bugs reported in Bugzilla. Since this bug has been stale for multiple days, we have decided to close it.
If you think this is a mistake, or this bug deserves a higher priority or severity than currently set, please feel free to reopen it and tell us why. We will move every re-opened bug to https://issues.redhat.com.

Thank you for your patience and understanding.

Comment 6 Red Hat Bugzilla 2023-09-18 04:29:51 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days