Bug 1785610 - kube-apiserver /%!(EXTRA *errors.StatusError=secrets "user-serving-cert" not found
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.2.z
Hardware: All
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.2.z
Assignee: Stefan Schimanski
QA Contact: Ke Wang
URL:
Whiteboard:
Duplicates: 1806089
Depends On: 1780243
Blocks:
Reported: 2019-12-20 12:49 UTC by Alexander Klein
Modified: 2023-09-07 21:19 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-02 11:02:40 UTC
Target Upstream Version:
Embargoed:


Attachments
apiserver log (124.11 KB, application/octet-stream), 2019-12-20 12:49 UTC, Alexander Klein
cert syncer log (149.38 KB, application/octet-stream), 2020-01-03 12:29 UTC, dfroehli


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-kube-apiserver-operator pull 766 | 0 | None | closed | Bug 1803750: [release-4.2] Improve node controller condition message | 2021-02-03 11:21:19 UTC
Red Hat Bugzilla | 1772190 | 0 | medium | CLOSED | Excess events written for file that exists, poorly formatted event message | 2024-02-04 04:25:30 UTC
Red Hat Product Errata | RHBA-2020:0936 | 0 | None | None | None | 2020-04-02 11:02:49 UTC

Internal Links: 1806089

Description Alexander Klein 2019-12-20 12:49:55 UTC
Created attachment 1646796 [details]
apiserver log

Description of problem:

Cluster installed with installer version 4.2.12-s390x,
RHCOS version 4.2.10-s390x.


Every ~5 minutes the kube-apiserver throws the event:
Removed file for secret: /%!(EXTRA *errors.StatusError=secrets "user-serving-cert" not found)

It recovers automatically, but something seems to be broken here.

Comment 1 Carvel Baus 2020-01-02 21:03:23 UTC
The attached log file does not contain the error as described. Is this the correct log file?

Comment 2 dfroehli 2020-01-03 12:29:55 UTC
Created attachment 1649409 [details]
cert syncer log

Comment 3 dfroehli 2020-01-03 12:31:30 UTC
Happy new year!
I see the same message on OCP 4.2.12 x86_64. The message is in the cert-syncer container.
My system is on the Red Hat VPN; please ping me on Hangouts if you would like to see for yourself.
Cheers Dan

Comment 4 Carvel Baus 2020-01-03 14:57:34 UTC
If the same thing is happening on x86, then that consistency would suggest it's not S/390-specific. I will dig a little deeper and see what's happening. Before the error message on each line is "type: 'Warning'", so it may be just that, or the error is getting incorrectly wrapped as a warning for the event. Will look into it further.

Comment 5 dfroehli 2020-01-03 16:01:36 UTC
Thx for looking into this. What makes my cluster a little bit special is that I use a non-public CA as the signer for the ingress router server cert. I.e. *.apps.ocp4... is served with a cert signed by a CA which is by default NOT in the RHCOS trust store (actually, it is signed by the Red Hat Internal CA).
I followed the installation guidance to add the CA Trust Chain as described here:
https://docs.openshift.com/container-platform/4.2/networking/configuring-a-custom-pki.html
However, this is NOT true for api.ocp4..., which uses the OpenShift-internal certs.
Just wanted to mention this, as the message is about certs.

Comment 6 Carvel Baus 2020-01-03 17:51:08 UTC
This appears to be underway already in the linked bug. That PR is for 4.3, and there is an open/pending discussion about backporting to 4.2.x.

Comment 7 dfroehli 2020-01-03 20:11:20 UTC
Thx. I don't need a backport for this. 4.3 is (hopefully) coming soon, and this seems not to have any impact besides being annoying. Thx again! Dan

Comment 8 pk 2020-02-17 09:33:31 UTC
Hi, is this PR available for 4.2.x?

Comment 9 cheeva 2020-02-17 09:54:47 UTC
I have this issue in OCP 4.2.16.

Comment 10 Carvel Baus 2020-02-17 17:07:22 UTC
I am not aware of a fix for this in 4.2.x. According to the linked bug above, it appears to be corrected for 4.3.

Comment 11 pk 2020-02-18 03:01:35 UTC
Hi Carvel,

There are some customers in our region who can't upgrade their clusters to 4.3 due to application compatibility issues. Can we request that this bug fix be backported to 4.2.x?

Comment 12 Carvel Baus 2020-02-18 12:55:29 UTC
This bug tracks the issue for non-x86 architectures. I think you'll need to open a new bug for the backport for x86 and reference the bug with the actual fix.

Comment 14 David Hernández Fernández 2020-02-21 09:20:35 UTC
@Ashish Prajapati: Have you opened the bug for 4.2.z on x86? Let us know in order to not duplicate the issue.

Comment 17 Stefan Schimanski 2020-02-24 12:56:47 UTC
*** Bug 1806089 has been marked as a duplicate of this bug. ***

Comment 19 Xingxing Xia 2020-03-10 02:53:22 UTC
(Just FYI) per the above comments (TL;DR), not sure if this bug is a duplicate of bug 1780243#c2.

Comment 20 Ke Wang 2020-03-18 11:04:11 UTC
Verified with OCP build 4.2.0-0.nightly-2020-03-16-141929:

$ oc get events | grep "Removed file for secret"

Nothing found, so moving the bug to VERIFIED.

Comment 23 errata-xmlrpc 2020-04-02 11:02:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0936

