Bug 1785610

Summary: kube-apiserver /%!(EXTRA *errors.StatusError=secrets "user-serving-cert" not found
Product: OpenShift Container Platform
Reporter: Alexander Klein <alklein>
Component: kube-apiserver
Assignee: Stefan Schimanski <sttts>
Status: CLOSED ERRATA
QA Contact: Ke Wang <kewang>
Severity: medium
Docs Contact:
Priority: low
Version: 4.2.z
CC: amccrae, aos-bugs, aprajapa, cbaus, cheeva.tee, crawford, dahernan, dbenoit, dfroehli, dgilmore, mfojtik, pchoo, sttts, xxia
Target Milestone: ---   
Target Release: 4.2.z   
Hardware: All   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-04-02 11:02:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1780243    
Bug Blocks:    
Attachments:
Description        Flags
apiserver log      none
cert syncer log    none

Description Alexander Klein 2019-12-20 12:49:55 UTC
Created attachment 1646796 [details]
apiserver log

Description of problem:

cluster installed with installer version  4.2.12-s390x
rhcos version 4.2.10-s390x


Every ~5 minutes the kube-apiserver throws the event
Removed file for secret: /%!(EXTRA *errors.StatusError=secrets "user-serving-cert" not found)

It recovers automatically, but something seems to be broken here.

Comment 1 Carvel Baus 2020-01-02 21:03:23 UTC
The attached log file does not contain the error as described. Is this the correct log file?

Comment 2 dfroehli 2020-01-03 12:29:55 UTC
Created attachment 1649409 [details]
cert syncer log

Comment 3 dfroehli 2020-01-03 12:31:30 UTC
Happy new year!
I see the same message on OCP V4.12 X86_64. Message is in cert-syncer container.
My system is on the Red Hat VPN; please ping me on Hangouts if you would like to see for yourself.
Cheers Dan

Comment 4 Carvel Baus 2020-01-03 14:57:34 UTC
If the same thing is happening on x86, then that consistency would suggest it's not S/390 specific. I will dig a little deeper and see what's happening. Each line has "type: 'Warning'" before the error message, so it may be just a warning, or the error is getting incorrectly wrapped as a warning for the event. Will look into it further.

Comment 5 dfroehli 2020-01-03 16:01:36 UTC
Thx for looking into this. What makes my cluster a little bit special is that I use a non-public CA as a signer for the ingress router server cert. I.e. *.apps.ocp4... is served with a cert which is signed by a CA which is by default NOT in the RHCOS trust store (actually, it is signed by the Red Hat Internal CA).
I followed the installation guidance to add the CA Trust Chain as described here:
https://docs.openshift.com/container-platform/4.2/networking/configuring-a-custom-pki.html
However, this is NOT true for api.ocp4..., which uses certs issued internally by OpenShift.
Just wanted to mention this, as the message is about certs.

Comment 6 Carvel Baus 2020-01-03 17:51:08 UTC
This appears to be underway already in the linked bug. That PR is for 4.3, and there is an open/pending discussion about backporting to 4.2.x.

Comment 7 dfroehli 2020-01-03 20:11:20 UTC
Thx. I don't need a backport for this. 4.3 is (hopefully) coming soon, and this does not seem to have any impact besides being annoying. Thx again! Dan

Comment 8 pk 2020-02-17 09:33:31 UTC
Hi, is this PR available for 4.2.x?

Comment 9 cheeva 2020-02-17 09:54:47 UTC
I have this issue in OCP 4.2.16.

Comment 10 Carvel Baus 2020-02-17 17:07:22 UTC
I am not aware of a fix for this in 4.2.x. According to the linked bug above, it appears to be corrected for 4.3.

Comment 11 pk 2020-02-18 03:01:35 UTC
Hi Carvel,

There are some customers in our region who can't upgrade their clusters to 4.3 due to application compatibility issues. Can we request that this bug fix be backported to 4.2.x?

Comment 12 Carvel Baus 2020-02-18 12:55:29 UTC
This bug tracks the issue for non-x86 architectures. I think you'll need to open a new bug for the backport for x86 and reference the bug with the actual fix.

Comment 14 David Hernández Fernández 2020-02-21 09:20:35 UTC
@Ashish Prajapati: Have you opened the bug for 4.2.z on x86? Let us know in order to not duplicate the issue.

Comment 17 Stefan Schimanski 2020-02-24 12:56:47 UTC
*** Bug 1806089 has been marked as a duplicate of this bug. ***

Comment 19 Xingxing Xia 2020-03-10 02:53:22 UTC
(Just FYI) Per the above comments (TL;DR), I am not sure whether this bug is a duplicate of bug 1780243#c2.

Comment 20 Ke Wang 2020-03-18 11:04:11 UTC
Verified with OCP build 4.2.0-0.nightly-2020-03-16-141929:

$ oc get events | grep "Removed file for secret"

Nothing found, so moving the bug to VERIFIED.

Comment 23 errata-xmlrpc 2020-04-02 11:02:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0936