
Bug 1746521

Summary: csr approver might not approve server csr
Product: OpenShift Container Platform
Reporter: Michael Gugino <mgugino>
Component: Cloud Compute
Assignee: Andrew McDermott <amcdermo>
Status: CLOSED ERRATA
QA Contact: Jianwei Hou <jhou>
Severity: urgent
Priority: urgent
Version: 4.2.0
CC: agarcial, amcdermo, brad.ison
Target Milestone: ---
Target Release: 4.2.0
Hardware: x86_64
OS: Linux
Last Closed: 2019-10-16 06:38:14 UTC
Type: Bug

Description Michael Gugino 2019-08-28 16:34:46 UTC
Description of problem:
Sometimes, when a new machine is added via the machine-api, its server-side certificate signing request (CSR) is not approved.

This appears to be caused by a race between the csr-approver and the node-link controller. If the node-link controller is even slightly delayed in adding the node reference to the Machine object, the csr-approver gives up on the CSR after 5 retries. Because there is no back-off between those retries, all 5 can be exhausted within milliseconds.

As a result, a server-side CSR occasionally goes unapproved. Most of the time the node-link controller is fast enough to prevent this.
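
To illustrate why the missing back-off matters, here is a minimal, stdlib-only Go sketch. It is not the csr-approver code: the names backoffDelay, retryWithBackoff, and simulate are hypothetical, as are the 10 ms base delay and the 50 ms node-link delay, and "5 retries" is assumed to mean one initial attempt plus five retries. The loop runs against a simulated clock, so it shows only how the retry budget is consumed with and without a delay between attempts.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoTargetMachine mirrors the "No target machine" message seen in the logs.
var errNoTargetMachine = errors.New("No target machine")

// backoffDelay returns the wait before retry number attempt (0-based):
// base doubled once per earlier attempt, capped at maxDelay.
func backoffDelay(attempt int, base, maxDelay time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		return maxDelay
	}
	return d
}

// retryWithBackoff calls sync once, then up to maxRetries more times,
// sleeping between attempts. It returns the number of calls made and
// the last error (nil on success).
func retryWithBackoff(maxRetries int, base, maxDelay time.Duration,
	sleep func(time.Duration), sync func() error) (int, error) {
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = sync(); err == nil {
			return attempt + 1, nil
		}
		if attempt < maxRetries {
			sleep(backoffDelay(attempt, base, maxDelay))
		}
	}
	return maxRetries + 1, err
}

// simulate runs the loop against a fake clock. The node reference is
// assumed to appear once readyAfter of simulated time has passed.
func simulate(readyAfter, base, maxDelay time.Duration, maxRetries int) (int, time.Duration, error) {
	var clock time.Duration
	sleep := func(d time.Duration) { clock += d }
	sync := func() error {
		if clock < readyAfter {
			return errNoTargetMachine
		}
		return nil
	}
	attempts, err := retryWithBackoff(maxRetries, base, maxDelay, sleep, sync)
	return attempts, clock, err
}

func main() {
	ready := 50 * time.Millisecond // hypothetical node-link delay
	for _, base := range []time.Duration{0, 10 * time.Millisecond} {
		attempts, waited, err := simulate(ready, base, time.Second, 5)
		fmt.Printf("base=%v attempts=%d waited=%v err=%v\n", base, attempts, waited, err)
	}
}
```

With a zero base delay the simulated clock never advances, so every attempt fails and the CSR is dropped. With a 10 ms base the loop waits 10, 20, and 40 ms and succeeds on the fourth attempt. A real controller would more likely use a rate-limited work queue than a hand-written loop.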

Version-Release number of selected component (if applicable):


How reproducible:
~10%

Steps to Reproduce:
1. Install a new cluster
2. Look at the csr-approver logs

Actual results:
A server CSR might not be approved

Expected results:
All CSRs are approved

Additional info:

Timestamps demonstrating how fast the loop is (csr-bsss4 fails twice and is approved on the third attempt, all within about 110 ms):

I0828 04:56:27.129796       1 main.go:107] CSR csr-jzpbj added
I0828 04:56:27.159898       1 main.go:147] CSR csr-jzpbj approved
I0828 04:56:27.532187       1 main.go:107] CSR csr-bsss4 added
I0828 04:56:27.545479       1 main.go:132] CSR csr-bsss4 not authorized: No target machine
I0828 04:56:27.545689       1 main.go:164] Error syncing csr csr-bsss4: No target machine
I0828 04:56:27.551968       1 main.go:107] CSR csr-bsss4 added
I0828 04:56:27.581094       1 main.go:132] CSR csr-bsss4 not authorized: No target machine
I0828 04:56:27.581145       1 main.go:164] Error syncing csr csr-bsss4: No target machine
I0828 04:56:27.591878       1 main.go:107] CSR csr-bsss4 added
I0828 04:56:27.642413       1 main.go:147] CSR csr-bsss4 approved
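
The elapsed time between two of the log lines above can be checked mechanically. The following Go sketch assumes the header layout visible in these lines (a severity letter followed by mmdd hh:mm:ss.uuuuuu); parseStamp and elapsedMicros are hypothetical helper names, not part of any logging library.

```go
package main

import (
	"fmt"
	"time"
)

// klogStamp is the Go time layout matching the "mmdd hh:mm:ss.uuuuuu"
// portion of the log header shown above.
const klogStamp = "0102 15:04:05.000000"

// parseStamp skips the leading severity letter and parses the timestamp.
// The header carries no year, so the result is only useful for differences.
func parseStamp(line string) (time.Time, error) {
	if len(line) < 1+len(klogStamp) {
		return time.Time{}, fmt.Errorf("line too short: %q", line)
	}
	return time.Parse(klogStamp, line[1:1+len(klogStamp)])
}

// elapsedMicros returns the microseconds between two log lines.
func elapsedMicros(first, last string) (int64, error) {
	a, err := parseStamp(first)
	if err != nil {
		return 0, err
	}
	b, err := parseStamp(last)
	if err != nil {
		return 0, err
	}
	return b.Sub(a).Microseconds(), nil
}

func main() {
	first := "I0828 04:56:27.532187       1 main.go:107] CSR csr-bsss4 added"
	last := "I0828 04:56:27.642413       1 main.go:147] CSR csr-bsss4 approved"
	us, err := elapsedMicros(first, last)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("csr-bsss4: first add to approval took %d microseconds\n", us)
}
```

For csr-bsss4 this gives 110226 microseconds from the first "added" line to approval, covering two failed syncs and one successful one.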

Comment 4 errata-xmlrpc 2019-10-16 06:38:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922