Description of problem:
When a new machine is added via the machine-api, the server-side cert is occasionally not approved. This appears to be caused by a race condition between the csr-approver and the node-link controller. If the node-link controller is even slightly late in adding the node-ref to the machine object, the csr-approver gives up on the CSR after 5 retries. The root cause is that there is no back-off between those retries, so all 5 can be exhausted within milliseconds. Most of the time the node-link controller is fast enough to prevent this, but when it is not, the server-side CSR is left unapproved.

Version-Release number of selected component (if applicable):

How reproducible:
~10%

Steps to Reproduce:
1. Install new cluster
2. Look at csr-approver logs

Actual results:
CSR might not be approved

Expected results:
All CSRs approved

Additional info:
Timestamps demonstrating how fast the loop is:

I0828 04:56:27.129796 1 main.go:107] CSR csr-jzpbj added
I0828 04:56:27.159898 1 main.go:147] CSR csr-jzpbj approved
I0828 04:56:27.532187 1 main.go:107] CSR csr-bsss4 added
I0828 04:56:27.545479 1 main.go:132] CSR csr-bsss4 not authorized: No target machine
I0828 04:56:27.545689 1 main.go:164] Error syncing csr csr-bsss4: No target machine
I0828 04:56:27.551968 1 main.go:107] CSR csr-bsss4 added
I0828 04:56:27.581094 1 main.go:132] CSR csr-bsss4 not authorized: No target machine
I0828 04:56:27.581145 1 main.go:164] Error syncing csr csr-bsss4: No target machine
I0828 04:56:27.591878 1 main.go:107] CSR csr-bsss4 added
I0828 04:56:27.642413 1 main.go:147] CSR csr-bsss4 approved
https://github.com/openshift/cluster-machine-approver/pull/41
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922