Bug 1717610
| Summary: | autoapproval tolerances too strict for some large scale ups | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Justin Pierce <jupierce> |
| Component: | Cloud Compute | Assignee: | Michael Gugino <mgugino> |
| Status: | CLOSED ERRATA | QA Contact: | Jianwei Hou <jhou> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.1.0 | CC: | agarcial, aos-bugs, brad.ison, brad.williams, nagrawal, sttts, zhsun |
| Target Milestone: | --- | ||
| Target Release: | 4.2.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-10-16 06:31:10 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Justin Pierce
2019-06-05 19:44:58 UTC
Do we have logs of the approver?

Justin Pierce, can we get an answer for "did this happen because they have Disabled bootstrap csr approval?" and machine approval logs? Thanks!

@Alberto - csr approval was not disabled at any time. Disabling csr approval in https://bugzilla.redhat.com/show_bug.cgi?id=1717602 was only mentioned as a means to reproduce that specific problem easily. The behaviors described in this BZ are by design in the approval code itself and reproducible. Those behaviors are inconsistent with [1] rapid scale ups since they can create > 100 pending CSRs and [2] slow scale ups where a cloud provider may not be able to supply an instance in a timely manner since the machine timestamp and the CSR timestamp will be significantly different.

Both issues can be identified in this log snippet:

I0705 14:37:01.863893 1 main.go:164] Error syncing csr csr-4j59g: CSR csr-4j59g creation time 2019-07-05 14:37:01 +0000 UTC not in range (2019-07-05 14:15:07 +0000 UTC, 2019-07-05 14:25:17 +0000 UTC)
I0705 14:37:01.944105 1 main.go:107] CSR csr-4j59g added
I0705 14:37:01.985857 1 main.go:132] CSR csr-4j59g not authorized: CSR csr-4j59g creation time 2019-07-05 14:37:01 +0000 UTC not in range (2019-07-05 14:15:07 +0000 UTC, 2019-07-05 14:25:17 +0000 UTC)
E0705 14:37:01.985885 1 main.go:174] CSR csr-4j59g creation time 2019-07-05 14:37:01 +0000 UTC not in range (2019-07-05 14:15:07 +0000 UTC, 2019-07-05 14:25:17 +0000 UTC)
I0705 14:37:01.985894 1 main.go:175] Dropping CSR "csr-4j59g" out of the queue: CSR csr-4j59g creation time 2019-07-05 14:37:01 +0000 UTC not in range (2019-07-05 14:15:07 +0000 UTC, 2019-07-05 14:25:17 +0000 UTC)
I0705 14:37:08.135062 1 main.go:107] CSR csr-qmnck added
I0705 14:37:08.135138 1 main.go:115] ignoring all CSRs as too many recent pending CSRs seen: 101
I0705 14:37:11.790037 1 main.go:107] CSR csr-tcxpt added
I0705 14:37:11.790112 1 main.go:115] ignoring all CSRs as too many recent pending CSRs seen: 102
I0705 14:37:52.393836 1 main.go:107] CSR csr-gk2ql added
I0705 14:37:52.393911 1 main.go:115] ignoring all CSRs as too many recent pending CSRs seen: 103
I0705 14:37:59.393294 1 main.go:107] CSR csr-z9qgx added

Created attachment 1587735 [details]
approver logs
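
For readers following the log snippet above, here is a minimal Python sketch of the two rejection paths it shows. It is not the approver's actual code (the approver is a Go program). The function names, the strict window comparison, and the `MAX_PENDING_CSRS` constant are assumptions for illustration. The window bounds and the CSR creation time are copied from the log, and the limit of 100 is inferred from the "too many recent pending CSRs seen: 101" message.

```python
from datetime import datetime, timedelta, timezone

# Assumption: inferred from the log, where the first "ignoring all CSRs"
# message appears once 101 pending CSRs have been seen.
MAX_PENDING_CSRS = 100


def csr_in_window(csr_created, window_start, window_end):
    """Return True when the CSR creation time lies inside the allowed window.

    Whether the real check is strict or inclusive at the bounds is not
    stated in the report; a strict comparison is assumed here.
    """
    return window_start < csr_created < window_end


def should_ignore_all(pending_csrs, limit=MAX_PENDING_CSRS):
    """Return True when the pending CSR count exceeds the limit."""
    return pending_csrs > limit


def utc(hour, minute, second):
    """Build a UTC timestamp on 2019-07-05, the date in the log snippet."""
    return datetime(2019, 7, 5, hour, minute, second, tzinfo=timezone.utc)


# Values copied from the "not in range" log line for csr-4j59g.
window_start = utc(14, 15, 7)
window_end = utc(14, 25, 17)
csr_created = utc(14, 37, 1)

in_window = csr_in_window(csr_created, window_start, window_end)
lateness = csr_created - window_end

print("csr-4j59g inside window:", in_window)
print("created after window end by:", lateness)
print("ignore all at 100 pending:", should_ignore_all(100))
print("ignore all at 101 pending:", should_ignore_all(101))
```

With the logged values, the CSR was created 11 minutes 44 seconds after the window closed. That matches the slow scale up case [2]. The pending-count check corresponds to the rapid scale up case [1].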
Verified in 4.2.0-0.nightly-2019-09-08-180038
Scaled over 100 nodes, did not see the problem again.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922