Bug 1359900
| Summary: | With multiple API servers they race to bootstrap policy | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jeremy Eder <jeder> |
| Component: | apiserver-auth | Assignee: | Jordan Liggitt <jliggitt> |
| Status: | CLOSED ERRATA | QA Contact: | weiwei jiang <wjiang> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.3.0 | CC: | abutcher, aos-bugs, bleanhar, ekuric, ghuang, gpei, jliggitt, jokerman, mifiedle, mmccomas, tdawson, tstclair, wsun |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text:

Cause: Multiple API servers starting simultaneously with an empty etcd datastore would race to populate the default system policy.

Consequence: A partially created policy could result, leaving a new cluster with a policy that would forbid system components from making some API calls.

Fix: The policy APIs were updated to perform the same resourceVersion checking as other APIs, and fault-tolerant logic was added to the initial policy population step.

Result: New clusters populate default policy as expected.
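The fix described above is optimistic concurrency plus retry. Below is a minimal, self-contained sketch of that pattern, not the actual Origin code; the `Store`, `Policy`, and `populateDefaultPolicy` names are hypothetical stand-ins, and a real API server checks resourceVersion against etcd rather than an in-memory struct.

```go
// A minimal, self-contained sketch of the optimistic-concurrency pattern
// described in the Doc Text above. It is NOT the actual Origin code:
// Store, Policy, and populateDefaultPolicy are hypothetical stand-ins,
// and a real API server checks resourceVersion against etcd, not memory.
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrConflict = errors.New("resourceVersion conflict")

// Policy carries the resourceVersion it was read at, like any other
// versioned API object.
type Policy struct {
	ResourceVersion int
	Rules           []string
}

// Store stands in for the policy registry backed by etcd.
type Store struct {
	mu      sync.Mutex
	current Policy
}

func (s *Store) Get() Policy {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.current
}

// Update applies the write only if the caller's copy is still current;
// this is the resourceVersion precondition the policy APIs gained.
func (s *Store) Update(p Policy) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if p.ResourceVersion != s.current.ResourceVersion {
		return ErrConflict // another API server wrote policy first
	}
	p.ResourceVersion++
	s.current = p
	return nil
}

// populateDefaultPolicy is the fault-tolerant bootstrap step: on a
// conflict it re-reads and retries instead of leaving policy half-written.
func populateDefaultPolicy(s *Store, rules []string) error {
	for attempt := 0; attempt < 5; attempt++ {
		p := s.Get()
		p.Rules = rules
		err := s.Update(p)
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrConflict) {
			return err
		}
	}
	return fmt.Errorf("gave up after repeated conflicts")
}

func main() {
	s := &Store{}
	var wg sync.WaitGroup
	// Three "API servers" bootstrapping at once, as in this bug.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = populateDefaultPolicy(s, []string{"system:masters can *"})
		}()
	}
	wg.Wait()
	fmt.Printf("final policy: %+v\n", s.Get())
}
```

With the precondition in place, a writer that loses the race gets a conflict from a stale resourceVersion and retries from a fresh read, so no partially written policy survives.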
| Story Points: | --- | | |
|---|---|---|---|
| Clone Of: | | Environment: | |
| Last Closed: | 2016-09-27 09:41:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Comment 1
Scott Dodson
2016-07-26 18:55:25 UTC
My test environment was bad. Now that I've re-provisioned the environment, I can no longer reproduce this when specifying the inventory name as an IP.

Created attachment 1184683
/etc/origin, master log and ansible inventory
`oadm policy reconcile-cluster-role-bindings` fixed the issue; existing nodes immediately registered themselves. As to why that was necessary, we're still not sure. This seems to be the result of 3 API servers starting for the first time at the same time. We can work around this in the installer, but it would be nice if the product itself prevented this from being a problem via some sort of locking mechanism (see the sketch after the attached logs below). I'll attach logs.

Ansible work-around: https://github.com/openshift/openshift-ansible/pull/2233

Created attachment 1185156
api server logs
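On the locking question raised above: the eventual fix took a different route, making the bootstrap step idempotent rather than serializing it. A minimal sketch of that idea follows, assuming hypothetical `roleStore` and `ensureRole` names; this is illustrative only and not the code from the upstream PR.

```go
// An illustrative sketch only, not the code from the upstream PR: the
// roleStore and ensureRole names are hypothetical. The idea is that the
// fix made bootstrap idempotent instead of adding a lock, so a master
// that loses the creation race treats "already exists" as success.
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrAlreadyExists = errors.New("already exists")

// roleStore stands in for the etcd-backed role registry.
type roleStore struct {
	mu    sync.Mutex
	roles map[string]bool
}

// Create fails if another bootstrapper created the role first.
func (s *roleStore) Create(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.roles[name] {
		return ErrAlreadyExists
	}
	s.roles[name] = true
	return nil
}

// ensureRole is idempotent: losing the creation race is not an error,
// so every master proceeds through the full default-policy list.
func ensureRole(s *roleStore, name string) error {
	if err := s.Create(name); err != nil && !errors.Is(err, ErrAlreadyExists) {
		return err
	}
	return nil
}

func main() {
	s := &roleStore{roles: map[string]bool{}}
	defaults := []string{"cluster-admin", "system:node", "system:router"}
	var wg sync.WaitGroup
	// Three masters bootstrapping the same defaults concurrently.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for _, r := range defaults {
				_ = ensureRole(s, r)
			}
		}()
	}
	wg.Wait()
	fmt.Println("roles bootstrapped:", len(s.roles)) // always 3
}
```

Idempotent creation means no coordination is needed between masters: whichever server wins each create, all of them converge on the same complete default policy.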
*** Bug 1361313 has been marked as a duplicate of this bug. ***

Fixed upstream in https://github.com/openshift/origin/pull/10099

This has been merged and is in OSE v3.3.0.14 or newer.

Verified with openshift v3.3.0.14: succeeded in installing an HA environment and performing an S2I build.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933