Bug 1464569 - custom role bindings did not survive a cluster reboot
Product: OpenShift Container Platform
Classification: Red Hat
Component: Auth
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.4.z
Assigned To: Jordan Liggitt
Chuan Yu
: OpsBlocker
Depends On:
Reported: 2017-06-23 15:22 EDT by Sten Turpin
Modified: 2017-08-16 18:02 EDT
Last Closed: 2017-08-16 18:02:42 EDT
Type: Bug
Description Sten Turpin 2017-06-23 15:22:57 EDT
Description of problem: We patched our OpenShift Dedicated clusters for StackGuard. Since this was a kernel + glibc update, all nodes and masters were rebooted in series. All services should have remained available during the reboots; however, after the reboots, custom rolebindings appear to be missing.

Version-Release number of selected component (if applicable):

How reproducible: two clusters so far

Steps to Reproduce:
1. Apply RHSA-2017:1481, RHSA-2017:1484; reboot all hosts in series
2. Create application from template

Actual results:
esb-template-nodejs   Service               Warning   CreatingLoadBalancerFailed   {service-controller }   Error creating load balancer (will retry): Failed to create load balancer for service esb-templates-sit/esb-template-nodejs: error describing subnets: error listing AWS subnets: UnauthorizedOperation: You are not authorized to perform this operation.

Expected results:
Operations that worked prior to reboot should continue to work. 

Additional info:
Comment 1 David Eads 2017-06-23 16:04:31 EDT
The OpenShift API server does not modify rolebindings during startup if any such resources already exist. If the cluster is large enough, you may be waiting for the authorization cache to re-prime.

Could you check to see if the rolebinding in question (and its role) still exists?
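One way to answer this kind of question is to compare a listing of bindings taken before the reboot with one taken afterwards. The following is only a minimal sketch under assumptions: the helper name `missing_bindings` and the snapshot file names are hypothetical, and it assumes a listing was saved before the reboot, which this report does not confirm.

```shell
#!/bin/sh
# Hypothetical helper: print the entries that appear in the "before"
# snapshot but not in the "after" snapshot. Each snapshot is a text file
# with one binding name per line.
missing_bindings() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # comm needs sorted input; sort -u also drops duplicate lines.
    sort -u "$1" > "$before_sorted"
    sort -u "$2" > "$after_sorted"
    # -23 suppresses lines unique to the second file and lines common to
    # both, leaving only the lines unique to the "before" snapshot.
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Assumed usage (namespace and file names are placeholders):
#   oc get rolebindings -n <binding-namespace> -o name > before.txt  # pre-reboot
#   oc get rolebindings -n <binding-namespace> -o name > after.txt   # post-reboot
#   missing_bindings before.txt after.txt
```

An empty result would suggest the bindings themselves survived, which would point back toward the authorization cache or some other cause rather than deleted objects.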
Comment 2 Jordan Liggitt 2017-06-24 11:04:53 EDT
Can you provide the commands used to create the custom bindings, and the following info about them:

1. Were they rolebindings or clusterrolebindings? If rolebindings, in what namespace and to what role?
2. What command/manifest was used to create the binding?
3. Can you provide the output of the following from the current cluster:
    oc get clusterpolicy/default -o yaml
    oc get clusterpolicybinding/:default -o yaml

  and if the missing binding was a rolebinding:
    oc get policy/default -o yaml -n <binding-namespace>
    oc get policybinding/:default -o yaml -n <binding-namespace>

Are there any role-related commands run as part of a shutdown or bring-up script?
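The four commands requested above could be gathered in one pass with a small wrapper. This is only a sketch: the function name `collect_policy` and the `OC` override variable are hypothetical conveniences, and the `oc` invocations are copied from this comment unchanged.

```shell
#!/bin/sh
# OC can be overridden (for example OC=echo) to do a dry run that only
# prints the arguments instead of contacting a cluster.
OC=${OC:-oc}

# collect_policy [namespace]
# Dumps the cluster-level policy objects and, when a namespace is given,
# the namespaced policy objects as well.
collect_policy() {
    ns=${1:-}
    "$OC" get clusterpolicy/default -o yaml
    "$OC" get clusterpolicybinding/:default -o yaml
    if [ -n "$ns" ]; then
        "$OC" get policy/default -o yaml -n "$ns"
        "$OC" get policybinding/:default -o yaml -n "$ns"
    fi
}

# Assumed usage (namespace is a placeholder):
#   collect_policy <binding-namespace> > policy-dump.yaml
```

Calling it without an argument covers the clusterrolebinding case; passing the namespace adds the two namespaced queries for a missing rolebinding.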
Comment 10 Eric Paris 2017-08-16 18:02:42 EDT
At this point we have been unable to reproduce the problem and do not have adequate data to make further progress. I apologize, but I am closing this bug as INSUFFICIENT_DATA since we will be unable to resolve the issue.
