Bug 1464569 - custom role bindings did not survive a cluster reboot
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: apiserver-auth
Version: 3.4.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.4.z
Assignee: Jordan Liggitt
QA Contact: Chuan Yu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-06-23 19:22 UTC by Sten Turpin
Modified: 2020-09-10 10:46 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-16 22:02:42 UTC
Target Upstream Version:
Embargoed:


Description Sten Turpin 2017-06-23 19:22:57 UTC
Description of problem: We patched our OpenShift Dedicated clusters for StackGuard. Since this was a kernel + glibc update, all nodes and masters were rebooted in series. All services should have remained available during the reboots; however, after the reboots, custom rolebindings appear to be missing.


Version-Release number of selected component (if applicable): 3.4.1.18-1.git.0.0f9d380.el7


How reproducible: two clusters so far


Steps to Reproduce:
1. Apply RHSA-2017:1481, RHSA-2017:1484; reboot all hosts in series
2. Create application from template

Actual results:
esb-template-nodejs   Service               Warning   CreatingLoadBalancerFailed   {service-controller }   Error creating load balancer (will retry): Failed to create load balancer for service esb-templates-sit/esb-template-nodejs: error describing subnets: error listing AWS subnets: UnauthorizedOperation: You are not authorized to perform this operation.

Expected results:
Operations that worked prior to reboot should continue to work. 

Additional info:

Comment 1 David Eads 2017-06-23 20:04:31 UTC
The openshift apiserver does not modify rolebindings as part of startup if any such resources already exist.  If the cluster is large enough, you may be waiting for the authorization cache to re-prime.

Could you check to see if the rolebinding in question (and its role) still exists?

Comment 2 Jordan Liggitt 2017-06-24 15:04:53 UTC
Can you provide the commands used to create the custom bindings, and the following info about them:

1. Were they rolebindings or clusterrolebindings? If rolebindings, in what namespace and to what role?
2. What command/manifest was used to create the binding?
3. Can you provide the output of the following from the current cluster:
    oc get clusterpolicy/default -o yaml
    oc get clusterpolicybinding/:default -o yaml

  and if the missing binding was a rolebinding:
    oc get policy/default -o yaml -n <binding-namespace>
    oc get policybinding/:default -o yaml -n <binding-namespace>

Are there any role-related commands run as part of a shutdown or bring-up script?
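
The output requested in item 3 could be gathered in one pass with something like the sketch below. It is not part of the original request: the function name `binding_diag_commands` and its optional namespace argument are hypothetical, and the function only prints the four `oc get` commands quoted above so they can be reviewed before anything is run against the cluster.

```shell
# Hypothetical helper: print the diagnostic commands listed in this comment.
# With no argument, only the cluster-scoped commands are printed.
# With a namespace argument, the namespaced policy commands are added.
binding_diag_commands() {
    ns="$1"
    printf '%s\n' "oc get clusterpolicy/default -o yaml"
    printf '%s\n' "oc get clusterpolicybinding/:default -o yaml"
    if [ -n "$ns" ]; then
        printf '%s\n' "oc get policy/default -o yaml -n $ns"
        printf '%s\n' "oc get policybinding/:default -o yaml -n $ns"
    fi
}
```

Assuming a logged-in `oc` session, the printed commands could then be piped to a shell and captured, for example `binding_diag_commands <binding-namespace> | sh > binding-diag.yaml`, with `<binding-namespace>` replaced by the namespace of the missing rolebinding.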

Comment 10 Eric Paris 2017-08-16 22:02:42 UTC
At this point we have been unable to reproduce the problem and do not have adequate data to make further progress. I apologize, but I am closing this bug as INSUFFICIENT_DATA since we will be unable to resolve the issue.

