Description of problem:
We patched our OpenShift Dedicated clusters for StackGuard. Since this was a kernel + glibc update, all nodes and masters were rebooted in series. All services should have remained available during the reboots; however, after the reboots it appears custom rolebindings are missing.

Version-Release number of selected component (if applicable):
3.4.1.18-1.git.0.0f9d380.el7

How reproducible:
Two clusters so far.

Steps to Reproduce:
1. Apply RHSA-2017:1481 and RHSA-2017:1484; reboot all hosts in series
2. Create an application from a template (see the sketch under Additional info below)

Actual results:
esb-template-nodejs Service Warning CreatingLoadBalancerFailed {service-controller } Error creating load balancer (will retry): Failed to create load balancer for service esb-templates-sit/esb-template-nodejs: error describing subnets: error listing AWS subnets: UnauthorizedOperation: You are not authorized to perform this operation.

Expected results:
Operations that worked prior to the reboot should continue to work.

Additional info:
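For reference, a rough sketch of the reproduction in step 2; the template name and namespace are inferred from the event message above and are assumptions, not confirmed commands:

# assumed reproduction of step 2 (template/namespace names inferred from the event)
oc new-project esb-templates-sit
oc new-app --template=esb-template-nodejs -n esb-templates-sit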
The OpenShift apiserver does not modify rolebindings at startup if any such resources already exist. If the cluster is large enough, you may simply be waiting for the authorization cache to re-prime. Could you check whether the rolebinding in question (and its role) still exists?
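A minimal sketch of those checks, assuming the binding is namespace-scoped; <namespace>, <binding-name>, and <role-name> are placeholders:

# list bindings in the affected namespace and cluster-wide
oc get rolebindings -n <namespace>
oc get clusterrolebindings
# inspect the specific binding and the role it references
oc describe rolebinding <binding-name> -n <namespace>
oc get role <role-name> -n <namespace> -o yaml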
Can you provide the commands used to create the custom bindings, and the following info about them:

1. Were they rolebindings or clusterrolebindings? If rolebindings, in what namespace and bound to what role?
2. What command/manifest was used to create the binding?
3. Can you provide the output of the following from the current cluster:

oc get clusterpolicy/default -o yaml
oc get clusterpolicybinding/:default -o yaml

and, if the missing binding was a rolebinding:

oc get policy/default -o yaml -n <binding-namespace>
oc get policybinding/:default -o yaml -n <binding-namespace>

Are there any role-related commands run as part of a shutdown or bring-up script?
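For context on question 2, custom bindings on a 3.x cluster are typically created with commands like the following (a sketch only; the role, user, and namespace names are placeholders, not taken from this report):

# namespace-scoped binding
oc policy add-role-to-user <role> <user> -n <namespace>
# cluster-scoped binding
oc adm policy add-cluster-role-to-user <cluster-role> <user>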
At this point we have been unable to reproduce the problem and do not have adequate data to make further progress. I apologize, but I am closing this bug as INSUFFICIENT_DATA since we will be unable to resolve the issue without it.