Description of problem:

The logs show:

    Copying system trust bundle
    cp: cannot remove '/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem': Read-only file system

The root cause is that the customer was running a PoC of a security product, StackRox. The product was correctly creating its own SCC, but they had given their SCC a priority of 100 with RunAsAny and readOnlyRootFilesystem: true. This put its priority ahead of anyuid for certain operators, causing them to crash as seen above.

Similar to the DefaultSecurityContextConstraints_Mutated alerting, how can we prevent an ill-advised SCC from negatively impacting the platform and ensure ISVs are configuring things correctly?

Version-Release number of selected component (if applicable):
4.3.x

How reproducible:
Always

Steps to Reproduce:
1. Create an SCC as noted above
2. Observe operators like authentication

Actual results:
Operators in crashloop without an obvious reason related to the SCC changes.

Expected results:
Warning or guidance around these types of platform-impacting changes.

Additional info:
Setting the priority to less than anyuid fixed the issue.
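As a rough illustration of the kind of warning asked for under "Expected results", here is a minimal Python sketch that flags an SCC manifest which could outrank anyuid. It is not an existing OpenShift check. The function name risky_scc_reasons and the ANYUID_PRIORITY constant are hypothetical, and the value 10 assumes the anyuid priority shipped in a default install. The dict mirrors the SCC fields named in this report (priority, readOnlyRootFilesystem, runAsUser.type).

```python
from typing import List

# Assumption: anyuid ships with priority 10 in a default install.
ANYUID_PRIORITY = 10


def risky_scc_reasons(scc: dict) -> List[str]:
    """Return human-readable warnings for an SCC that may outrank anyuid.

    `scc` is a parsed SCC manifest (for example loaded from YAML).
    A missing or null priority is treated as 0.
    """
    reasons = []
    priority = scc.get("priority") or 0
    if priority > ANYUID_PRIORITY:
        reasons.append(
            f"priority {priority} is higher than anyuid ({ANYUID_PRIORITY})"
        )
        # A read-only root filesystem only matters here if the SCC can
        # actually be preferred over anyuid.
        if scc.get("readOnlyRootFilesystem"):
            reasons.append("readOnlyRootFilesystem is true")
    return reasons


# Hypothetical manifest with the values described in this report.
collector = {
    "metadata": {"name": "collector"},
    "priority": 100,
    "readOnlyRootFilesystem": True,
    "runAsUser": {"type": "RunAsAny"},
}

for reason in risky_scc_reasons(collector):
    print("WARNING:", reason)
```

Lowering the priority in the manifest below the anyuid value makes the function return an empty list, which matches the workaround noted under "Additional info".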
DefaultSecurityContextConstraints_Mutated is going to be reverted. The PRs are merged, and the next z-stream release should have it removed. The other topic must be analyzed. Have you done a comparison between the original and the installed SCC? It is hard to believe that equal SCCs behave differently.
It's not that two equal SCCs are behaving differently; it's the impact a 3rd-party SCC can have on the platform components.

A default install:

    oc get pod authentication-operator-7fb9bc495c-5pt9p -o yaml | grep scc
        openshift.io/scc: anyuid
    oc get pod oauth-openshift-594478b797-xkgxj -o yaml | grep scc
        openshift.io/scc: anyuid

A 3rd-party tool comes along and creates its own SCC, as it should, but the SCC creates a conflict with anyuid.

    oc apply -f securitycontextconstraints-collector.yaml
    securitycontextconstraints.security.openshift.io/collector created

The full SCC is attached above. For a while, nothing may change, as all of the pods are already running. Then an oauth change happens and the oauth pods start rolling:

    oc get pods oauth-openshift-594478b797-9gc98 -o yaml | grep scc
        openshift.io/scc: collector

The first pod goes into CrashLoopBackOff because it is now using the collector SCC (which has a higher priority and sets a read-only root filesystem) instead of anyuid, which leads to the pods failing. This would be a bigger issue during an upgrade event.

There are 4 operators and the oauth pods that use the anyuid SCC:
authentication-operator, oauth-openshift, cluster-node-tuning-operator, openshift-service-catalog-apiserver-operator and openshift-service-catalog-controller-manager-operator
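To make the ordering effect above concrete, here is a minimal Python sketch of priority-based SCC selection. It is a simplification, not the actual admission code. It assumes every candidate SCC is usable by the pod's service account and would validate the pod, and it breaks priority ties by name only. The SCC class and pick_scc function are hypothetical names, and the anyuid priority of 10 is the value assumed for a default install.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SCC:
    name: str
    priority: Optional[int]  # None is treated as 0
    read_only_root_fs: bool


def pick_scc(candidates: List[SCC]) -> SCC:
    """Pick the SCC a pod would get under the simplified rules above.

    Highest priority wins; ties are broken alphabetically by name.
    """
    ordered = sorted(candidates, key=lambda s: (-(s.priority or 0), s.name))
    return ordered[0]


# Default install (assumed values): anyuid outranks restricted.
default_sccs = [
    SCC("restricted", None, False),
    SCC("anyuid", 10, False),
]
print(pick_scc(default_sccs).name)

# After the 3rd-party SCC from this report is created with priority 100.
with_collector = default_sccs + [SCC("collector", 100, True)]
chosen = pick_scc(with_collector)
print(chosen.name, chosen.read_only_root_fs)
```

Already-running pods keep the SCC they were admitted with, which is why the change only shows up once the oauth pods roll.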
This is caused by the oauth-server pods not being specific enough about their security context and their service account's privileges being too broad.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409