Bug 2024659

Summary: e2e-metal-ipi-ovn-ipv6 failing TestAllowedSCCViaRBAC
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Test Framework Assignee: Derek Higgins <derekh>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.10 CC: derekh, dgoodwin, pmuller
Target Milestone: --- Keywords: Triaged
Target Release: 4.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-02-24 11:44:49 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2022797    
Bug Blocks: 2027492    

Description OpenShift BugZilla Robot 2021-11-18 15:51:01 UTC
+++ This bug was initially created as a clone of Bug #2022797 +++

We're seeing occasional failures of the test
"[sig-auth][Feature:SecurityContextConstraints]  TestAllowedSCCViaRBAC [Suite:openshift/conformance/parallel]"

in the job 4.10-e2e-metal-ipi-ovn-ipv6. The test fails with the following error:

[BeforeEach] [sig-auth][Feature:SecurityContextConstraints]
  github.com/openshift/origin/test/extended/util/client.go:116
Nov 11 02:19:52.233: INFO: configPath is now "/tmp/configfile3736488686"
Nov 11 02:19:52.233: INFO: The user is now "e2e-test-scc-w6h76-user"
Nov 11 02:19:52.233: INFO: Creating project "e2e-test-scc-w6h76"
Nov 11 02:19:52.638: INFO: Waiting on permissions in project "e2e-test-scc-w6h76" ...
Nov 11 02:19:52.720: INFO: Waiting for ServiceAccount "default" to be provisioned...
Nov 11 02:19:52.902: INFO: Waiting for ServiceAccount "deployer" to be provisioned...
Nov 11 02:19:53.086: INFO: Waiting for ServiceAccount "builder" to be provisioned...
Nov 11 02:19:53.270: INFO: Waiting for RoleBinding "system:image-pullers" to be provisioned...
Nov 11 02:19:53.433: INFO: Waiting for RoleBinding "system:image-builders" to be provisioned...
Nov 11 02:19:53.597: INFO: Waiting for RoleBinding "system:deployers" to be provisioned...
Nov 11 02:19:54.362: INFO: Project "e2e-test-scc-w6h76" has been fully provisioned.
[It] TestAllowedSCCViaRBAC [Suite:openshift/conformance/parallel]
  github.com/openshift/origin/test/extended/security/scc.go:91
Nov 11 02:19:54.369: INFO: Creating project "e2e-test-scc-8fx79"
Nov 11 02:19:54.861: INFO: Waiting on permissions in project "e2e-test-scc-8fx79" ...
W1111 02:20:12.803936   57093 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostPID=true)
[AfterEach] [sig-auth][Feature:SecurityContextConstraints]
  github.com/openshift/origin/test/extended/util/client.go:140
STEP: Collecting events from namespace "e2e-test-scc-w6h76".
STEP: Found 1 events.
Nov 11 02:20:13.234: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for test3: { } FailedScheduling: 0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Nov 11 02:20:13.319: INFO: POD    NODE  PHASE    GRACE  CONDITIONS
Nov 11 02:20:13.319: INFO: test3        Pending         [{PodScheduled False 0001-01-01 00:00:00 +0000 UTC 2021-11-11 02:20:12 +0000 UTC Unschedulable 0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.}]
Nov 11 02:20:13.319: INFO:
Nov 11 02:20:13.563: INFO: skipping dumping cluster info - cluster too large
Nov 11 02:20:13.903: INFO: Deleted {user.openshift.io/v1, Resource=users  e2e-test-scc-w6h76-user}, err: <nil>

--- Additional comment from derekh on 2021-11-12 15:35:13 UTC ---

In tests that fail, the following line is common:
W1112 12:20:13.512073 1825198 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostPID=true)



The problem with the job appears to come from the test "[sig-cli] oc adm cluster-role-reapers [Suite:openshift/conformance/parallel]".
This prune command, https://github.com/openshift/origin/blob/master/test/extended/cli/admin.go#L355
        o.Expect(oc.Run("adm", "prune", "auth").Args("clusterrole/edit").Execute()).To(o.Succeed())
prunes the rolebinding that the other test, running at the same time, depends on. This could be causing more than just the TestAllowedSCCViaRBAC test to fail.
I'm thinking we move it to be a serial test.
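
To illustrate the suspected interaction, here is a minimal toy model in Go. It is only a sketch and does not use the real origin test helpers or the Kubernetes API: the `cluster` and `binding` types, the `pruneAuth` and `bindingSurvives` functions, and the namespace name "e2e-test-scc-example" are all hypothetical. It assumes the prune removes every binding that references clusterrole/edit, regardless of namespace, which is the behaviour suspected above.

```go
package main

import "fmt"

// binding is a toy stand-in for a RoleBinding: a namespace plus the role it references.
type binding struct {
	namespace string
	roleRef   string
}

// cluster is a toy stand-in for the cluster state shared by all parallel tests.
type cluster struct {
	bindings []binding
}

// bind records a binding to roleRef in the given namespace.
func (c *cluster) bind(namespace, roleRef string) {
	c.bindings = append(c.bindings, binding{namespace, roleRef})
}

// pruneAuth drops every binding that references roleRef, in any namespace,
// and returns how many were removed (assumed model of the prune in the reapers test).
func (c *cluster) pruneAuth(roleRef string) int {
	var kept []binding
	removed := 0
	for _, b := range c.bindings {
		if b.roleRef == roleRef {
			removed++
			continue
		}
		kept = append(kept, b)
	}
	c.bindings = kept
	return removed
}

// hasBinding reports whether the namespace still has a binding to roleRef.
func (c *cluster) hasBinding(namespace, roleRef string) bool {
	for _, b := range c.bindings {
		if b.namespace == namespace && b.roleRef == roleRef {
			return true
		}
	}
	return false
}

// bindingSurvives reports whether a test that created its own binding still
// finds it when it checks. pruneInBetween models the prune running in parallel
// (between setup and check) rather than serially (after the check).
func bindingSurvives(pruneInBetween bool) bool {
	c := &cluster{}
	c.bind("e2e-test-scc-example", "clusterrole/edit") // setup of the other test
	if pruneInBetween {
		c.pruneAuth("clusterrole/edit") // parallel: prune lands mid-test
	}
	ok := c.hasBinding("e2e-test-scc-example", "clusterrole/edit")
	if !pruneInBetween {
		c.pruneAuth("clusterrole/edit") // serial: prune runs after the check
	}
	return ok
}

func main() {
	fmt.Println("parallel, prune interleaved:", bindingSurvives(true))
	fmt.Println("serial, prune afterwards:", bindingSurvives(false))
}
```

In this model the binding is only lost when the prune is interleaved with the other test, which is why running the reapers test serially should avoid the failure.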

Comment 1 Petr Muller 2021-11-23 13:51:10 UTC
Reassigning to Test Framework, who own the OCP test suite.

Comment 3 Derek Higgins 2021-11-30 14:12:01 UTC
Setting to verified based on observed improvements in 4.10 CI results.