Bug 2022797 - e2e-metal-ipi-ovn-ipv6 failing TestAllowedSCCViaRBAC
Summary: e2e-metal-ipi-ovn-ipv6 failing TestAllowedSCCViaRBAC
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Test Infrastructure
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.10.0
Assignee: Derek Higgins
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 2024659
 
Reported: 2021-11-12 15:32 UTC by Derek Higgins
Modified: 2022-03-10 23:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 23:27:06 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift origin pull 26603 | 0 | None | Merged | Bug 2022797: Switch the cluster-role-reapers test to Serial | 2022-03-10 23:26:20 UTC

Description Derek Higgins 2021-11-12 15:32:53 UTC
We're seeing occasional failures of the test
"[sig-auth][Feature:SecurityContextConstraints]  TestAllowedSCCViaRBAC [Suite:openshift/conformance/parallel]"

in the job 4.10-e2e-metal-ipi-ovn-ipv6. The test fails with the error:

[BeforeEach] [sig-auth][Feature:SecurityContextConstraints]
  github.com/openshift/origin/test/extended/util/client.go:116
Nov 11 02:19:52.233: INFO: configPath is now "/tmp/configfile3736488686"
Nov 11 02:19:52.233: INFO: The user is now "e2e-test-scc-w6h76-user"
Nov 11 02:19:52.233: INFO: Creating project "e2e-test-scc-w6h76"
Nov 11 02:19:52.638: INFO: Waiting on permissions in project "e2e-test-scc-w6h76" ...
Nov 11 02:19:52.720: INFO: Waiting for ServiceAccount "default" to be provisioned...
Nov 11 02:19:52.902: INFO: Waiting for ServiceAccount "deployer" to be provisioned...
Nov 11 02:19:53.086: INFO: Waiting for ServiceAccount "builder" to be provisioned...
Nov 11 02:19:53.270: INFO: Waiting for RoleBinding "system:image-pullers" to be provisioned...
Nov 11 02:19:53.433: INFO: Waiting for RoleBinding "system:image-builders" to be provisioned...
Nov 11 02:19:53.597: INFO: Waiting for RoleBinding "system:deployers" to be provisioned...
Nov 11 02:19:54.362: INFO: Project "e2e-test-scc-w6h76" has been fully provisioned.
[It] TestAllowedSCCViaRBAC [Suite:openshift/conformance/parallel]
  github.com/openshift/origin/test/extended/security/scc.go:91
Nov 11 02:19:54.369: INFO: Creating project "e2e-test-scc-8fx79"
Nov 11 02:19:54.861: INFO: Waiting on permissions in project "e2e-test-scc-8fx79" ...
W1111 02:20:12.803936   57093 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostPID=true)
[AfterEach] [sig-auth][Feature:SecurityContextConstraints]
  github.com/openshift/origin/test/extended/util/client.go:140
STEP: Collecting events from namespace "e2e-test-scc-w6h76".
STEP: Found 1 events.
Nov 11 02:20:13.234: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for test3: { } FailedScheduling: 0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Nov 11 02:20:13.319: INFO: POD    NODE  PHASE    GRACE  CONDITIONS
Nov 11 02:20:13.319: INFO: test3        Pending         [{PodScheduled False 0001-01-01 00:00:00 +0000 UTC 2021-11-11 02:20:12 +0000 UTC Unschedulable 0/6 nodes are available: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.}]
Nov 11 02:20:13.319: INFO:
Nov 11 02:20:13.563: INFO: skipping dumping cluster info - cluster too large
Nov 11 02:20:13.903: INFO: Deleted {user.openshift.io/v1, Resource=users  e2e-test-scc-w6h76-user}, err: <nil>

Comment 1 Derek Higgins 2021-11-12 15:35:13 UTC
In the runs that fail, the following line is common:
W1112 12:20:13.512073 1825198 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostPID=true)



The problem with the job appears to come from the test "[sig-cli] oc adm cluster-role-reapers [Suite:openshift/conformance/parallel]".
This prune command https://github.com/openshift/origin/blob/master/test/extended/cli/admin.go#L355
        o.Expect(oc.Run("adm", "prune", "auth").Args("clusterrole/edit").Execute()).To(o.Succeed())
prunes the rolebinding belonging to the other test that is running at the same time. This could be causing more than just the TestAllowedSCCViaRBAC test to fail.
I'm thinking we move it to be a serial test.
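
A minimal sketch of the suspected race, not the actual origin test code. It assumes the prune command drops every binding that references the named cluster role, in every namespace, and that the failing test relies on such a binding. The RoleBinding type, the PruneRoleRef helper, the binding names, and the "e2e-test-other" namespace are all hypothetical. Only the namespace "e2e-test-scc-w6h76" and the role "clusterrole/edit" come from the log and command above.

```go
package main

import "fmt"

// RoleBinding is a simplified, hypothetical stand-in for an RBAC RoleBinding.
type RoleBinding struct {
	Namespace string
	Name      string
	RoleRef   string
}

// PruneRoleRef models the assumed cluster-wide effect of pruning a role:
// every binding that references roleRef is removed, regardless of which
// namespace (and therefore which parallel test) owns it.
func PruneRoleRef(bindings []RoleBinding, roleRef string) (kept, removed []RoleBinding) {
	for _, b := range bindings {
		if b.RoleRef == roleRef {
			removed = append(removed, b)
		} else {
			kept = append(kept, b)
		}
	}
	return kept, removed
}

func main() {
	// State of the cluster while two parallel tests are running.
	bindings := []RoleBinding{
		// Hypothetical binding owned by the SCC test's project.
		{Namespace: "e2e-test-scc-w6h76", Name: "hypothetical-edit-binding", RoleRef: "clusterrole/edit"},
		// Hypothetical unrelated binding that the prune should not touch.
		{Namespace: "e2e-test-other", Name: "hypothetical-view-binding", RoleRef: "clusterrole/view"},
	}

	// The cluster-role-reapers test prunes clusterrole/edit mid-run.
	kept, removed := PruneRoleRef(bindings, "clusterrole/edit")

	for _, b := range removed {
		fmt.Printf("pruned %s/%s (%s)\n", b.Namespace, b.Name, b.RoleRef)
	}
	for _, b := range kept {
		fmt.Printf("kept   %s/%s (%s)\n", b.Namespace, b.Name, b.RoleRef)
	}
}
```

In this model the prune has no notion of which test owns a binding, which is why running it alongside other tests is unsafe and running it serially avoids the interference.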

Comment 2 Derek Higgins 2021-11-18 16:59:11 UTC
Based on anecdotal evidence (recent CI results) this appears to have helped reliability, marking as verified.

Comment 3 W. Trevor King 2022-03-10 23:27:06 UTC
targets 4.10, not attached to errata, seems happy based on comment 2 -> moving to CURRENTRELEASE.

