Bug 1733600

Summary: Failing tests: [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: gce-localssd-scsi-fs] [Serial] [Testpattern: Pre-provisioned PV (default fs)] volumes should be mountable
Product: OpenShift Container Platform
Reporter: Hongkai Liu <hongkliu>
Component: apiserver-auth
Assignee: Stefan Schimanski <sttts>
Status: CLOSED CURRENTRELEASE
QA Contact: Wei Sun <wsun>
Severity: low
Docs Contact:
Priority: unspecified
Version: 4.2.0
CC: eparis, jsafrane, mfojtik, vlaad, xtian
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: buildcop
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-08-25 22:06:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Hongkai Liu 2019-07-26 17:47:39 UTC
Description of problem:
The [sig-storage] in-tree local volume e2e test failed in CI: a `kubectl exec` into the hostexec pod was rejected with "no SecurityContextConstraints found in cluster" (full logs below).
Additional info:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.2/2525

Only saw this once.

Jul 26 16:18:50.442: INFO: namespace e2e-volume-4361 deletion completed in 8.313529351s
Jul 26 16:18:50.451: INFO: Running AfterSuite actions on all nodes
Jul 26 16:18:50.451: INFO: Running AfterSuite actions on node 1
fail [k8s.io/kubernetes/test/e2e/storage/drivers/in_tree.go:1686]: Unexpected error:
    <exec.CodeExitError>: {
        Err: {
            s: "error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-d9y1trpl-ce7d8.origin-ci-int-aws.dev.rhcloud.com:6443 --kubeconfig=/tmp/admin.kubeconfig exec --namespace=e2e-volume-4361 hostexec-ip-10-0-140-56.ec2.internal -- nsenter --mount=/rootfs/proc/1/ns/mnt -- sh -c ls -1 /mnt/disks/by-uuid/google-local-ssds-scsi-fs/ | wc -l] []  <nil>  Error from server (Forbidden): pods \"hostexec-ip-10-0-140-56.ec2.internal\" is forbidden: exec operation is not allowed because the pod's security context exceeds your permissions: pods \"hostexec-ip-10-0-140-56.ec2.internal\" is forbidden: no SecurityContextConstraints found in cluster\n [] <nil> 0xc00246a810 exit status 1 <nil> <nil> true [0xc001b729c0 0xc001b72a00 0xc001b72a18] [0xc001b729c0 0xc001b72a00 0xc001b72a18] [0xc001b729e8 0xc001b72a10] [0x93f720 0x93f720] 0xc001b8d5c0 <nil>}:\nCommand stdout:\n\nstderr:\nError from server (Forbidden): pods \"hostexec-ip-10-0-140-56.ec2.internal\" is forbidden: exec operation is not allowed because the pod's security context exceeds your permissions: pods \"hostexec-ip-10-0-140-56.ec2.internal\" is forbidden: no SecurityContextConstraints found in cluster\n\nerror:\nexit status 1\n",
        },
        Code: 1,
    }
    error running &{/usr/bin/kubectl [kubectl --server=https://api.ci-op-d9y1trpl-ce7d8.origin-ci-int-aws.dev.rhcloud.com:6443 --kubeconfig=/tmp/admin.kubeconfig exec --namespace=e2e-volume-4361 hostexec-ip-10-0-140-56.ec2.internal -- nsenter --mount=/rootfs/proc/1/ns/mnt -- sh -c ls -1 /mnt/disks/by-uuid/google-local-ssds-scsi-fs/ | wc -l] []  <nil>  Error from server (Forbidden): pods "hostexec-ip-10-0-140-56.ec2.internal" is forbidden: exec operation is not allowed because the pod's security context exceeds your permissions: pods "hostexec-ip-10-0-140-56.ec2.internal" is forbidden: no SecurityContextConstraints found in cluster
     [] <nil> 0xc00246a810 exit status 1 <nil> <nil> true [0xc001b729c0 0xc001b72a00 0xc001b72a18] [0xc001b729c0 0xc001b72a00 0xc001b72a18] [0xc001b729e8 0xc001b72a10] [0x93f720 0x93f720] 0xc001b8d5c0 <nil>}:
    Command stdout:
    
    stderr:
    Error from server (Forbidden): pods "hostexec-ip-10-0-140-56.ec2.internal" is forbidden: exec operation is not allowed because the pod's security context exceeds your permissions: pods "hostexec-ip-10-0-140-56.ec2.internal" is forbidden: no SecurityContextConstraints found in cluster
    
    error:
    exit status 1
    
occurred


Jul 26 17:21:08.778 I ns/openshift-machine-config-operator pod/machine-config-daemon-kd75j Created container machine-config-daemon
Jul 26 17:21:08.804 I ns/openshift-machine-config-operator pod/machine-config-daemon-kd75j Started container machine-config-daemon
 Failing tests:
 [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: gce-localssd-scsi-fs] [Serial] [Testpattern: Pre-provisioned PV (default fs)] volumes should be mountable [Suite:openshift/conformance/serial] [Suite:k8s]
 Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20190726-172110.xml
 error: 1 fail, 50 pass, 166 skip (1h2m45s)
2019/07/26 17:21:11 Container test in pod e2e-aws-serial failed, exit code 1, reason Error
2019/07/26 17:27:34 Copied 188.83Mi of artifacts from e2e-aws-serial to /logs/artifacts/e2e-aws-serial
2019/07/26 17:27:40 Ran for 1h39m12s
error: could not run steps: step e2e-aws-serial failed: template pod "e2e-aws-serial" failed: the pod ci-op-d9y1trpl/e2e-aws-serial failed after 1h37m17s (failed containers: test): ContainerFailed one or more containers exited
 Container test exited with code 1, reason Error
---
-config-operator pod/machine-config-daemon-s626m Stopping container machine-config-daemon
Jul 26 17:20:56.039 I ns/openshift-machine-config-operator pod/machine-config-daemon-s626m Marking for deletion Pod openshift-machine-config-operator/machine-config-daemon-s626m
Jul 26 17:20:56.041 I ns/openshift-image-registry pod/node-ca-5t2bq Stopping container node-ca
Jul 26 17:20:56.043 I ns/openshift-image-registry daemonset/node-ca Deleted pod: node-ca-5t2bq
Jul 26 17:20:56.044 I ns/openshift-image-registry pod/node-ca-5t2bq Marking for deletion Pod openshift-image-registry/node-ca-5t2bq
Jul 26 17:21:08.134 W ns/openshift-machine-config-operator pod/machine-config-daemon-s626m node/ip-10-0-158-160.ec2.internal deleted
Jul 26 17:21:08.148 I ns/openshift-machine-config-operator pod/machine-config-daemon-kd75j node/ created
Jul 26 17:21:08.152 I ns/openshift-machine-config-operator daemonset/machine-config-daemon Created pod: machine-config-daemon-kd75j
Jul 26 17:21:08.159 I ns/openshift-machine-config-operator pod/machine-config-daemon-kd75j Successfully assigned openshift-machine-config-operator/machine-config-daemon-kd75j to ip-10-0-158-160.ec2.internal
Jul 26 17:21:08.641 I ns/openshift-machine-config-operator pod/machine-config-daemon-kd75j Container image "registry.svc.ci.openshift.org/ocp/4.2-2019-07-26-154430@sha256:0b9956c2bad93bc30dccdfe9039e2756eba1fe68500334f7873fb752063f265e" already present on machine

Comment 1 Jan Safranek 2019-07-30 13:16:16 UTC
This looks like an SCC admission issue. The cluster did not contain *any* SCCs:

sccs, err := c.sccLister.List(labels.Everything())
if err != nil {
    return nil, "", nil, admission.NewForbidden(a, err)
}
if len(sccs) == 0 {
    return nil, "", nil, admission.NewForbidden(a, fmt.Errorf("no SecurityContextConstraints found in cluster"))
}

https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/openshift-kube-apiserver/admission/security/sccadmission/admission.go#L141
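The gate quoted above can be sketched in isolation (a minimal toy sketch; the `admit` function and the string slice stand in for the real `sccLister` and SCC objects): when the lister returns zero SCCs, the plugin rejects the request unconditionally, which is exactly the Forbidden error in the test output above.

```go
package main

import (
	"errors"
	"fmt"
)

// admit is a toy stand-in for the SCC admission check: an empty SCC
// list causes an unconditional rejection, matching the
// "no SecurityContextConstraints found in cluster" error in the logs.
func admit(sccs []string) error {
	if len(sccs) == 0 {
		return errors.New("no SecurityContextConstraints found in cluster")
	}
	// The real plugin would now try to match the pod against each SCC.
	return nil
}

func main() {
	fmt.Println(admit(nil))
	fmt.Println(admit([]string{"privileged"}))
}
```

So the failure mode here is not a pod whose security context exceeds the caller's permissions, but a cluster state in which the SCC lister momentarily saw no SCCs at all.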

Comment 2 Stefan Schimanski 2019-08-01 09:06:55 UTC
Possibly fixing this: https://github.com/openshift/origin/pull/23532

Comment 3 Chuan Yu 2019-08-27 00:20:47 UTC
Verified.
No such issue has appeared in the jobs over the last 2 weeks.