Bug 1908675

Summary: Reenable [sig-storage] CSI mock volume CSI FSGroupPolicy [LinuxOnly] should modify fsGroup if fsGroupPolicy=default [Suite:openshift/conformance/parallel] [Suite:k8s]
Product: OpenShift Container Platform Reporter: Tomáš Nožička <tnozicka>
Component: Storage Assignee: Hemant Kumar <hekumar>
Storage sub component: Kubernetes External Components QA Contact: Wei Duan <wduan>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: unspecified CC: aos-bugs, hekumar, wduan
Version: 4.7   
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:46:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Tomáš Nožička 2020-12-17 10:23:09 UTC
[sig-storage] CSI mock volume CSI FSGroupPolicy [LinuxOnly] should modify fsGroup if fsGroupPolicy=default [Suite:openshift/conformance/parallel] [Suite:k8s]

is not working on the rebase to kube 1.20 and has been disabled. Please fix the test and reenable it.

Comment 2 Hemant Kumar 2020-12-17 22:59:54 UTC
Nothing stands out as problematic in the e2e test itself. Here is the timeline of events:

1. Test is started: 21:57:56.

Dec 16 21:57:56.352: INFO: About to run a Kube e2e test, ensuring namespace is privileged

2. Mount succeeds on kubelet: 21:58:11.

Dec 16 21:58:11.691073 ci-op-j5f786m2-2a78c-rxlfg-worker-b-9qdvz hyperkube[1532]: I1216 21:58:11.691000    1532 operation_generator.go:672] MountVolume.SetUp succeeded for volume "pvc-1d0f509c-6768-46d9-9bfd-7c3d9630d1e5" (UniqueName: "kubernetes.io/csi/csi-mock-e2e-csi-mock-volumes-7777^4") pod "pvc-volume-tester-z6z25" (UID: "e5ff07c0-b6d7-4b5c-b0fb-c5dffa9bcad2")

3. Container is started for the pod: 21:58:15.

Dec 16 21:58:15.584635 ci-op-j5f786m2-2a78c-rxlfg-worker-b-9qdvz hyperkube[1532]: I1216 21:58:15.584336    1532 kubelet.go:1930] SyncLoop (PLEG): "pvc-volume-tester-z6z25_e2e-csi-mock-volumes-7777(e5ff07c0-b6d7-4b5c-b0fb-c5dffa9bcad2)", event: &pleg.PodLifecycleEvent{ID:"e5ff07c0-b6d7-4b5c-b0fb-c5dffa9bcad2", Type:"ContainerStarted", Data:"439823c1384a4896147aec8bea6620e141c7f60063290d62cd755fce6a1d244d"}

4. Test runs kubectl exec equivalent: 21:58:17.

Dec 16 21:58:17.395: INFO: ExecWithOptions {Command:[/bin/sh -c mkdir /mnt/test/e2e-csi-mock-volumes-7777] Namespace:e2e-csi-mock-volumes-7777 PodName:pvc-volume-tester-z6z25 ContainerName:volume-tester Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}

5. Second kubectl exec equivalent: 21:58:17

Dec 16 21:58:17.677: INFO: ExecWithOptions {Command:[/bin/sh -c echo 'filecontents' > '/mnt/test/e2e-csi-mock-volumes-7777/e2e-csi-mock-volumes-7777'; sync] Namespace:e2e-csi-mock-volumes-7777 PodName:pvc-volume-tester-z6z25 ContainerName:volume-tester Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}

6. Nothing further happens: the previous command hangs, and the test is eventually aborted: 22:12:54.

Dec 16 22:12:54.397: INFO: Running AfterSuite actions on all nodes

It could be a networking issue or something similar. I also could not reproduce the failure in a local environment. I am going to re-enable the test and see whether the failure recurs.

Comment 3 Hemant Kumar 2020-12-17 23:00:14 UTC
Nothing stands out as problematic in the e2e test itself. Here is the timeline of events:

1. Test is started: 21:57:56.

Dec 16 21:57:56.352: INFO: About to run a Kube e2e test, ensuring namespace is privileged

2. Mount succeeds on kubelet: 21:58:11.

Dec 16 21:58:11.691073 ci-op-j5f786m2-2a78c-rxlfg-worker-b-9qdvz hyperkube[1532]: I1216 21:58:11.691000    1532 operation_generator.go:672] MountVolume.SetUp succeeded for volume "pvc-1d0f509c-6768-46d9-9bfd-7c3d9630d1e5" (UniqueName: "kubernetes.io/csi/csi-mock-e2e-csi-mock-volumes-7777^4") pod "pvc-volume-tester-z6z25" (UID: "e5ff07c0-b6d7-4b5c-b0fb-c5dffa9bcad2")

3. Container is started for the pod: 21:58:15.

Dec 16 21:58:15.584635 ci-op-j5f786m2-2a78c-rxlfg-worker-b-9qdvz hyperkube[1532]: I1216 21:58:15.584336    1532 kubelet.go:1930] SyncLoop (PLEG): "pvc-volume-tester-z6z25_e2e-csi-mock-volumes-7777(e5ff07c0-b6d7-4b5c-b0fb-c5dffa9bcad2)", event: &pleg.PodLifecycleEvent{ID:"e5ff07c0-b6d7-4b5c-b0fb-c5dffa9bcad2", Type:"ContainerStarted", Data:"439823c1384a4896147aec8bea6620e141c7f60063290d62cd755fce6a1d244d"}

4. Test runs kubectl exec equivalent: 21:58:17.

Dec 16 21:58:17.395: INFO: ExecWithOptions {Command:[/bin/sh -c mkdir /mnt/test/e2e-csi-mock-volumes-7777] Namespace:e2e-csi-mock-volumes-7777 PodName:pvc-volume-tester-z6z25 ContainerName:volume-tester Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}

5. Second kubectl exec equivalent: 21:58:17

Dec 16 21:58:17.677: INFO: ExecWithOptions {Command:[/bin/sh -c echo 'filecontents' > '/mnt/test/e2e-csi-mock-volumes-7777/e2e-csi-mock-volumes-7777'; sync] Namespace:e2e-csi-mock-volumes-7777 PodName:pvc-volume-tester-z6z25 ContainerName:volume-tester Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}

6. Nothing further happens: the previous command hangs, and the test is eventually aborted: 22:12:54.

Dec 16 22:12:54.397: INFO: Running AfterSuite actions on all nodes

It could be a networking issue or something similar. I also could not reproduce the failure in a local environment (after enabling the test). I am going to re-enable the test in CI and see whether the failure recurs.
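
To put numbers on the timeline above, here is a minimal sketch that computes the gap between consecutive events. It is not part of the original analysis. The event labels and the helper names (EVENTS, parse, gaps) are made up for illustration, and the year 2020 is an assumption taken from the bug's filing date, since the log lines themselves carry no year.

```python
from datetime import datetime, timedelta

# Timestamps copied from the timeline in this comment.
# The labels are illustrative, not taken from the logs.
EVENTS = [
    ("test started", "Dec 16 21:57:56.352"),
    ("MountVolume.SetUp succeeded", "Dec 16 21:58:11.691073"),
    ("container started", "Dec 16 21:58:15.584635"),
    ("first exec (mkdir)", "Dec 16 21:58:17.395"),
    ("second exec (echo; sync)", "Dec 16 21:58:17.677"),
    ("AfterSuite, test aborted", "Dec 16 22:12:54.397"),
]

ASSUMED_YEAR = 2020  # assumption: the logs omit the year


def parse(stamp: str) -> datetime:
    """Parse a 'Dec 16 21:57:56.352' style timestamp (3 or 6 fractional digits)."""
    return datetime.strptime(f"{ASSUMED_YEAR} {stamp}", "%Y %b %d %H:%M:%S.%f")


def gaps(events):
    """Return (event name, time elapsed since the previous event) pairs."""
    times = [(name, parse(stamp)) for name, stamp in events]
    return [(cur[0], cur[1] - prev[1]) for prev, cur in zip(times, times[1:])]


if __name__ == "__main__":
    for name, delta in gaps(EVENTS):
        print(f"{delta}  until: {name}")
```

Everything up to the second exec completes within about 21 seconds of the test start. The last gap, between the second exec and the AfterSuite line, comes to 14 minutes 36.72 seconds, which is the window in which the command appears to hang.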

Comment 8 errata-xmlrpc 2021-02-24 15:46:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633