Bug 1945104

Summary: In k8s 1.21 bump '[sig-storage] [csi-hostpath] [Testpattern: Generic Ephemeral-volume' tests are disabled
Product: OpenShift Container Platform Reporter: Maciej Szulik <maszulik>
Component: Storage Assignee: Jan Safranek <jsafrane>
Storage sub component: Storage QA Contact: Wei Duan <wduan>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: aos-bugs, jsafrane, pprinett
Version: 4.8   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 22:56:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Maciej Szulik 2021-03-31 11:09:11 UTC
In https://github.com/openshift/kubernetes/pull/641, which brings in k8s 1.21, a set of tests matching

`\[sig-storage\].*\[csi-hostpath\] \[Testpattern: Generic Ephemeral-volume`

are disabled.
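
For illustration, here is a minimal sketch of how an exclusion pattern like this selects tests by name. It uses a simplified form of the pattern with the driver tag left out, assumes unanchored matching, and the test names are made up for the example rather than taken from the real suite.

```python
import re

# Simplified form of the exclusion pattern from the description (driver tag omitted).
# The brackets are escaped so they match literally instead of opening a character class.
EXCLUSION = re.compile(r"\[sig-storage\].*\[Testpattern: Generic Ephemeral-volume")


def is_excluded(test_name: str) -> bool:
    """Return True if the pattern matches anywhere in the test name."""
    return EXCLUSION.search(test_name) is not None


# Hypothetical test names, invented for this example.
names = [
    "[sig-storage] CSI Volumes [Testpattern: Generic Ephemeral-volume (default fs)] should create volume",
    "[sig-storage] CSI Volumes [Testpattern: Dynamic PV (default fs)] should provision storage",
    "[sig-apps] Deployment should run the lifecycle of a Deployment",
]

excluded = [name for name in names if is_excluded(name)]
```

Only the first name ends up in `excluded`; the other two lack either the `[sig-storage]` tag or the `Generic Ephemeral-volume` test pattern.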

Comment 1 Matthew Booth 2021-03-31 13:49:08 UTC
These tests were disabled by this commit: https://github.com/soltysh/kubernetes/commit/4236a24a1fe648e7912c982b55174b29effbbce6

This is my first exposure to the rebase process, so please excuse my ignorance. I'm guessing these tests were disabled because they're new in 1.21 and were failing? Is this correct, and if so, are there any logs?

Comment 2 Matthew Booth 2021-04-06 10:02:28 UTC
Context from speaking to Maciej: these tests were disabled after discussion with jsafrane because they were permanently failing in the rebase branch. The agreement was to disable the tests and work to re-enable them as soon as possible after the k8s bump lands. The simplest approach will be to open a PR re-enabling the tests and check what's failing, but we need to wait until the k8s bump lands first.

Comment 3 Matthew Booth 2021-04-06 10:41:22 UTC
Looking at the test failure history here: https://prow.ci.openshift.org/pr-history/?org=openshift&repo=kubernetes&pr=641

It seems the exclusion rule was added in commit 4236a24. It was not present in 41cd155. I'm expecting to be able to find relevant failure logs for the 41cd155 test run.

Comment 4 Matthew Booth 2021-04-06 10:47:04 UTC
(In reply to Matthew Booth from comment #3)
> Looking at the test failure history here:
> https://prow.ci.openshift.org/pr-history/
> ?org=openshift&repo=kubernetes&pr=641
> 
> It seems the exclusion rule was added in commit 4236a24. It was not present
> in 41cd155. I'm expecting to be able to find relevant failure logs for the
> 41cd155 test run.

@jsafrane do you remember which run executes tests matching `\[sig-storage\].*\[csi-hostpath\] \[Testpattern: Generic Ephemeral-volume`?

Incidentally, I don't see any test runs against OpenStack at all there. Do I need to reassign this to another cloud provider?

Comment 5 Matthew Booth 2021-04-06 10:55:32 UTC
Found some test failures here: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_kubernetes/641/pull-ci-openshift-kubernetes-master-e2e-aws-fips/1377194444905779200

I'm reassigning this to 'Storage' because it doesn't look like OpenStack is involved anywhere in these tests.

Comment 6 Jan Safranek 2021-04-15 18:39:40 UTC
The tests fail when KCM creates an ephemeral PVC for a test pod:

  Warning  FailedBinding     4m42s (x17 over 10m)  ephemeral_volume   ephemeral volume my-volume-0: create PVC inline-volume-tester-9bmlv-my-volume-0: persistentvolumeclaims "inline-volume-tester-9bmlv-my-volume-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>


It seems that the controller RBAC rules are wrong ?!

Comment 7 Jan Safranek 2021-04-15 18:52:27 UTC
These permissions need to be added to role system:controller:ephemeral-volume-controller:

- apiGroups:
  - ""
  resources:
  - pods/finalizers
  verbs:
  - update

(i.e. the controller must be able to update the "finalizers" subresource of Pods in order to set blockOwnerDeletion in OwnerReferences that refer to them).
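
To make the reasoning concrete, here is a simplified model of the check behind the "cannot set blockOwnerDeletion" error. It is a sketch, not the actual admission code: the `Rule` type, the `may_block_owner_deletion` helper, and the "before" rule set are hypothetical and only illustrate why the missing `pods/finalizers` rule causes the PVC creation to be rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One flattened RBAC permission: API group, resource (with subresource), verb."""
    api_group: str
    resource: str
    verb: str


def may_block_owner_deletion(rules, owner_group, owner_resource):
    """Simplified model: setting blockOwnerDeletion on an ownerReference
    requires 'update' on the owner's finalizers subresource."""
    needed = Rule(owner_group, f"{owner_resource}/finalizers", "update")
    return needed in rules


# Hypothetical rule set before the fix: the controller can create PVCs
# but has no permission on pods/finalizers.
before = {Rule("", "persistentvolumeclaims", "create")}

# After the fix: the rule from this comment is added.
after = before | {Rule("", "pods/finalizers", "update")}

allowed_before = may_block_owner_deletion(before, "", "pods")
allowed_after = may_block_owner_deletion(after, "", "pods")
```

With the "before" set the check fails, which corresponds to the FailedBinding event above; with the added rule it passes.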

Comment 8 Jan Safranek 2021-04-16 09:11:33 UTC
Filed PR upstream: https://github.com/kubernetes/kubernetes/pull/101186

Comment 11 Maciej Szulik 2021-04-26 10:34:51 UTC
https://github.com/openshift/origin/pull/26054 is the origin k8s bump

Comment 12 Jan Safranek 2021-05-26 13:24:36 UTC
This should be fixed by https://github.com/openshift/origin/pull/26178

Comment 14 Wei Duan 2021-06-01 03:00:35 UTC
Verified on the latest nightly CI. A few cases failed, but not because of the RBAC issue. Marked as VERIFIED.

Comment 17 errata-xmlrpc 2021-07-27 22:56:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438