Bug 1700098
Summary: | NFS tests are failing in baremetal 4.1 clusters | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Hemant Kumar <hekumar> |
Component: | Storage | Assignee: | Bradley Childs <bchilds> |
Status: | CLOSED DUPLICATE | QA Contact: | Wenqi He <wehe> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.1.0 | CC: | aos-bugs, aos-storage-staff, jsafrane |
Target Milestone: | --- | ||
Target Release: | 4.1.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-04-16 15:22:39 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Hemant Kumar
2019-04-15 20:39:50 UTC
There are many NFS tests that passed in the test run, so this is not about missing NFS utils. The kernel logs listed above are IMO harmless. I ran the tests manually on bare metal and they passed.

In addition, the test creates a PVC + PV and checks that they are bound together. NFS is not involved here yet; it would be used later, if they were Bound.

PV and PVC are (from test teardown):

    Apr 15 23:10:25.120: INFO: Deleting PersistentVolumeClaim "pvc-dvmt7"
    Apr 15 23:10:25.142: INFO: Deleting PersistentVolume "nfs-8wxmm"

The controller-manager logs show that the PVC can't find its PV:

    I0415 23:08:42.998947 1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"e2e-tests-pv-4tk65", Name:"pvc-dvmt7", UID:"3b64829e-5fd3-11e9-b3fc-0cc47a18ab96", APIVersion:"v1", ResourceVersion:"61900", FieldPath:""}): type: 'Normal' reason: 'FailedBinding' no persistent volumes available for this claim and no storage class is set

And the PV is not to be found, because it got bound to a PVC from a different test:

    I0415 23:07:28.009082 1 pv_controller.go:874] claim "e2e-tests-statefulset-946xl/datadir-ss-0" bound to volume "nfs-8wxmm"
    I0415 23:07:28.012629 1 pv_controller.go:824] volume "nfs-8wxmm" entered phase "Bound"
    I0415 23:07:28.012648 1 pv_controller.go:963] volume "nfs-8wxmm" bound to claim "e2e-tests-statefulset-946xl/datadir-ss-0"

The test apparently races with the StatefulSet test "[It] should perform rolling updates and roll backs of template modifications with PVCs [Suite:openshift/conformance/parallel] [Suite:k8s]". That one expects that there is a default storage class and that it would get its PV provisioned dynamically. Its PVC steals the PV from the other test instead.

Even if there was a default storage class + dynamic provisioning, there would still be a (short) window of opportunity:

1. NFS test creates PV
2. StatefulSet test creates PVC
3. PV controller sees available PV from 1. and binds it to PVC from 2. instead of dynamic provisioning of a new PV for StatefulSet test.

These two tests should use a different storage class.

> Even if there was a default storage class + dynamic provisioning, there would still be (short) window of opportunity:
>
> 1. NFS test creates PV
> 2. StatefulSet test creates PVC
> 3. PV controller sees available PV from 1. and binds it to PVC from 2. instead of dynamic provisioning of a new PV for StatefulSet test.
>
> These two tests should use a different storage class.

False alarm, they *do* use a different storage class (when there is one). NFS PV tests explicitly set StorageClassName: "" in their PVCs, so they don't get the default one assigned by our default storage class admission plugin:
https://github.com/kubernetes/kubernetes/blob/252cabf155308b43c8c612f482855dc0cfa2e29c/test/e2e/storage/persistent_volumes.go#L140

I think that skipping tests that need a default storage class (bug #1700076) would be enough to also fix these NFS flakes.

*** This bug has been marked as a duplicate of bug 1700076 ***
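
For anyone reading this later, the binding behaviour discussed in the comments can be illustrated with a small toy model. This is only a sketch, not the real PV controller or admission plugin: the `Volume`, `Claim`, `admit`, `bind` and `run` names and the class name "standard" are all hypothetical. It assumes that a claim with no storage class set is matched as class "" and that the default storage class admission step only fills in the class when the field is unset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Volume:
    name: str
    storage_class: str = ""          # "" means the PV has no class
    bound_to: Optional[str] = None   # name of the claim it is bound to


@dataclass
class Claim:
    name: str
    storage_class: Optional[str] = None  # None means the field was left unset


def admit(claim: Claim, default_class: Optional[str]) -> Claim:
    """Toy stand-in for default storage class admission (assumption):
    only claims with an unset class receive the default."""
    if claim.storage_class is None and default_class is not None:
        claim.storage_class = default_class
    return claim


def bind(claim: Claim, volumes: List[Volume]) -> Optional[str]:
    """Bind the claim to the first free pre-created volume of the same class.
    An unset class is treated as "" here (assumption)."""
    wanted = claim.storage_class or ""
    for volume in volumes:
        if volume.bound_to is None and volume.storage_class == wanted:
            volume.bound_to = claim.name
            return volume.name
    return None  # nothing pre-created matches the claim


def run(default_class: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    # 1. NFS test creates its PV (no storage class).
    volumes = [Volume("nfs-8wxmm")]
    # 2. StatefulSet test creates a PVC without naming a class.
    statefulset_claim = admit(Claim("datadir-ss-0"), default_class)
    # The NFS test's PVC explicitly sets the class to "".
    nfs_claim = admit(Claim("pvc-dvmt7", storage_class=""), default_class)
    # 3. The StatefulSet claim is processed first, as in the logs above.
    return bind(statefulset_claim, volumes), bind(nfs_claim, volumes)


if __name__ == "__main__":
    print(run(None))        # no default storage class in the cluster
    print(run("standard"))  # hypothetical default storage class
```

Without a default class, the StatefulSet claim takes "nfs-8wxmm" and the NFS claim is left with nothing to bind to, which matches the FailedBinding event. With a default class, the StatefulSet claim no longer matches the classless PV (in a real cluster it would be provisioned dynamically, which the model does not attempt), and the NFS claim gets its own PV.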