Bug 1821347 - Lots of storage-related test failures in release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2
Keywords:
Status: CLOSED DUPLICATE of bug 1823374
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.2.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-04-06 15:56 UTC by Jonathan Lebon
Modified: 2020-05-11 17:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-11 17:27:27 UTC
Target Upstream Version:
Embargoed:



Description Jonathan Lebon 2020-04-06 15:56:58 UTC
Description of problem:

Seeing many test failures related to storage/filesystems. The following tests have the highest flake ratings:

[sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: blockfs] [Testpattern: Pre-provisioned PV (default fs)] subPath should support readOnly file specified in the volumeMount [Suite:openshift/conformance/parallel] [Suite:k8s]

[sig-storage] PersistentVolumes-local [Volume type: blockfswithformat] Set fsGroup for local volume should set same fsGroup for two pods simultaneously [Suite:openshift/conformance/parallel] [Suite:k8s]

See: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.2-informing#release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2&sort-by-flakiness=

Example jobs:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/352
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/354
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/355
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/356

Many of the failures seem linked to `rm` hitting `EBUSY`. For example:

- rm: cannot remove '/tmp/local-driver-a7012570-763e-11ea-affc-0a58ac105853': Device or resource busy
- rm: cannot remove '/tmp/local-volume-test-e247402f-763f-11ea-9129-0a58ac105853': Device or resource busy
- rm: cannot remove '/tmp/local-driver-43fcbeab-763e-11ea-87fc-0a58ac105853': Device or resource busy
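For triaging further runs, the messages above can be pulled out of a job's build log mechanically. The following is a minimal sketch, not part of the test suite: the function name `find_ebusy_paths` is hypothetical, and it assumes the log quotes paths with plain ASCII single quotes exactly as shown above.

```python
import re

# Matches the rm message format quoted in this report, e.g.
#   rm: cannot remove '/tmp/local-driver-...': Device or resource busy
# Assumption: paths are wrapped in ASCII single quotes, as in the lines above.
EBUSY_RE = re.compile(r"rm: cannot remove '([^']+)': Device or resource busy")


def find_ebusy_paths(log_text):
    """Return the unique paths rm failed to remove with EBUSY, in first-seen order."""
    seen = []
    for match in EBUSY_RE.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen
```

Running this over the build logs of the example jobs would show whether the busy paths are always the `local-driver-*` and `local-volume-test-*` directories under `/tmp`, or whether other paths are affected as well.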

Comment 2 Jonathan Lebon 2020-04-06 16:10:49 UTC
Yes, indeed looks like a dupe of 1820717. Thanks!

*** This bug has been marked as a duplicate of bug 1820717 ***

Comment 3 Miciah Dashiel Butler Masters 2020-04-24 14:54:27 UTC
This report was closed as a duplicate of bug 1820717, which cites this failure:

> [sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Pre-provisioned PV (block volmode)] volumeMode should fail to create pod by failing to mount volume [Suite:openshift/conformance/parallel] [Suite:k8s]

I no longer see that failure in CI.  However, I still see the failures that were reported in comment 0 in these jobs:

> [sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: blockfs] [Testpattern: Pre-provisioned PV (default fs)] subPath should support readOnly file specified in the volumeMount [Suite:openshift/conformance/parallel] [Suite:k8s]
> 
> [sig-storage] PersistentVolumes-local [Volume type: blockfswithformat] Set fsGroup for local volume should set same fsGroup for two pods simultaneously [Suite:openshift/conformance/parallel] [Suite:k8s]

In these recent failures, I still see the following error messages:

    rm: cannot remove '/tmp/local-driver-eabf8dc1-862d-11ea-8344-0a58ac1071e9': Device or resource busy
    rm: cannot remove '/tmp/local-volume-test-eaee3143-862d-11ea-824a-0a58ac1071e9': Device or resource busy
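On Linux, removing a directory fails with `EBUSY` when the directory is still in use as a mount point. Whether that is what happens in these jobs is an assumption; this report does not establish the cause. Under that assumption, a rough sketch for checking a worker node is to look for the path in the contents of `/proc/self/mountinfo`. The helper name `mounts_at_or_under` is hypothetical.

```python
def mounts_at_or_under(mountinfo_text, path):
    """Return mount points equal to `path` or nested below it.

    `mountinfo_text` is the content of /proc/self/mountinfo, where the
    fifth whitespace-separated field of each line is the mount point.
    Note: the kernel escapes spaces in mount points as \\040; this sketch
    does not decode such escapes.
    """
    path = path.rstrip("/") or "/"
    prefix = path if path == "/" else path + "/"
    hits = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip malformed or empty lines
        mount_point = fields[4]
        if mount_point == path or mount_point.startswith(prefix):
            hits.append(mount_point)
    return hits
```

On a node, the input would come from something like `open("/proc/self/mountinfo").read()`. A non-empty result for one of the `/tmp/local-driver-*` or `/tmp/local-volume-test-*` paths would suggest that the test's cleanup ran `rm` before the volume was unmounted.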

These are some recent failures:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/385
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/391
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-rhel7-workers-4.2/392

It appears that the originally reported failures were not in fact the same issue as bug 1820717, so I am reopening this report.

Comment 4 Stephen Cuppett 2020-04-24 15:18:12 UTC
Setting target release to current development version (4.5) for investigation. Where fixes (if any) are required/requested for prior versions, cloned BZs will be created when appropriate.

Comment 6 Ryan Phillips 2020-05-11 17:27:27 UTC
Closing as a dupe, since 4.4 has a fix. Backports to 4.2 are mostly for CVE fixes.

*** This bug has been marked as a duplicate of bug 1823374 ***

