Bug 1690031
| Field | Value |
| --- | --- |
| Summary | Error deleting EBS volume "x" since volume is currently attached to "y" |
| Product | OpenShift Container Platform |
| Component | Storage |
| Storage sub component | Storage |
| Status | CLOSED ERRATA |
| Severity | low |
| Priority | medium |
| Version | 4.1.0 |
| Target Release | 4.4.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Corey Daley <cdaley> |
| Assignee | aos-storage-staff <aos-storage-staff> |
| QA Contact | Qin Ping <piqin> |
| CC | aos-bugs, aos-storage-staff, chaoyang, hekumar, hripps, jokerman, jsafrane, lxia, mmccomas, nagrawal, rgudimet, wking |
| Type | Bug |
| Last Closed | 2020-05-04 11:12:48 UTC |
| Attachments | Occurrences of this error in CI (attachment 1546728) |
Description

Corey Daley, 2019-03-18 15:51:19 UTC
Created attachment 1546728 [details]: Occurrences of this error in CI from 2019-03-19T12:28 to 2019-03-21T14:53Z

Generated with [1]:

    $ deck-build-log-plot 'Error deleting EBS volume .* since volume is currently attached'

This error currently appears in 184 of 816 *-e2e-aws* failures across our whole CI system over the past 48 hours.

[1]: https://github.com/wking/openshift-release/tree/debug-scripts/deck-build-log

Sending to Storage, since they handle EBS attachment and PV management, but I don't think it is the reason for any failure.

I think it is just that the attach/detach controller has not yet removed the EBS volume from the instance. The warnings are unrelated and happen to be interspersed with the test failures. Detach and Delete operations are done asynchronously by two separate controllers, so the message
> Error deleting EBS volume "x" since volume is currently attached to "y"
just means that the PV controller responsible for deleting "x" attempted to do so before the attach-detach controller had successfully detached "x" from "y".
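To make the race concrete, here is a minimal runnable sketch (not the actual controller code; tryDeleteVolume and the hard-coded IDs are hypothetical) of what happens when a delete attempt lands while the volume is still attached: EC2 rejects the DeleteVolume call with a VolumeInUse error, and the deleter treats that as a retryable condition rather than a real failure.

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// tryDeleteVolume is a hypothetical helper: it attempts to delete an EBS
// volume and reports whether the failure is the benign "still attached"
// case that gets retried on the next sync loop.
func tryDeleteVolume(svc *ec2.EC2, volumeID string) (deleted bool, retryable bool, err error) {
	_, err = svc.DeleteVolume(&ec2.DeleteVolumeInput{
		VolumeId: aws.String(volumeID),
	})
	if err != nil {
		if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == "VolumeInUse" {
			// The attach/detach controller has not finished detaching yet.
			// This is the situation behind the bug's warning event.
			return false, true, nil
		}
		return false, false, err
	}
	return true, false, nil
}

func main() {
	svc := ec2.New(session.Must(session.NewSession()))
	deleted, retryable, err := tryDeleteVolume(svc, "vol-0f67031cbcacc5515")
	switch {
	case err != nil:
		fmt.Println("unexpected delete error:", err)
	case retryable:
		fmt.Println("volume still attached; the controller will retry later")
	case deleted:
		fmt.Println("volume deleted")
	}
}
```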
Note also that some of these failing tests don't involve PVs at all, e.g. "ResourceQuota should create a ResourceQuota and capture the life of a pod."
I am not sure what other team would be best equipped to look into these failures.
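For a rough picture of what that ResourceQuota test is doing when it times out, here is a minimal client-go sketch of the status poll; waitForUsedResources, the namespace, and the quota name are hypothetical stand-ins for the e2e helper, not the test's actual code.

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/equality"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// waitForUsedResources polls the quota until the controller manager has
// recomputed .status.used to match the expected value, or the poll times out.
// The quota controller updates status asynchronously, which is why the test
// has to poll instead of checking once.
func waitForUsedResources(cs kubernetes.Interface, ns, name string, expected corev1.ResourceList) error {
	return wait.PollImmediate(2*time.Second, 1*time.Minute, func() (bool, error) {
		rq, err := cs.CoreV1().ResourceQuotas(ns).Get(context.TODO(), name, metav1.GetOptions{})
		if err != nil {
			return false, err
		}
		return equality.Semantic.DeepEqual(rq.Status.Used, expected), nil
	})
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(config)
	expected := corev1.ResourceList{corev1.ResourcePods: resource.MustParse("1")}
	if err := waitForUsedResources(cs, "default", "test-quota", expected); err != nil {
		fmt.Println("quota status never converged:", err)
	}
}
```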
Since the timeout error is not very descriptive, here is what the code says:

https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/test/e2e/scheduling/resource_quota.go#L75

> // wait for resource quota status to show the expected used resources value

https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/test/e2e/scheduling/resource_quota.go#L1513

I don't know what component updates resource quota statuses.

bump. This continues to be a recurring failure in our 4.2 release stream. The failure may be benign, but it makes our error rates noisy and makes it difficult to tell whether we have a stable product. These failures also consume some of our "failed deletion attempt" quota, increasing the chance that AWS throttling causes noticeable issues, but I don't have any specific runs I can link demonstrating that connection.

We've fixed most of the API throttling as part of bug #1698829; it should be much better now. In this bug we focus on the warning event sent by the PV controller:

> W persistentvolume/pvc-4697ff0a-dd8f-44b0-8ce4-26a504215483 Error deleting EBS volume "vol-0f67031cbcacc5515" since volume is currently attached to "i-0ae5f32d0a0298f99"

Moved the corresponding event from Warning to Normal, as it is part of normal operation: https://github.com/kubernetes/kubernetes/pull/86250 (see the sketch at the end of this report).

Created 100 volumes and did not hit this issue. Updating the bug status to verified on:

    version   4.4.0-0.nightly-2020-02-04-171905   True   False   171m   Cluster version is 4.4.0-0.nightly-2020-02-04-171905

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
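Following up on the Warning-to-Normal change linked above: the sketch below shows the general shape of such a change using client-go's EventRecorder (with a FakeRecorder so it runs standalone). It is illustrative only, not the actual PR 86250 diff, and the "VolumeDelete" reason string is an assumption.

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/record"
)

func main() {
	// FakeRecorder buffers events on a channel, so this sketch runs
	// without a live cluster or event broadcaster.
	recorder := record.NewFakeRecorder(10)
	pv := &v1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{Name: "pvc-4697ff0a-dd8f-44b0-8ce4-26a504215483"},
	}

	msg := fmt.Sprintf("Error deleting EBS volume %q since volume is currently attached to %q",
		"vol-0f67031cbcacc5515", "i-0ae5f32d0a0298f99")

	// Before (conceptually): surfaced as a Warning, which made CI runs
	// look unhealthy even though nothing was wrong.
	recorder.Event(pv, v1.EventTypeWarning, "VolumeDelete", msg)

	// After: recorded as Normal, since a delete attempt racing a detach
	// is part of normal asynchronous operation and is simply retried.
	recorder.Event(pv, v1.EventTypeNormal, "VolumeDelete", msg)

	// Drain and print both recorded events to show the severity change.
	for i := 0; i < 2; i++ {
		fmt.Println(<-recorder.Events)
	}
}
```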