Bug 1732193
| Summary: | Master controllers try to detach a removed cinder volume every 2 minutes with "Resource not found" message. | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Daein Park <dapark> |
| Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
| Status: | CLOSED ERRATA | QA Contact: | Chao Yang <chaoyang> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.11.0 | CC: | aos-bugs, aos-storage-staff, jsafrane, mfojtik |
| Target Milestone: | --- | ||
| Target Release: | 4.2.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-10-16 06:30:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
I can see that Cinder detach is not idempotent, it fails when detaching already detached volume. Filed https://github.com/kubernetes/kubernetes/pull/80518 upstream. It is passed on 4.2.0-0.nightly-2019-08-27-072819 1.Create pvc, pod 2.Detach and delete volume from openstack webconsole 3.Delete pod 4.Check logs in the openshift-kube-controller-manager Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922 |
Description of problem: Master controllers pod try to detach a removed cinder volume and it's generating spamming logs about it as follows[0]. The following "pvc-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" PV has already removed in "oc get pv" and "cccccccc-cccc-cccc-cccc-cccccccccccc" cinder volume also does not exist in "openstack volume list". This PV and volume is removed and detached 10 days ago without any errors. [0] master controllers logs ~~~ W0719 16:33:55.048164 1 reconciler.go:231] attacherDetacher.DetachVolume started for volume "pvc-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" (UniqueName: "kubernetes.io/cinder/cccccccc-cccc-cccc-cccc-cccccccccccc") on node "worker01.ocp.example.com" This volume is not safe to detach, but maxWaitForUnmountDuration 6m0s expired, force detaching E0719 16:33:55.126275 1 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/cinder/cccccccc-cccc-cccc-cccc-cccccccccccc\"" failed. No retries permitted until 2019-07-19 16:35:57.126225996 +0900 JST m=+3102241.249061221 (durationBeforeRetry 2m2s). Error: "DetachVolume.Detach failed for volume \"pvc-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa\" (UniqueName: \"kubernetes.io/cinder/cccccccc-cccc-cccc-cccc-cccccccccccc\") on node \"worker01.ocp.example.com\" : error occurred getting volume by ID: cccccccc-cccc-cccc-cccc-cccccccccccc, err: Resource not found" W0719 16:35:57.202790 1 reconciler.go:231] attacherDetacher.DetachVolume started for volume "pvc-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa" (UniqueName: "kubernetes.io/cinder/cccccccc-cccc-cccc-cccc-cccccccccccc") on node "worker01.ocp.example.com" This volume is not safe to detach, but maxWaitForUnmountDuration 6m0s expired, force detaching E0719 16:35:57.728152 1 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/cinder/cccccccc-cccc-cccc-cccc-cccccccccccc\"" failed. No retries permitted until 2019-07-19 16:37:59.72812061 +0900 JST m=+3102363.850955837 (durationBeforeRetry 2m2s). Error: "DetachVolume.Detach failed for volume \"pvc-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa\" (UniqueName: \"kubernetes.io/cinder/cccccccc-cccc-cccc-cccc-cccccccccccc\") on node \"worker01.ocp.example.com\" : error occurred getting volume by ID: cccccccc-cccc-cccc-cccc-cccccccccccc, err: Resource not found" ~~~ This node has been configured with "volumes.kubernetes.io/controller-managed-attach-detach=true" annotation and cloud provider is RHOSP 13. Additionally PV is provided by dynamic provider which is configured with Cinder. Version-Release number of selected component (if applicable): OpenStack RHOSP 13 Openshift v3.11.98 Kubernetes v1.11.0+d4cacc0 How reproducible: This issue can not reproduce, it's occurring on CU's environment. Steps to Reproduce: 1. 2. 3. Actual results: Master controllers try to detach a removed cinder volume through PV and it's generating spamming error messages: "Resource not found". Expected results: No messages and errors to detach a removed volumes. Additional info: All failure-domain.beta.kubernetes.io/region and zone are same in the OCP cluster.