Bug 2079981 - PVs not deleting on azure (or very slow to delete) since CSI migration to azuredisk [NEEDINFO]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.13.0
Assignee: Fabio Bertinatto
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-04-28 16:49 UTC by Dennis Periquet
Modified: 2023-05-17 22:46 UTC
CC: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-05-17 22:46:32 UTC
Target Upstream Version:
Embargoed:
wduan: needinfo? (fbertina)




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2023:1326 0 None None None 2023-05-17 22:46:45 UTC

Description Dennis Periquet 2022-04-28 16:49:50 UTC
Description of problem:

Starting with the 4.11.0-0.nightly-2022-04-27-150207 payload, we are seeing several tests fail like this:

```
[sig-storage] In-tree Volumes [Driver: azure-disk] [Testpattern: Dynamic PV (default fs)] subPath should support readOnly directory specified in the volumeMount [Suite:openshift/conformance/parallel] [Suite:k8s]
...
msg: "persistent Volume pvc-878c9188-d685-4aac-80a6-0880a73cabea not deleted by dynamic provisioner: PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea still exists within 5m0s",
err: { s: "PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea still exists within 5m0s", },
```

Example jobs:

  https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn-upgrade/1519331870091776000
  https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.11-e2e-azure-ovn-upgrade/1519331873451413504

Referring to the first job above, the log shows:

```
Apr 27 17:31:30.776: INFO: Deleting pod "pod-subpath-test-dynamicpv-q5rw" in namespace "e2e-provisioning-2958"
STEP: Deleting pvc
Apr 27 17:31:30.908: INFO: Deleting PersistentVolumeClaim "azure-diskr6dvg"
Apr 27 17:31:30.962: INFO: Waiting up to 5m0s for PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea to get deleted
Apr 27 17:31:31.003: INFO: PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea found and phase=Bound (41.146629ms)
Apr 27 17:31:36.049: INFO: PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea found and phase=Bound (5.087352238s)
...
Apr 27 17:36:23.628: INFO: PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea found and phase=Released (4m52.666461963s)
Apr 27 17:36:28.670: INFO: PersistentVolume pvc-878c9188-d685-4aac-80a6-0880a73cabea found and phase=Released (4m57.70851994s)
STEP: Deleting sc
```

Is this just Azure being Azure (slow)? Since it's failing payloads, is there something we can do in the test to get it to pass consistently?

I don't know if raising timeouts makes sense, given that the job already takes about 3.5 hours, which is close to the 4-hour max job time limit in Prow.
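For local debugging, the bounded wait that the e2e framework performs can be sketched in shell. This is a hypothetical helper, not part of the test suite; the `kubectl` usage at the bottom assumes cluster access, and `wait_for_gone` and `WAIT_INTERVAL` are names invented here:

```shell
#!/bin/sh
# wait_for_gone TIMEOUT_SECONDS CHECK_COMMAND...
# Polls until CHECK_COMMAND fails (i.e. the resource is gone) or the timeout
# expires, mirroring the test's "Waiting up to 5m0s for PersistentVolume ...
# to get deleted" loop, which polls roughly every 5 seconds.
wait_for_gone() {
    timeout_s=$1; shift
    interval_s=${WAIT_INTERVAL:-5}   # poll interval, overridable for testing
    elapsed=0
    while "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            echo "still exists within ${timeout_s}s" >&2
            return 1
        fi
        sleep "$interval_s"
        elapsed=$((elapsed + interval_s))
    done
    echo "gone after ${elapsed}s"
}

# Hypothetical usage against a live cluster (PV name from the failure above):
#   wait_for_gone 300 kubectl get pv pvc-878c9188-d685-4aac-80a6-0880a73cabea
```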
Version-Release number of selected component (if applicable):

How reproducible:
  Reproducible on 3 4.11 nightly payloads so far:
    4.11.0-0.nightly-2022-04-26-220706  <-- failure did not happen here
    4.11.0-0.nightly-2022-04-27-150207  <-- failure started happening here
    4.11.0-0.nightly-2022-04-27-234931
    4.11.0-0.nightly-2022-04-28-055842

Steps to Reproduce:
1. Run the blocking jobs on the 4.11 nightly payloads

Actual results:

Tests fail because PVs are not cleaned up within the timeout


Expected results:

Tests pass

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

It appears (in the log) that the PV was successfully deleted, but well after the test's timeout:

```
17:37:08.422245 1 azure_controller_common.go:365] azureDisk - detach disk(pvc-878c9188-d685-4aac-80a6-0880a73cabea, /subscriptions/72e3a972-58b0-4afc-bd4f-da89b39ccebd/resourceGroups/ci-op-565jn1js-99831-vs422-rg/providers/Microsoft.Compute/disks/pvc-878c9188-d685-4aac-80a6-0880a73cabea) succeeded
17:40:19.581010 1 controller.go:1486] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deleted
17:40:19.594622 1 controller.go:1531] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": persistentvolume deleted
```


From Loki, I pulled all logs that match pvc-878c9188-d685-4aac-80a6-0880a73cabea, then ran this grep to highlight what that PV went through:

```
cat pvc-id-pvc-878c9188-d685-4aac-80a6-0880a73cabea.log |sed 's/{.*}//g'|sed 's/ \+/ /g'| grep --color pvc-878c9188-d685-4aac-80a6-0880a73cabea |grep -e "persistentvolume deleted" -e "begin to create azure disk" -e "successfully created PV" -e "Successfully provisioned volume" -e "volume deletion failed" -e "VolumeFailedDelete" -e "azureDisk - detach disk" -e "volume deleted"
```
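The same filter can be expressed as a single extended regex, which is equivalent to the chain of `-e` patterns above. A sketch, wrapped in a function so the log file is a parameter (`filter_pv_events` is a name invented here):

```shell
# Keep only the lifecycle events for the PV of interest:
# strip JSON blobs and runs of spaces, select lines mentioning the PV,
# then select the provisioning/attach/detach/delete events.
filter_pv_events() {
    sed 's/{.*}//g; s/ \+/ /g' "$1" \
      | grep 'pvc-878c9188-d685-4aac-80a6-0880a73cabea' \
      | grep -E 'persistentvolume deleted|begin to create azure disk|successfully created PV|Successfully provisioned volume|volume deletion failed|VolumeFailedDelete|azureDisk - detach disk|volume deleted'
}

# Hypothetical usage:
#   filter_pv_events pvc-id-pvc-878c9188-d685-4aac-80a6-0880a73cabea.log
```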


```
2022-04-27T17:26:44Z I0427 17:26:44.180486 1 controllerserver.go:170] begin to create azure disk(pvc-878c9188-d685-4aac-80a6-0880a73cabea) account type(StandardSSD_LRS) rg(ci-op-565jn1js-99831-vs422-rg) location(centralus) size(1) diskZone(centralus-3) maxShares(0)
2022-04-27T17:26:46Z I0427 17:26:46.552140 1 controller.go:858] successfully created PV pvc-878c9188-d685-4aac-80a6-0880a73cabea for PVC azure-diskr6dvg and csi volume name /subscriptions/72e3a972-58b0-4afc-bd4f-da89b39ccebd/resourceGroups/ci-op-565jn1js-99831-vs422-rg/providers/Microsoft.Compute/disks/pvc-878c9188-d685-4aac-80a6-0880a73cabea
2022-04-27T17:26:46Z ): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-878c9188-d685-4aac-80a6-0880a73cabea
2022-04-27T17:31:43Z E0427 17:31:43.332116 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:43Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:44Z E0427 17:31:44.332831 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:44Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:46Z E0427 17:31:46.334087 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:46Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:50Z E0427 17:31:50.334962 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:50Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:58Z E0427 17:31:58.335321 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:31:58Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:32:14Z E0427 17:32:14.336346 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:32:14Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:32:46Z E0427 17:32:46.336752 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:32:46Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:33:50Z E0427 17:33:50.337233 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:33:50Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:35:58Z E0427 17:35:58.337514 1 controller.go:1481] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deletion failed: persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:35:58Z ): type: 'Warning' reason: 'VolumeFailedDelete' persistentvolume pvc-878c9188-d685-4aac-80a6-0880a73cabea is still attached to node ci-op-565jn1js-99831-vs422-worker-centralus3-z67fj
2022-04-27T17:36:52Z I0427 17:36:52.902091 1 azure_controller_standard.go:154] azureDisk - detach disk: name pvc-878c9188-d685-4aac-80a6-0880a73cabea uri /subscriptions/72e3a972-58b0-4afc-bd4f-da89b39ccebd/resourcegroups/ci-op-565jn1js-99831-vs422-rg/providers/microsoft.compute/disks/pvc-878c9188-d685-4aac-80a6-0880a73cabea
2022-04-27T17:37:08Z I0427 17:37:08.422245 1 azure_controller_common.go:365] azureDisk - detach disk(pvc-878c9188-d685-4aac-80a6-0880a73cabea, /subscriptions/72e3a972-58b0-4afc-bd4f-da89b39ccebd/resourceGroups/ci-op-565jn1js-99831-vs422-rg/providers/Microsoft.Compute/disks/pvc-878c9188-d685-4aac-80a6-0880a73cabea) succeeded
2022-04-27T17:40:19Z I0427 17:40:19.581010 1 controller.go:1486] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": volume deleted
2022-04-27T17:40:19Z I0427 17:40:19.594622 1 controller.go:1531] delete "pvc-878c9188-d685-4aac-80a6-0880a73cabea": persistentvolume deleted
```
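Loki can return the same controller log line once per stream, so a raw pull repeats each event several times. Exact duplicates can be collapsed while preserving first-seen (chronological) order with a small awk sketch (`dedupe_log` is a name invented here; the filename in the usage comment is the one pulled from Loki):

```shell
# Print each line only the first time it is seen; unlike `sort -u`, this
# keeps the original order of the log instead of re-sorting it.
dedupe_log() {
    awk '!seen[$0]++' "$1"
}

# Hypothetical usage:
#   dedupe_log pvc-id-pvc-878c9188-d685-4aac-80a6-0880a73cabea.log
```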

Comment 1 Fabio Bertinatto 2022-07-08 13:27:24 UTC
According to this search result [0], this issue is no longer happening in 4.11, most likely due to the timeout bump done here [1].

That PR hasn't been ported to 4.10; as a result, this issue is still happening there [3]. I requested a backport here [4].

We know for sure that Azure Disk can be very slow at times, so I believe bumping the timeouts is the only thing we can do.

[0] https://search.ci.openshift.org/?search=persistent+Volume.*not+deleted+by+dynamic+provisioner%3A+PersistentVolume&maxAge=336h&context=1&type=bug%2Bissue%2Bjunit&name=4.11.*csi&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
[1] https://github.com/openshift/azure-disk-csi-driver-operator/pull/45
[3] https://search.ci.openshift.org/?search=persistent+Volume.*not+deleted+by+dynamic+provisioner%3A+PersistentVolume&maxAge=336h&context=1&type=bug%2Bissue%2Bjunit&name=4.10.*csi&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
[4] https://bugzilla.redhat.com/show_bug.cgi?id=2062152

Comment 2 Fabio Bertinatto 2022-07-08 13:30:00 UTC
Just to clarify, in comment #1 I'm talking about _CSI_ jobs.

The timeout bump has been done in the 4.11 CSI jobs, which solved the issue.

The next step for this ticket is to do the same for in-tree jobs.

Comment 3 Fabio Bertinatto 2022-07-08 17:18:43 UTC
Upstream PR: https://github.com/kubernetes/kubernetes/pull/111034

Once that merges, we need to backport it to OCP.

Comment 4 Fabio Bertinatto 2022-07-15 19:01:23 UTC
Backport PR: https://github.com/openshift/kubernetes/pull/1324

Comment 6 Fabio Bertinatto 2022-11-08 16:57:43 UTC
The kube v1.25.0 rebase brings in a few patches that should improve the situation for both in-tree and CSI jobs.

This fix properly increases timeouts for Azure Disk *in-tree* tests:

https://github.com/kubernetes/kubernetes/pull/113208/files

And these two replace some hardcoded timeout values with the configurable custom timeouts. This will affect both *in-tree* and *CSI* Azure Disk jobs:

https://github.com/kubernetes/kubernetes/pull/112074/files
https://github.com/kubernetes/kubernetes/pull/109342/files

The kube rebase has landed and been reverted a few times already. Once it lands for good and CI jobs pick up the patches above, we should be able to better assess the improvements.

Comment 7 Fabio Bertinatto 2022-11-08 18:21:07 UTC
Moving to MODIFIED because all of the improvements above have been available in openshift/origin since the v1.25 rebase landed [1].

[1] https://github.com/openshift/origin/pull/27526

Comment 9 Wei Duan 2022-11-09 10:29:46 UTC
Hi Fabio, https://github.com/openshift/origin/pull/27526 was merged to master two days ago, so I understand this fix is for 4.13, right?

Comment 15 errata-xmlrpc 2023-05-17 22:46:32 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.13.0 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:1326

