Bug 2062152

Summary: Azure CI can't provision volumes in parallel
Product: OpenShift Container Platform Reporter: Jan Safranek <jsafrane>
Component: Storage Assignee: Fabio Bertinatto <fbertina>
Storage sub component: Kubernetes External Components QA Contact: Wei Duan <wduan>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: aos-bugs, dperique
Version: 4.10 Keywords: Rebase
Target Milestone: ---   
Target Release: 4.11.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-09-07 20:49:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2105304    

Description Jan Safranek 2022-03-09 10:07:13 UTC
Description of problem:

The following tests fail in 4.11 CI:

External Storage [Driver: disk.csi.azure.com] [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with pvc data source in parallel [Slow]
External Storage [Driver: disk.csi.azure.com] [Testpattern: Dynamic PV (block volmode)] provisioning should provision storage with pvc data source in parallel [Slow]

Events in the test namespace suggest that there might be something wrong with CSI idempotency: the first few provisioning attempts time out, and then the driver reports that an operation for the volume already exists.

ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = DeadlineExceeded desc = context deadline exceeded
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = DeadlineExceeded desc = context deadline exceeded
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = DeadlineExceeded desc = context deadline exceeded
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = DeadlineExceeded desc = context deadline exceeded
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = DeadlineExceeded desc = context deadline exceeded
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = Aborted desc = An operation with the given Volume ID pvc-7036e240-36f6-4a1b-b4bf-827c8e0d2d5a already exists
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = Aborted desc = An operation with the given Volume ID pvc-022fd7e2-377f-4bb8-891a-ac1d29bcae6b already exists
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = Aborted desc = An operation with the given Volume ID pvc-ae2482ad-ff83-43db-a7ec-002b083a415f already exists
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = Aborted desc = An operation with the given Volume ID pvc-7ed1f60f-fd98-434a-af94-7bf0d0b95ac7 already exists
ProvisioningFailed: failed to provision volume with StorageClass "e2e-provisioning-5808-e2e-sc8rlfx": rpc error: code = Aborted desc = An operation with the given Volume ID pvc-c8aa9145-352f-45fd-9435-09b7baac3df7 already exists

Comment 2 Jan Safranek 2022-03-09 12:24:14 UTC
I checked the sources: the driver is idempotent, it just takes more than 5 minutes to provision a few volumes in parallel.
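
For illustration, the event sequence in the description (several DeadlineExceeded errors followed by Aborted) is what one would expect from an idempotent driver that tracks in-flight operations by volume name while the backend is slow. The following is a minimal toy model, not the Azure Disk CSI driver's code: the class and function names (ToyDriver, begin_create, finish_create, simulate) are hypothetical, and only the error message text is taken from the events above.

```python
class Aborted(Exception):
    """Stands in for the gRPC Aborted status code."""


class ToyDriver:
    """Hypothetical model of a driver that tracks in-flight creates by name."""

    def __init__(self):
        self.in_flight = set()   # volume names with a create still running
        self.volumes = {}        # volume name -> provisioned disk id

    def begin_create(self, name):
        # Idempotent path: the volume already exists, return the same disk.
        if name in self.volumes:
            return self.volumes[name]
        # A previous call for the same name has not finished yet.
        if name in self.in_flight:
            raise Aborted(
                f"An operation with the given Volume ID {name} already exists"
            )
        self.in_flight.add(name)
        return None  # the slow backend operation has now started

    def finish_create(self, name):
        # Called when the slow backend operation finally completes.
        self.in_flight.discard(name)
        self.volumes[name] = f"disk-{name}"
        return self.volumes[name]


def simulate(name):
    """Replay: slow create, caller timeout, retry, completion, retry."""
    driver = ToyDriver()
    events = []
    driver.begin_create(name)            # first call starts the slow operation
    events.append("DeadlineExceeded")    # caller stops waiting; backend keeps working
    try:
        driver.begin_create(name)        # retry while the operation is in flight
    except Aborted:
        events.append("Aborted")
    disk = driver.finish_create(name)    # backend finishes after the deadline
    if driver.begin_create(name) == disk:
        events.append("Provisioned")     # a later retry returns the same disk
    return events
```

Under these assumptions, simulate("pvc-demo") yields a timeout, then an Aborted retry, then a successful retry that returns the same disk. In this model the Aborted error means the original request is still running, not that idempotency is broken.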

Comment 3 Jan Safranek 2022-03-09 12:26:43 UTC
I'm keeping the BZ open to skip the "parallel" tests.

Comment 8 Fabio Bertinatto 2022-07-15 18:15:09 UTC
The PR above seems to have fixed the issue in 4.11 CSI jobs (as intended), but I still see failures in Azure StackHub jobs:

https://search.ci.openshift.org/?search=provisioning+should+provision+storage+with+pvc+data+source+in+parallel&maxAge=48h&context=1&type=bug%2Bissue%2Bjunit&name=azure&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

Comment 11 Wei Duan 2022-08-24 06:19:10 UTC
Checked that the two test cases are disabled in the Azure Stack CSI CI for multiple releases. Marking it as Verified.

Comment 14 errata-xmlrpc 2022-09-07 20:49:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.11.3 packages and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6287