Bug 1637787 - Another transaction in progress seen in heketi logs while deleting pvcs.
Summary: Another transaction in progress seen in heketi logs while deleting pvcs.
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: heketi
Version: ocs-3.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: John Mulligan
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On: 1636872
Blocks:
Reported: 2018-10-10 06:18 UTC by RamaKasturi
Modified: 2019-01-23 21:30 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-23 21:30:56 UTC
Embargoed:


Attachments
createLargeScaleTest.sh (409 bytes, application/x-shellscript)
2018-10-10 06:18 UTC, RamaKasturi

Description RamaKasturi 2018-10-10 06:18:44 UTC
Created attachment 1492345
createLargeScaleTest.sh

Description of problem:
I used the attached script (createLargeScaleTest.sh), in which a mongodb pod is created and a PVC is bound to it as soon as it is created. Once the PVCs were bound successfully, I used another attached script (deleteLargeScaleTest.sh) to delete the dc, service, and PVC that had been created. All the PVCs get deleted, but when I run oc get pv, all of the PVs are in Failed state.

oc get pvc :
=================
No resources found

oc get pv:
====================
oc get pv | grep Failed | wc -l
950


Version-Release number of selected component (if applicable):
[root@dhcp46-231 ~]# oc version
oc v3.11.20
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
heketi-client-7.0.0-14.el7rhgs.x86_64


How reproducible:
Hit it once

Steps to Reproduce:
1. Run the attached script createLargeScaleTest.sh and wait for all the PVCs to be bound.
2. Run the attached script deleteLargeScaleTest.sh to delete all the resources created by the above test.

Actual results:
There are 950 PVs in Failed state and they do not get deleted. In the heketi logs I see the error "Another transaction in progress".

Expected results:
All the PVs should get deleted.

Additional info:

Comment 2 RamaKasturi 2018-10-10 06:20:19 UTC
heketi db dump is copied to the location below.

http://rhsqe-repo.lab.eng.blr.redhat.com/cns/bugs/BZ-1600042/db_dump_oct8.txt

heketi log is copied to the location below.

http://rhsqe-repo.lab.eng.blr.redhat.com/cns/bugs/BZ-1600042/heketi_delete_oct8.log

Comment 4 Niels de Vos 2018-10-15 13:19:08 UTC
As mentioned in bug 1636872, a delay is required between volume operations. It is important to know whether deleting the volumes with a 2-second delay is more stable.
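
A minimal sketch of deleting with a pause between requests, assuming the PVCs are removed one at a time from a shell script. The function name delete_with_delay and the OC override variable are hypothetical and are not taken from the attached scripts; only the 2-second figure comes from the comment above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the attached scripts).
# Usage: delete_with_delay DELAY_SECONDS RESOURCE [RESOURCE...]
# Deletes each resource in turn and sleeps DELAY_SECONDS after every request,
# so that volume deletions are spaced out instead of arriving all at once.

OC="${OC:-oc}"   # assumption: the client command can be overridden for dry runs

delete_with_delay() {
    local delay="$1"
    shift
    local resource
    for resource in "$@"; do
        # RESOURCE is expected in TYPE/NAME form, e.g. persistentvolumeclaim/foo
        "$OC" delete "$resource"
        sleep "$delay"
    done
}
```

It could be invoked as `delete_with_delay 2 $(oc get pvc -o name)`, which passes every PVC in the current project in TYPE/NAME form. Whether 2 seconds is sufficient is exactly the open question in this bug.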

Alternatively, we may add a retry of the operations when the gluster CLI times out or returns an "another transaction is in progress" error.
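
The retry idea can be sketched as a shell wrapper. This is only an illustration under stated assumptions, not how heketi implements or would implement it: the function name retry_on_transaction_error is hypothetical, the match on the text "another transaction" is an assumption based on the error quoted in this bug, and the timeout case mentioned above is not handled.

```shell
#!/bin/bash
# Hypothetical helper: rerun a command while its output reports a transaction lock.
# Usage: retry_on_transaction_error MAX_ATTEMPTS DELAY_SECONDS CMD [ARGS...]
retry_on_transaction_error() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1 output status
    while :; do
        # Capture stdout and stderr together so the error text can be inspected.
        output="$("$@" 2>&1)"
        status=$?
        printf '%s\n' "$output"
        # Success, or a failure unrelated to the lock: stop and report the status.
        if [ "$status" -eq 0 ] || ! printf '%s' "$output" | grep -qi 'another transaction'; then
            return "$status"
        fi
        # Lock error: give up once the attempt budget is spent.
        if [ "$attempt" -ge "$max_attempts" ]; then
            return "$status"
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

For example, `retry_on_transaction_error 5 2 gluster --mode=script volume delete VOLNAME` would try the command up to five times, two seconds apart, where VOLNAME is a placeholder. Other failures are returned immediately rather than retried, so genuine errors are not masked.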

