Bug 1637787
| Summary: | Another transaction in progress seen in heketi logs while deleting pvcs. | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | RamaKasturi <knarra> |
| Component: | heketi | Assignee: | John Mulligan <jmulligan> |
| Status: | CLOSED NEXTRELEASE | QA Contact: | RamaKasturi <knarra> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | ocs-3.11 | CC: | hchiramm, jmulligan, kramdoss, madam, rhs-bugs, rtalur, sankarshan, storage-qa-internal |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-01-23 21:30:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1636872 | | |
| Bug Blocks: | | | |
| Attachments: | | | |

The heketi db dump is copied to the location below:
http://rhsqe-repo.lab.eng.blr.redhat.com/cns/bugs/BZ-1600042/db_dump_oct8.txt

The heketi log is copied to the location below:
http://rhsqe-repo.lab.eng.blr.redhat.com/cns/bugs/BZ-1600042/heketi_delete_oct8.log

As mentioned in bug 1636872, a delay is required between volume operations. It is important to know whether deleting the volumes with a 2-second delay is more stable. Alternatively, we may add a retry of the operations when the gluster CLI times out or returns an "another transaction is in progress" error.
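
The retry idea in the comment above could look roughly like the following. This is only an illustrative sketch in Python, not heketi's actual code (heketi is written in Go) and not the fix that was eventually shipped. The names `run_with_retry`, `is_transient`, and `TRANSIENT_MARKERS` are hypothetical, and the substrings used to recognise a transient failure are assumptions based on the error text quoted in this bug.

```python
import time

# Assumed substrings that mark a failure as transient. The exact wording of
# the gluster CLI messages may differ; these are taken from this bug's text.
TRANSIENT_MARKERS = ("another transaction", "timed out")


def is_transient(message):
    """Return True if the error text looks like a lock conflict or a timeout."""
    lowered = message.lower()
    return any(marker in lowered for marker in TRANSIENT_MARKERS)


def run_with_retry(operation, attempts=5, delay=2.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are used up.

    operation is a hypothetical callable returning (ok, message).
    Transient failures are retried after `delay` seconds (2 seconds by
    default, the delay discussed in the comment); any other failure is
    raised immediately. Returns the number of attempts that were needed.
    """
    last_message = ""
    for attempt in range(1, attempts + 1):
        ok, message = operation()
        if ok:
            return attempt
        last_message = message
        if not is_transient(message):
            raise RuntimeError(message)
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError(
        "gave up after {} attempts: {}".format(attempts, last_message)
    )
```

A caller would wrap each volume delete in a function that returns `(ok, message)` and pass it to `run_with_retry`. The `sleep` parameter is injectable only so the pacing can be exercised in tests without actually waiting.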

Created attachment 1492345
createLargeScaleTest.sh

Description of problem:
I used the attached script (createLargeScaleTest.sh), which creates a mongodb pod and binds a pvc to it as soon as it is created. Once the pvcs are bound successfully, I use another attached script (deleteLargeScaleTest.sh) to delete the dc, service and pvc that were created. All the pvcs get deleted, but when I run oc get pv, all of the pvs are in Failed state.

oc get pvc:
=================
No resources found

oc get pv:
====================
oc get pv | grep Failed | wc -l
950

Version-Release number of selected component (if applicable):
[root@dhcp46-231 ~]# oc version
oc v3.11.20
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

heketi-client-7.0.0-14.el7rhgs.x86_64

How reproducible:
Hit it once

Steps to Reproduce:
1. Run the attached script createLargeScaleTest.sh and wait for all the pvcs to be bound.
2. Run the attached script deleteLargeScaleTest.sh to delete all the resources created as part of the above test.

Actual results:
There are 950 pvs in Failed state and they do not get deleted. The heketi logs show the error "Another transaction in progress".

Expected results:
All the pvs should get deleted.

Additional info:
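
The failed-PV count in the description comes from grepping the table output of `oc get pv`. As a hedged alternative for re-checking the count, the sketch below tallies PersistentVolume phases from the JSON form of the same listing, reading each item's `status.phase` field. The function name `count_pv_phases` is hypothetical, and the sketch assumes the input is the text produced by `oc get pv -o json`.

```python
import json
from collections import Counter


def count_pv_phases(pv_list_json):
    """Tally PersistentVolume phases from a JSON list document.

    pv_list_json is assumed to be the text printed by `oc get pv -o json`,
    i.e. an object with an "items" array whose entries carry status.phase.
    Entries without a phase are counted under "Unknown".
    """
    document = json.loads(pv_list_json)
    return Counter(
        item.get("status", {}).get("phase", "Unknown")
        for item in document.get("items", [])
    )
```

For example, the output of `oc get pv -o json` could be saved to a file, read back, and passed in; `count_pv_phases(text)["Failed"]` then plays the role of `grep Failed | wc -l` without also matching the word "Failed" in other columns.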