Bug 1274074 - LockAcquisitionException when binding to pools
Status: CLOSED CURRENTRELEASE
Product: Candlepin
Classification: Community
Component: candlepin
Version: 0.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Filip Nguyen
QA Contact: Katello QA List
Keywords: Triaged
Blocks: 1300079
Reported: 2015-10-21 16:22 EDT by Jonathon Turel
Modified: 2016-06-15 09:54 EDT
Doc Type: Bug Fix
Clones: 1300079
Last Closed: 2016-06-15 09:54:36 EDT
Type: Bug
Comment 10 Dana Safford 2015-11-03 17:49:39 EST
The underlying situation is now on the Corporate Escalation List. I set the customer escalation flag on this BZ.

We will need more attention on this BZ. Please increase its priority.

Thanks,
Comment 11 Gary Lamb 2015-11-03 18:12:38 EST
Hi Filip - Could you please add me to the Service Desk Incident? Perhaps we can raise its priority in the queue to help unblock you.
Comment 18 Alex Wood 2015-11-16 13:27:26 EST
We have solved the deadlock issue; however, fixing it exposes a second problem: lock waits timing out. In MySQL, the default lock wait timeout is 50s. Depending on the amount of data and the load on some of our services, binds can take anywhere from 5s to 30s. The portal enqueues all of the bind jobs at the same time, so they run more or less simultaneously. If a user asks for 6 binds and each bind takes 10s, the jobs queue up behind one another on the lock, and the last one waits about 50s and times out.
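
The arithmetic in the 6-bind example can be checked with a small model. This is only a sketch, not Candlepin code: it assumes every bind for a consumer contends for the same lock, that the jobs all start at the same moment and take the lock one after another, and that a wait of exactly the timeout value counts as a timeout. The function name `simulate_lock_waits` is made up for illustration.

```python
def simulate_lock_waits(bind_seconds, lock_wait_timeout=50):
    """Model bind jobs that start together and serialize on one lock.

    Returns one (seconds_waited, outcome) tuple per job, in the order the
    jobs would acquire the lock. Assumption: a job that has waited
    lock_wait_timeout seconds gives up and never holds the lock.
    """
    lock_free_at = 0  # time at which the lock is next released
    results = []
    for duration in bind_seconds:
        if lock_free_at >= lock_wait_timeout:
            # The lock will not be free before this job's wait expires.
            results.append((lock_wait_timeout, "timed out"))
        else:
            # The job waits for the earlier holders, then holds the lock itself.
            results.append((lock_free_at, "ok"))
            lock_free_at += duration
    return results


if __name__ == "__main__":
    for waited, outcome in simulate_lock_waits([10] * 6):
        print(f"waited {waited:>2}s: {outcome}")
```

Under these assumptions the first five jobs wait 0s, 10s, 20s, 30s and 40s, and the sixth reaches the 50s limit. With 30s binds, the third job would already time out.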

There are a few solutions:
1. Decrease bind time dramatically.
2. Only enqueue 2 or 3 jobs per consumer at a time.
3. Do not perform async binds at all.

I am currently exploring option 1 as I think it results in the best user experience.  Option 2 is definitely workable as well.  Option 3 will suffice as a workaround.

We're working on getting option 3 in place while writing code for a longer term fix.
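
Option 2 could look roughly like the sketch below, which caps how many bind jobs run at once for a single consumer. It is an illustration only and does not reflect how Candlepin or the portal schedule jobs; the class `PerConsumerLimiter`, the `demo` helper, the consumer id, and the fake bind job are all hypothetical.

```python
import threading
import time
from collections import defaultdict


class PerConsumerLimiter:
    """Caps how many jobs may run at once for any single consumer."""

    def __init__(self, max_in_flight=2):
        self._max = max_in_flight
        self._guard = threading.Lock()   # protects the dictionaries below
        self._slots = {}                 # consumer id -> semaphore
        self.in_flight = defaultdict(int)
        self.peak = defaultdict(int)     # highest concurrency observed

    def _slot_for(self, consumer_id):
        with self._guard:
            if consumer_id not in self._slots:
                self._slots[consumer_id] = threading.BoundedSemaphore(self._max)
            return self._slots[consumer_id]

    def run(self, consumer_id, job):
        # Block until one of this consumer's slots is free, then run the job.
        with self._slot_for(consumer_id):
            with self._guard:
                self.in_flight[consumer_id] += 1
                self.peak[consumer_id] = max(
                    self.peak[consumer_id], self.in_flight[consumer_id]
                )
            try:
                return job()
            finally:
                with self._guard:
                    self.in_flight[consumer_id] -= 1


def demo(num_jobs=6, max_in_flight=2):
    """Run num_jobs fake binds for one consumer; return (completed, peak)."""
    limiter = PerConsumerLimiter(max_in_flight)
    completed = []

    def fake_bind():
        time.sleep(0.01)  # stand-in for a real bind
        completed.append(1)

    threads = [
        threading.Thread(target=limiter.run, args=("consumer-1", fake_bind))
        for _ in range(num_jobs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(completed), limiter.peak["consumer-1"]


if __name__ == "__main__":
    print(demo())
```

With a cap of 2 or 3 and binds of around 10s, a waiting job only sits behind one or two others, which keeps the wait well under the 50s timeout. The cost is that a large batch of binds takes longer to finish overall.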
Comment 23 Filip Nguyen 2016-05-19 07:00:23 EDT
This was fixed in master with 

https://github.com/candlepin/candlepin/pull/1015

In 0.9.51 this was fixed with

https://github.com/candlepin/candlepin/pull/1031
Comment 24 Barnaby Court 2016-06-15 09:54:36 EDT
Moving to closed as the patches have been merged & built
