Bug 1394470

Summary: keystone does not retry on deadlock Transactions [500 Error]
Product: Red Hat OpenStack
Reporter: Attila Fazekas <afazekas>
Component: openstack-keystone
Assignee: Adam Young <ayoung>
Status: CLOSED ERRATA
QA Contact: nlevinki <nlevinki>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: jdennis, nkinder, rduartes, sclewis, srevivo
Target Milestone: Upstream M2
Keywords: Triaged
Target Release: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-05-17 19:46:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
keystone_exception.txt none

Description Attila Fazekas 2016-11-12 11:52:42 UTC
Created attachment 1219990: keystone_exception.txt

Description of problem:
DBDeadlock: (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction')

The above error is retryable, but there is no evidence that keystone actually retried the transaction before returning a 500.
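
For illustration, the retry behaviour expected here can be sketched in plain Python. This is a minimal sketch under stated assumptions, not keystone's code and not the upstream fix: `DBDeadlock` below is a local stand-in class, and `retry_on_deadlock`, `flaky_transaction` and the backoff values are hypothetical names chosen for the example. oslo.db ships a decorator of this kind (`oslo_db.api.wrap_db_retry`), which is not used here so that the sketch stays self-contained.

```python
import functools
import random
import time


class DBDeadlock(Exception):
    """Hypothetical stand-in for the DBDeadlock error quoted above."""


def retry_on_deadlock(max_retries=5, base_delay=0.05):
    """Re-run the wrapped function when it raises DBDeadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise  # out of retries: let the caller see the error
                    # Exponential backoff with jitter so that competing
                    # transactions do not collide again in lockstep.
                    time.sleep(base_delay * (2 ** attempt) * random.random())
        return wrapper
    return decorator


calls = {"count": 0}


@retry_on_deadlock(max_retries=3, base_delay=0.001)
def flaky_transaction():
    """Simulated transaction that deadlocks twice, then commits."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise DBDeadlock("1213: Deadlock found when trying to get lock")
    return "committed"


result = flaky_transaction()
```

With a wrapper like this, only a deadlock that survives every retry would reach the API layer and surface as a 500.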

Version-Release number of selected component (if applicable):
python-keystone.noarch              1:10.0.0-3.el7ost  @rhos-10.0-puddle        
python-keystoneauth1.noarch         2.12.2-1.el7ost    @rhos-10.0-puddle        
python-keystoneclient.noarch        1:3.5.0-1.el7ost   @rhos-10.0-puddle        
python-keystonemiddleware.noarch    4.9.0-1.el7ost     @rhos-10.0-puddle  


How reproducible:
Unknown

Comment 1 Adam Young 2016-12-08 16:52:39 UTC
How can I reproduce this?

Comment 3 Adam Young 2016-12-08 17:34:21 UTC
Looks like it was fixed upstream. Changing the external bug to link to the older discussions.

Comment 4 Adam Young 2016-12-08 17:38:52 UTC
Should be fixed in all versions of OSP 11.

Comment 7 Rodrigo Duarte 2017-02-13 19:25:18 UTC
Verified for openstack-keystone-11.0.0-0.20170127043446.cefbc3c.el7ost.noarch

When using a MariaDB Galera cluster, the error happens due to a race condition [1]. To try to reproduce it, I ran a high-concurrency Rally scenario that creates and deletes several roles and projects, which triggers several revocation events. No error was reported by the test or found in the keystone logs.

Marking as verified due to the lack of evidence that the issue persists, rather than an actual validation.

[1] http://lists.openstack.org/pipermail/openstack-dev/2015-February/056007.html
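
The general shape of such a concurrent run can be sketched with the Python standard library. This is only an illustration under stated assumptions, not the Rally scenario that was actually used: `run_concurrently` and `fake_create_and_delete` are hypothetical names, and the placeholder operation simulates failures instead of calling keystone.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(operation, workers=8, iterations=100):
    """Call operation(i) from many threads and tally outcomes by type."""
    outcomes = Counter()
    lock = threading.Lock()

    def one_call(i):
        try:
            operation(i)
            key = "ok"
        except Exception as exc:  # record the failure type and keep going
            key = type(exc).__name__
        with lock:
            outcomes[key] += 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces every submitted call to finish before returning.
        list(pool.map(one_call, range(iterations)))
    return outcomes


def fake_create_and_delete(i):
    """Hypothetical placeholder for creating and deleting a role or project."""
    if i % 10 == 0:
        raise RuntimeError("simulated 500 from the API")


summary = run_concurrently(fake_create_and_delete, workers=4, iterations=50)
```

In a real run the placeholder would be replaced by calls against the keystone API, and any non-"ok" entries in the tally would be cross-checked against the keystone logs.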

Comment 8 errata-xmlrpc 2017-05-17 19:46:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245