Bug 1014875

Summary: database deadlock in XML-RPC method recipes.tasks.register_result_file
Product: [Retired] Beaker Reporter: Dan Callaghan <dcallagh>
Component: schedulerAssignee: Raymond Mancy <rmancy>
Status: CLOSED CURRENTRELEASE QA Contact: Nick Coghlan <ncoghlan>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 0.15CC: aigao, asaha, dcallagh, ebaak, jingwang, llim, qwan, rmancy
Target Milestone: 0.15.1   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-10-23 06:57:05 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Dan Callaghan 2013-10-03 01:19:02 UTC
When handling recipes.tasks.register_result_file:

Oct  3 00:54:57 beaker-02 beaker-server[21247]: bkr.server.xmlrpccontroller ERROR Error handling XML-RPC method
Oct  3 00:54:57 beaker-02 beaker-server[21247]:  Traceback (most recent call last):
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib/python2.6/site-packages/bkr/server/xmlrpccontroller.py", line 47, in RPC2
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      response = self.process_rpc(method,params)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib/python2.6/site-packages/bkr/server/xmlrpccontroller.py", line 36, in process_rpc
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      response = obj(*params)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "<string>", line 2, in register_result_file
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib/python2.6/site-packages/bkr/server/identity.py", line 242, in require
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      return func(*args, **kwargs)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib/python2.6/site-packages/bkr/server/recipetasks.py", line 78, in register_result_file
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      filename=filename,
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 4144, in lazy_create
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      return super(Log, cls).lazy_create(path=cls._normalized_path(path), **kwargs)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 1489, in lazy_create
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      unique_params, extra_params))
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1191, in execute
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      params)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1271, in _execute_clauseelement
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      return self.__execute_context(context)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1302, in __execute_context
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      context.parameters[0], context=context)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1401, in _cursor_execute
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      context)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1394, in _cursor_execute
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      context)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/default.py", line 299, in do_execute
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      cursor.execute(statement, parameters)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/MySQLdb/cursors.py", line 173, in execute
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      self.errorhandler(self, exc, value)
Oct  3 00:54:57 beaker-02 beaker-server[21247]:    File "/usr/lib64/python2.6/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
Oct  3 00:54:57 beaker-02 beaker-server[21247]:      raise errorclass, errorvalue
Oct  3 00:54:57 beaker-02 beaker-server[21247]:  OperationalError: (OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'INSERT INTO log_recipe_task_result (log_recipe_task_result.recipe_task_result_id, log_recipe_task_result.path, log_recipe_task_result.filename, log_recipe_task_result.start_time)\nSELECT %s, %s, %s, %s\nFROM DUAL\nWHERE NOT EXISTS (SELECT 1 FROM log_recipe_task_result\nWHERE log_recipe_task_result.path = %s AND log_recipe_task_result.filename = %s AND log_recipe_task_result.recipe_task_result_id = %s FOR UPDATE)'

Comment 6 Nick Coghlan 2013-10-03 05:24:06 UTC
On Gerrit: http://gerrit.beaker-project.org/#/c/2325/

Comment 7 Dan Callaghan 2013-10-03 06:13:16 UTC
For future reference, a summary of what Ray and I found while looking at this deadlock today:

The deadlock is caused by the "conditional insert" statement from two different LogRecipeTaskResult.lazy_create calls running concurrently. Note that the statements themselves must be executed concurrently by MySQL, not just in concurrent transactions (which makes this unlike all the previous race conditions we've fixed).

The problem can be simulated as follows using two mysql clients. I say "simulated" because the only way to guarantee the "bad" interleaving of the two pieces of the conditional insert is to split them up into two separate queries: first SELECT 1 ... FOR UPDATE, then INSERT.

In terminal 1:

    mysql> BEGIN;
    Query OK, 0 rows affected (0.00 sec)

    mysql> SELECT 1 FROM log_recipe_task_result WHERE log_recipe_task_result.path = '/' AND log_recipe_task_result.filename = 'test.log' AND log_recipe_task_result.recipe_task_result_id = 5046 FOR UPDATE;
    Empty set (0.00 sec)

In terminal 2:

    mysql> BEGIN;
    Query OK, 0 rows affected (0.00 sec)

    mysql> SELECT 1 FROM log_recipe_task_result WHERE log_recipe_task_result.path = '/' AND log_recipe_task_result.filename = 'test.log' AND log_recipe_task_result.recipe_task_result_id = 5047 FOR UPDATE;
    Empty set (0.00 sec)

In terminal 1:

    mysql> INSERT INTO log_recipe_task_result (recipe_task_result_id, path, filename, start_time) SELECT 5046, '/', 'test.log', '2013-10-03 01:04:00';

(Terminal 1 blocks)
In terminal 2:

    mysql> INSERT INTO log_recipe_task_result (recipe_task_result_id, path, filename, start_time) SELECT 5047, '/', 'test.log', '2013-10-03 01:04:00';
    Query OK, 1 row affected (2.69 sec)

In terminal 1:

    ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Comment 9 wangjing 2013-10-11 07:15:29 UTC
have talked with Dan,  now QE have no env to reproduce the deadlock to test the fix, so set qa_ack-.

Comment 10 HSS Product Manager 2013-10-11 07:17:34 UTC
HSS-QE has reviewed and declined this request. QE for this bug will be handled by IED.

Comment 11 Nick Coghlan 2013-10-11 07:28:58 UTC
Marking as VERIFIED based on the new automated regression test included with the patch: http://gerrit.beaker-project.org/#/c/2325/3/IntegrationTests/src/bkr/inttest/server/test_model.py

Comment 12 Raymond Mancy 2013-10-23 06:57:05 UTC
beaker 0.15.1 has been released.