Bug 1337790 - delete log_recipe_task_result rows for deleted jobs
Summary: delete log_recipe_task_result rows for deleted jobs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: general
Version: 22
Hardware: Unspecified
OS: Unspecified
medium
unspecified
Target Milestone: 23.0
Assignee: matt jia
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On: 1322700
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-05-20 05:30 UTC by matt jia
Modified: 2016-07-07 23:10 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 1322700
Environment:
Last Closed: 2016-07-07 23:10:28 UTC
Embargoed:


Attachments (Terms of Use)

Description matt jia 2016-05-20 05:30:53 UTC
We have some obsoleted data on log_recipe_task_result for deleted jobs.

On a production db dump of 2015-08-14,

MariaDB [beaker]> SELECT count(*) FROM log_recipe_task_result INNER JOIN recipe_task_result on log_recipe_task_result.recipe_task_result_id  = recipe_task_result.id INNER JOIN recipe_task ON recipe_task_result.recipe_task_id = recipe_task.id INNER JOIN recipe on recipe_task.recipe_id = recipe.id INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id INNER JOIN job on recipe_set.job_id = job.id WHERE job.deleted IS NOT NULL;

MariaDB [beaker]>

37688361 rows in set (23 min 6.76 sec)


We need to clean them before running db migration introduced in bug 1322700 to avoid triggering an error like this

sqlalchemy.exc.IntegrityError: (IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`beaker`.`log_recipe_task_result`, CONSTRAINT `fk_log_recipe_task_result_recipe_task_result_id` FOREIGN KEY (`recipe_task_result_id`) REFERENCES `recipe_task_result` (`id`))') '\n        DELETE recipe_task_result\n        FROM recipe_task_result\n        INNER JOIN recipe_task ON recipe_task_result.recipe_task_id = recipe_task.id\n        INNER JOIN recipe on recipe_task.recipe_id = recipe.id\n        INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id\n        INNER JOIN job on recipe_set.job_id = job.id\n        WHERE job.deleted IS NOT NULL\n        ' ()

Comment 1 matt jia 2016-05-20 06:06:45 UTC
On Gerrit:

   https://gerrit.beaker-project.org/#/c/4906/

Comment 2 Dan Callaghan 2016-05-20 13:25:32 UTC
Hmm this seems really strange to me. I don't remember hitting this FK violation when I ran the migration. I was using a slightly newer db dump (20160121).

Also we have code which clears these rows (Recipe.delete, it's the whole point of that method) and I'm sure it's always been there and should have been working -- so I would want to know exactly *where* these rows are coming from, and why they weren't deleted by Recipe.delete like they should have been.

Comment 3 Dan Callaghan 2016-05-20 13:32:51 UTC
Matt, can you re-test with a fresh recent db dump? Or better yet, check live using Teiid?

Is it possible that you are hitting this problem just because your database got into a weird state because of something you were testing (with job.deleted set to non-NULL even though the log rows weren't properly deleted)?

If the problem really does exist it would be worth checking *when* these log_recipe_task_result rows are left behind from.

Comment 4 matt jia 2016-05-23 01:06:18 UTC
(In reply to Dan Callaghan from comment #3)
> Matt, can you re-test with a fresh recent db dump? Or better yet, check live
> using Teiid?
> 

I have checked the live Teiid and I can see it still has some obsoleted log_recipe_task_result rows. Here is one example:

public=> SELECT log_recipe_task_result.id as log_recipe_task_result_id, job.deleted, job.to_delete  FROM log_recipe_task_result INNER JOIN recipe_task_result on log_recipe_task_result.recipe_task_result_id  = recipe_task_result.id INNER JOIN recipe_task ON recipe_task_result.recipe_task_id = recipe_task.id INNER JOIN recipe on recipe_task.recipe_id = recipe.id INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id INNER JOIN job on recipe_set.job_id = job.id WHERE job.id = 423903;

 log_recipe_task_result_id |        deleted        |       to_delete       
---------------------------+-----------------------+-----------------------
                 181029403 | 2013-05-29 01:35:04.0 | 2013-05-28 09:03:18.0
(1 row)


> Is it possible that you are hitting this problem just because your database
> got into a weird state because of something you were testing (with
> job.deleted set to non-NULL even though the log rows weren't properly
> deleted)?
> 
> If the problem really does exist it would be worth checking *when* these
> log_recipe_task_result rows are left behind from.

It seems like some rows are left behind in 2013. I cannot find what causes this problem as those rows should have been deleted by log-delete.py. I suspect it may be caused by someone marked those jobs as deleted during that time for some reasons.

Comment 5 matt jia 2016-05-24 00:48:49 UTC
(In reply to matt jia from comment #4)
> (In reply to Dan Callaghan from comment #3)
> > Matt, can you re-test with a fresh recent db dump? Or better yet, check live
> > using Teiid?
> > 
> 
> I have checked the live Teiid and I can see it still has some obsoleted
> log_recipe_task_result rows. Here is one example:
> 
> public=> SELECT log_recipe_task_result.id as log_recipe_task_result_id,
> job.deleted, job.to_delete  FROM log_recipe_task_result INNER JOIN
> recipe_task_result on log_recipe_task_result.recipe_task_result_id  =
> recipe_task_result.id INNER JOIN recipe_task ON
> recipe_task_result.recipe_task_id = recipe_task.id INNER JOIN recipe on
> recipe_task.recipe_id = recipe.id INNER JOIN recipe_set on
> recipe.recipe_set_id = recipe_set.id INNER JOIN job on recipe_set.job_id =
> job.id WHERE job.id = 423903;
> 
>  log_recipe_task_result_id |        deleted        |       to_delete       
> ---------------------------+-----------------------+-----------------------
>                  181029403 | 2013-05-29 01:35:04.0 | 2013-05-28 09:03:18.0
> (1 row)
> 
MariaDB [beaker]> SELECT id, recipe_task_result_id  from log_recipe_task_result where id = 181029403;
+-----------+-----------------------+
| id        | recipe_task_result_id |
+-----------+-----------------------+
| 181029403 |              65567939 |
+-----------+-----------------------+
1 row in set (0.00 sec)


MariaDB [beaker]> select id from recipe_task_result where id = 65567939;
+----------+
| id       |
+----------+
| 65567939 |
+----------+]

Also checked this job on the latest production db dump and I can confirm that the problem does exist.

Comment 6 matt jia 2016-05-24 01:53:56 UTC
Moreover, re-test this on a latest production db dump and the problem does exist.

IntegrityError: (IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`beaker`.`log_recipe_task_result`, CONSTRAINT `fk_log_recipe_task_result_recipe_task_result_id` FOREIGN KEY (`recipe_task_result_id`) REFERENCES `recipe_task_result` (`id`))') '\n        DELETE recipe_task_result\n        FROM recipe_task_result\n        INNER JOIN recipe_task ON recipe_task_result.recipe_task_id = recipe_task.id\n        INNER JOIN recipe on recipe_task.recipe_id = recipe.id\n        INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id\n        INNER JOIN job on recipe_set.job_id = job.id\n        WHERE job.deleted IS NOT NULL\n        ' ()
Traceback (most recent call last):
  File "/bin/beaker-init", line 9, in <module>
    load_entry_point('bkr.server===23.0.git.215.490e9da', 'console_scripts', 'beaker-init')()
  File "/usr/lib/python2.7/site-packages/bkr/server/tools/init.py", line 378, in main
    return doit(opts)
  File "/usr/lib/python2.7/site-packages/bkr/server/util.py", line 178, in decorated
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bkr/server/tools/init.py", line 413, in doit
    upgrade_db(metadata)
  File "/usr/lib/python2.7/site-packages/bkr/server/tools/init.py", line 265, in upgrade_db
    run_alembic_operation(metadata, upgrade)
  File "/usr/lib/python2.7/site-packages/bkr/server/tools/init.py", line 317, in run_alembic_operation
    env_context.run_migrations()
  File "/usr/lib/python2.7/site-packages/alembic/runtime/environment.py", line 797, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/usr/lib/python2.7/site-packages/alembic/runtime/migration.py", line 312, in run_migrations
    step.migration_fn(**kw)
  File "/usr/lib/python2.7/site-packages/bkr/server/alembic/versions/15d3fad78656_delete_recipe_task_results_for_deleted_jobs.py", line 29, in upgrade
    """)
  File "<string>", line 8, in execute
  File "<string>", line 3, in execute
  File "/usr/lib/python2.7/site-packages/alembic/operations/ops.py", line 1826, in execute
    return operations.invoke(op)
  File "/usr/lib/python2.7/site-packages/alembic/operations/base.py", line 318, in invoke
    return fn(self, operation)
  File "/usr/lib/python2.7/site-packages/alembic/operations/toimpl.py", line 161, in execute_sql
    execution_options=operation.execution_options
  File "/usr/lib/python2.7/site-packages/alembic/ddl/impl.py", line 121, in execute
    self._exec(sql, execution_options)
  File "/usr/lib/python2.7/site-packages/alembic/ddl/impl.py", line 118, in _exec
    return conn.execute(construct, *multiparams, **params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 841, in execute
    return meth(self, multiparams, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 938, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
    context)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1271, in _handle_dbapi_exception
    exc_info
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
    context)
  File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 442, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 220, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorvalue
sqlalchemy.exc.IntegrityError: (IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`beaker`.`log_recipe_task_result`, CONSTRAINT `fk_log_recipe_task_result_recipe_task_result_id` FOREIGN KEY (`recipe_task_result_id`) REFERENCES `recipe_task_result` (`id`))') '\n        DELETE recipe_task_result\n        FROM recipe_task_result\n        INNER JOIN recipe_task ON recipe_task_result.recipe_task_id = recipe_task.id\n        INNER JOIN recipe on recipe_task.recipe_id = recipe.id\n        INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id\n        INNER JOIN job on recipe_set.job_id = job.id\n        WHERE job.deleted IS NOT NULL\n        ' ()

Comment 7 Dan Callaghan 2016-05-24 15:44:25 UTC
Okay, not sure how I never hit this when I was testing... maybe I *did* hit it and I just fixed it somehow and forgot to file a bug.

Do we have ON DELETE CASCADE rules for this FK, either in the production db or the Python model definition?

If the rows left behind are *only* old ones from 2013, then I guess we just go ahead delete them in a migration. As long as there are no more recent ones -- because that would indicate that Recipe.delete() is not actually working properly.

Comment 8 matt jia 2016-05-25 03:14:58 UTC
(In reply to Dan Callaghan from comment #7)
> Okay, not sure how I never hit this when I was testing... maybe I *did* hit
> it and I just fixed it somehow and forgot to file a bug.
> 
> Do we have ON DELETE CASCADE rules for this FK, either in the production db
> or the Python model definition?
>

Yes, we have the cascade='all, delete-orphan' rule set in the Python model definition for logs.

Comment 9 matt jia 2016-05-25 04:11:36 UTC
(In reply to Dan Callaghan from comment #7)

> If the rows left behind are *only* old ones from 2013, then I guess we just
> go ahead delete them in a migration. As long as there are no more recent
> ones -- because that would indicate that Recipe.delete() is not actually
> working properly.

I did not see any recent ones in the production db as all the rows left behind are before 2013-05-29.

Comment 10 Dan Callaghan 2016-05-25 20:40:28 UTC
(In reply to matt jia from comment #8)
> Yes, we have the cascade='all, delete-orphan' rule set in the Python model
> definition for logs.

That's Python-level ORM cascade. I was talking about foreign key cascade rules. Anyway it looks we don't have those (either in the model definition or in the production db). Although we possibly should... not sure.

Comment 11 matt jia 2016-05-26 00:54:16 UTC
(In reply to Dan Callaghan from comment #10)
> (In reply to matt jia from comment #8)
> > Yes, we have the cascade='all, delete-orphan' rule set in the Python model
> > definition for logs.
> 
> That's Python-level ORM cascade. I was talking about foreign key cascade
> rules. Anyway it looks we don't have those (either in the model definition
> or in the production db). Although we possibly should... not sure.

Oh, I see. After reading the docs of SqlAchemy, if I understand it correctly, the database level ON DELETE cascade should be set if we want to delete the parent rows. 

See "ORM-level “delete” cascade vs. FOREIGN KEY level “ON DELETE” cascade" in

http://docs.sqlalchemy.org/en/latest/orm/cascades.html#delete

Comment 14 Dan Callaghan 2016-07-07 23:10:28 UTC
Beaker 23.0 has been released.


Note You need to log in before you can comment on or make changes to this bug.