1337789 – beaker-log-delete left behind some job logs from May 2013 and earlier

Bug 1337789 - beaker-log-delete left behind some job logs from May 2013 and earlier

Summary: beaker-log-delete left behind some job logs from May 2013 and earlier

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Beaker
Classification:	Retired
Component:	general
Sub Component:
Version:	22
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	25.0
Assignee:	Dan Callaghan
QA Contact:	Matt Tyson 🤬
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2016-05-20 05:29 UTC by matt jia
Modified:	2018-03-19 04:19 UTC (History)
CC List:	7 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2018-03-19 04:19:07 UTC
Embargoed:

Attachments	(Terms of Use)

Description matt jia 2016-05-20 05:29:24 UTC

Similar to bug 1322700, we should delete the log_recipe_task and log_recipe rows for the deleted jobs.

On a production db dump:

MariaDB [beaker]> SELECT count(*) FROM log_recipe_task INNER JOIN recipe_task ON log_recipe_task.recipe_task_id = recipe_task.id INNER JOIN recipe on recipe_task.recipe_id = recipe.id INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id INNER JOIN job on recipe_set.job_id = job.id WHERE job.deleted IS NOT NULL;
+----------+
| count(*) |
+----------+
| 18403070 |
+----------+
1 row in set (10 min 33.02 sec)

MariaDB [beaker]> SELECT count(*) FROM log_recipe INNER JOIN recipe on log_recipe.recipe_id = recipe.id INNER JOIN recipe_set on recipe.recipe_set_id = recipe_set.id INNER JOIN job on recipe_set.job_id = job.id WHERE job.deleted IS NOT NULL;
+----------+
| count(*) |
+----------+
|  2915042 |
+----------+
1 row in set (39.77 sec)

Comment 1 Jon Orris 2016-09-19 22:33:06 UTC

https://gerrit.beaker-project.org/#/c/5245/

Comment 2 Jon Orris 2016-09-20 18:34:49 UTC

(In reply to Jon Orris from comment #1)
> https://gerrit.beaker-project.org/#/c/5245/

Pushed to wrong branch. Correct change is:

https://gerrit.beaker-project.org/#/c/5250/

Comment 3 Roman Joost 2016-12-13 07:06:35 UTC

This bug fix is included in beaker-server-24.0.git.210.01bf4bf which is currently available for testing here:

https://beaker-devel.app.eng.bos.redhat.com/

Comment 5 Dan Callaghan 2017-01-09 00:35:18 UTC

Matt, Jon: has someone actually checked that all the log_* rows we will be deleting here have already had the log files deleted on disk as well?

Otherwise this change is actually going to make things worse, not better (since now we will have a pile of old logs left behind on disk with no reference to them in the database to locate them). In that case the proper fix would be to figure out why beaker-log-delete has not already deleted these files and their corresponding log_* rows.

Comment 7 Dan Callaghan 2017-01-09 05:32:37 UTC

But in the meantime we need to roll this patch out of 24.0, right?

Comment 8 matt jia 2017-01-09 06:29:18 UTC

(In reply to Dan Callaghan from comment #7)
> But in the meantime we need to roll this patch out of 24.0, right?

Yeah, I have a patch to revert the patch.

https://gerrit.beaker-project.org/#/c/5572/

Comment 9 Roman Joost 2017-02-20 00:37:03 UTC

I'm actually putting this one back to NEW, since with the recent team re-shuffling it highly likely that someone else needs to finish it.

Comment 10 Dan Callaghan 2017-07-26 05:16:55 UTC

beaker-log-delete and the related model code in Job.delete() has received a lot of bug fixes over the years since it was originally introduced in 2011, so it might not be worthwhile digging through that to find exactly why it ended up leaving some old jobs in this state: where the job is marked as deleted, but some logs are still left behind.

I did look out for old code which looked like it might be failing to delete the logs but then still committing the db transaction -- but actually, all the fixes I could see seemed more likely to be the other way around -- that is the log files would be deleted but then the db transaction *not* committed.

Regardless, we probably just need to add some code to beaker-log-delete to look for jobs which are marked as deleted, but still have logs recorded against them, and to delete those again "properly".

Comment 11 Dan Callaghan 2017-07-26 06:35:17 UTC

It would be good to get this done sooner rather than later, since if there really are 20 million log files which could be deleted this would potentially save quite a bit of storage.

Comment 12 Dan Callaghan 2017-07-27 07:36:44 UTC

So this mess with "deleted" vs. "soft-deleted" vs. "to_delete" has been bugging me for years... It might be finally time to clean that up:

https://gerrit.beaker-project.org/5765 remove Job "sanitise" methods
https://gerrit.beaker-project.org/5766 remove unused params on Job.find_jobs
https://gerrit.beaker-project.org/5767 use Job.is_deleted in queries
https://gerrit.beaker-project.org/5768 rationalise Job.delete_jobs, Job._delete_criteria, job.delete_jobs
https://gerrit.beaker-project.org/5769 clarify deleted vs. expired vs. purged for jobs

With that clean-up in place, the problem here is a lot easier to describe: we have some old jobs which have been *deleted* but not successfully *purged*. And therefore we can just update the db to clear the purged timestamp and let beaker-log-delete purge them again.

https://gerrit.beaker-project.org/5770 make beaker-log-delete fix up old jobs which still have logs

Comment 15 Roman Joost 2018-03-19 04:19:07 UTC

Beaker 25.0 has been released.

Release notes are available upstream: https://beaker-project.org/docs/whats-new/release-25.html

Note You need to log in before you can comment on or make changes to this bug.