Bug 595801

Summary: [RFE] Allow tagging of RecipeSets for determining data retention
Product: [Retired] Beaker
Reporter: Bill Peck <bpeck>
Component: web UI
Assignee: Raymond Mancy <rmancy>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: medium
Docs Contact:
Priority: high
Version: 0.5
CC: bpeck, ebaak, kbaker, mcsontos, rmancy, rousseau
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-09-30 04:57:00 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 632609    

Description Bill Peck 2010-05-25 16:12:00 UTC
[RFE]

Extend DB Schema to support tagging of recipeSets with a data retention tag.

Only one tag per recipeSet will be allowed.

Example Tags

Name     | TimeFrame           | Default
------------------------------------------
45Days   | DateTime(45 Days)   | True
90Days   | DateTime(90 Days)   | False
120Days  | DateTime(120 Days)  | False
Forever  | DateTime()          | False

[Setting the tag]

XML schema will be extended to support tagging at job submission time.

If no tag is present at submission time then the default tag will be used.
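
A minimal sketch of the default-tag rule at submission time. The attribute name "retention_tag" and its placement on <recipeSet> are assumptions for illustration, not the confirmed schema; the tag names come from the example table above.

```python
import xml.etree.ElementTree as ET

# Tag names and the default are taken from the example table (assumed values).
KNOWN_TAGS = {"45Days", "90Days", "120Days", "Forever"}
DEFAULT_TAG = "45Days"


def retention_tags(job_xml):
    """Return one retention tag per <recipeSet>, falling back to the default."""
    tags = []
    for recipeset in ET.fromstring(job_xml).iter("recipeSet"):
        # "retention_tag" is a hypothetical attribute name.
        tag = recipeset.get("retention_tag", DEFAULT_TAG)
        if tag not in KNOWN_TAGS:
            raise ValueError("unknown retention tag: %s" % tag)
        tags.append(tag)
    return tags


job = """
<job>
  <recipeSet retention_tag="Forever"><recipe/></recipeSet>
  <recipeSet><recipe/></recipeSet>
</job>
"""
print(retention_tags(job))
```

The second recipeSet carries no tag, so it receives the default.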

The UI will be extended to support changing the tag after submission.  Only the owner of the job and admins will be allowed to change a tag.

The UI will work similarly to how priorities work: you can change the tags of all the recipeSets in a job or one recipeSet at a time.

Extend the command line tool to allow setting the tag as well.

[Notification of Expire]

Early notification of expunging will run daily (all notifications for a user will be combined into one email). Only send email if the user's prefs say to; the default is True.
(RecipeSet.job.finish_time + Tag.TimeFrame - 30 Days).date() == datetime.now().date()
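
The notification condition can be sketched in Python as follows. The function name is hypothetical, and using None for the "Forever" timeframe is an assumption; the comparison is done on calendar dates so that a daily run matches exactly once.

```python
from datetime import date, datetime, timedelta

NOTICE_PERIOD = timedelta(days=30)


def notification_due(finish_time, timeframe, today):
    """True on the single day that falls 30 days before the expiry date."""
    if timeframe is None:  # assumed encoding of the "Forever" tag
        return False
    return (finish_time + timeframe - NOTICE_PERIOD).date() == today


finished = datetime(2010, 5, 25, 16, 12)
print(notification_due(finished, timedelta(days=45), date(2010, 6, 9)))
```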


[Report on Expire]

Extend the web UI to allow a query to show what results will expire in 30, 60, 120, etc. days.
Extend the command line to show what results will expire in 30, 60, 120, etc. days.

[Actual Expire]

The expiring will be based on the following:
(RecipeSet.Job.finish_time + Tag.TimeFrame) < datetime.now()
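
A rough sketch of the expiry rule and the "what expires in N days" report, assuming the timeframes from the example tag table. The function names are hypothetical, and None is an assumed stand-in for the "Forever" timeframe.

```python
from datetime import datetime, timedelta

# Timeframes from the example tag table; None means "never expires" (assumption).
TIMEFRAMES = {
    "45Days": timedelta(days=45),
    "90Days": timedelta(days=90),
    "120Days": timedelta(days=120),
    "Forever": None,
}


def expiry_time(finish_time, tag):
    """Return the moment the results expire, or None if they never do."""
    timeframe = TIMEFRAMES[tag]
    return None if timeframe is None else finish_time + timeframe


def is_expired(finish_time, tag, now):
    """(finish_time + TimeFrame) < now, as stated in the rule above."""
    expires = expiry_time(finish_time, tag)
    return expires is not None and expires < now


def expires_within(finish_time, tag, now, days):
    """True if the results are still live but expire within the next `days` days."""
    expires = expiry_time(finish_time, tag)
    return expires is not None and now <= expires < now + timedelta(days=days)


finished = datetime(2010, 5, 25, 16, 12)
now = datetime(2010, 7, 1)
print(is_expired(finished, "45Days", now), expires_within(finished, "45Days", now, 30))
```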

Comment 1 Edward Rousseau 2010-05-26 18:53:01 UTC
Looks good in general. Do we want the tag names to be purely time based, or to have descriptive names like "audit" (the forever tag), "Maint" (the GA - 1 release tag that kernel dev is asking for), or something like "Active" to designate that the data is from an actively developed codebase? The problem with hard dates is that the data retention requirement for anything beyond default / forever is product-release driven rather than calendar driven. Having descriptive names could allow us to have adaptive business logic for tag expiration (wired to milestones from the product pages, for example).

While there is only 1 tag per set, the tag needs to be changeable post-run. For example, QE doesn't know which run is audit (forever savable) until the release ships, which would mean having many potential "audit" tagged sets. I think it would be easier (and more automatable) to move those from "Active" to "Audit" rather than detagging "Audit" to "Default" at the end.

Comment 2 Bill Peck 2010-08-31 19:40:32 UTC
Hi Ray,

This is higher priority now.  I've updated the target milestone.  If you have any questions be sure to ask.

Thanks

Comment 3 Raymond Mancy 2010-09-06 07:15:02 UTC
I'm not quite sure I understand what these tags are for.

Apparently they indicate when the results of the RecipeSet will expire.
What impact does this expiration have? Are we removing all trace of them, leaving a stub or something? Just removing the logs?
After what point can you _not_ change the tag for the recipeset?

Comment 4 Bill Peck 2010-09-08 17:01:09 UTC
OK, here is what I understand after talking with Ed.

We don't want the expire logic to live in beaker; we want the tags there so that we can query the recipesets that we want to delete.

From a basic standpoint we would have the following tags:

Scratch
60Days
120Days
Active
Audit

(They would live in a table with an admin interface to be able to create more)

We will then have to implement a way to query on these tags so that we can do the following:

#List all RecipeSets that are tagged Scratch and have a completion date older than 30 days
bkr job-list --tag Scratch --complete 30Days

#List all RecipeSets that are from RedHatEnterpriseLinux2.1
bkr job-list --family RedHatEnterpriseLinux2.1
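
As an illustration only, the first query above could be expressed as a filter like the one below. The RecipeSet tuple and list_recipesets function are hypothetical stand-ins, not Beaker's model or the eventual implementation of bkr job-list.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical record: finish_time is None while the recipeSet is still running.
RecipeSet = namedtuple("RecipeSet", "id tag finish_time")


def list_recipesets(recipesets, tag, complete_days, now):
    """Return ids of recipeSets with `tag` that completed more than `complete_days` ago."""
    cutoff = now - timedelta(days=complete_days)
    return [
        "RS:%d" % rs.id
        for rs in recipesets
        if rs.tag == tag and rs.finish_time is not None and rs.finish_time < cutoff
    ]


now = datetime(2010, 9, 8)
sample = [
    RecipeSet(232, "Scratch", datetime(2010, 7, 1)),
    RecipeSet(233, "Audit", datetime(2010, 7, 1)),
    RecipeSet(234, "Scratch", datetime(2010, 9, 1)),
    RecipeSet(235, "Scratch", None),
]
print(list_recipesets(sample, "Scratch", 30, now))
```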

Then this data can be used to delete RecipeSets. That deletion should be recorded in History, and deleting the last RecipeSet in a job should also delete the job.


We can decide later if we just want to delete the data files and leave the DB records.  This is what Brew does.  I would like to do it this way if possible.  If we do keep the DB records we might want to add a hidden field which we flip when deleting so that we don't see the records by default.
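
A small in-memory sketch of the deletion behaviour described here, assuming the "hidden flag" variant: deleting a recipeSet flips a flag rather than removing the record, each deletion is appended to a history list, and the job is flagged once its last recipeSet is gone. The class and function names are hypothetical.

```python
class Job(object):
    """Minimal stand-in for a job and its recipeSets (not Beaker's model)."""

    def __init__(self, job_id, recipeset_ids):
        self.id = job_id
        self.deleted = False
        # Maps recipeSet id -> whether it has been flagged as deleted.
        self.recipesets = dict((rs_id, False) for rs_id in recipeset_ids)


def delete_recipeset(job, rs_id, history):
    """Flag one recipeSet as deleted, log it, and flag the job if it was the last."""
    if job.recipesets[rs_id]:
        return  # already deleted, nothing to record
    job.recipesets[rs_id] = True
    history.append("Deleted RS:%d" % rs_id)
    if all(job.recipesets.values()):
        job.deleted = True
        history.append("Deleted J:%d" % job.id)


history = []
job = Job(231, [232, 233])
delete_recipeset(job, 232, history)
delete_recipeset(job, 233, history)
print(job.deleted, history)
```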


The point is that, for now, we only need a way to store the tags so we can start laying the metadata groundwork.


Ed or Kevin, speak up if I have misunderstood anything here. :-)

Comment 5 Raymond Mancy 2010-09-09 01:11:38 UTC
(In reply to comment #4)
> Ok, Here is what I understand after talking with Ed.
> 
> We don't want the expire logic to live in beaker, we want the tags there so
> that we can query the recipesets that we want to delete.

Are we at least going to have a script in beaker which will handle the deletion process once given a list of expired jobs?

> 
> From a basic standpoint we would have the following tags:
> 
> Scratch
> 60Days
> 120Days
> Active
> Audit
> 

Right.
 

> (They would live in a table with an admin interface to be able to create more)
> 
> We will then have to implement a way to query on these tags so that we can do
> the following:
> 
> #List all RecipeSets that are tagged Scratch and have a completion date older
> than 30 days
> bkr job-list --tag Scratch --complete 30Days
> 
> #List all RecipeSets that are from RedHatEnterpriseLinux2.1
> bkr job-list --family RedHatEnterpriseLinux2.1
> 

Sure. What are you actually envisioning returning here though?
If you've got two recipesets in a job, one with Scratch and one with Audit, how does a command like this handle it? Will it be returning recipeSet ids? (despite the command being 'job-list') 

> Then this data can be used to delete RecipeSets, that deletion should be
> recorded in History and the last RecipeSet to be deleted from a job should
> delete the job.
> 
> 
> We can decide later if we just want to delete the data files and leave the DB
> records.  This is what Brew does.  I would like to do it this way if possible. 
> If we do keep the DB records we might want to put a hidden field in which we
> flip when deleting so that we don't see the records by default.
> 
> 
> The point is at this point we only need a way to store the tags so we can start
> laying the metadata groundwork.
> 
> 
> Ed or Kevin speak up if I have mis-understood anything here. :-)

Comment 6 Bill Peck 2010-09-09 02:37:15 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Ok, Here is what I understand after talking with Ed.
> > 
> > We don't want the expire logic to live in beaker, we want the tags there so
> > that we can query the recipesets that we want to delete.
> 
> Are we at least going to have a script in beaker which will handle the deletion
> process once given a list of expired jobs?

beaker will need to provide an interface for deleting the job data files, and possibly deleting the job metadata.  If it doesn't actually delete the DB records, it should set a hidden flag to keep them from showing by default.

bkr job-delete --logs-only RS:232

I would support deleting a whole job as well if asked.
bkr job-delete J:231
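
For illustration, identifiers such as RS:232 and J:231 could be split into a kind and a numeric id roughly like this. The parse_taskspec name and the prefix table are assumptions, not the bkr client's actual code.

```python
# Prefixes seen in the examples above; other prefixes are out of scope here.
PREFIXES = {"J": "job", "RS": "recipeset"}


def parse_taskspec(spec):
    """Split an identifier like 'RS:232' into ('recipeset', 232)."""
    kind, sep, number = spec.partition(":")
    if not sep or kind not in PREFIXES or not number.isdigit():
        raise ValueError("unrecognised identifier: %r" % spec)
    return PREFIXES[kind], int(number)


print(parse_taskspec("RS:232"))
print(parse_taskspec("J:231"))
```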

> 
> > 
> > From a basic standpoint we would have the following tags:
> > 
> > Scratch
> > 60Days
> > 120Days
> > Active
> > Audit
> > 
> 
> Right.
> 
> 
> > (They would live in a table with an admin interface to be able to create more)
> > 
> > We will then have to implement a way to query on these tags so that we can do
> > the following:
> > 
> > #List all RecipeSets that are tagged Scratch and have a completion date older
> > than 30 days
> > bkr job-list --tag Scratch --complete 30Days
> > 
> > #List all RecipeSets that are from RedHatEnterpriseLinux2.1
> > bkr job-list --family RedHatEnterpriseLinux2.1
> > 
> 
> Sure. What are you actually envisioning returning here though?
> If you've got two recipesets in a job, one with Scratch and one with Audit, how
> does a command like this handle it? Will it be returning recipeSet ids?

RS:232

> (despite the command being 'job-list') 

The command could be called recipeset I suppose.  It's really just a distinction between jobs and other things you list like distros, tasks, etc.


> 
> > Then this data can be used to delete RecipeSets, that deletion should be
> > recorded in History and the last RecipeSet to be deleted from a job should
> > delete the job.
> > 
> > 
> > We can decide later if we just want to delete the data files and leave the DB
> > records.  This is what Brew does.  I would like to do it this way if possible. 
> > If we do keep the DB records we might want to put a hidden field in which we
> > flip when deleting so that we don't see the records by default.
> > 
> > 
> > The point is at this point we only need a way to store the tags so we can start
> > laying the metadata groundwork.
> > 
> > 
> > Ed or Kevin speak up if I have mis-understood anything here. :-)

Comment 7 Raymond Mancy 2010-09-09 03:04:44 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > Ok, Here is what I understand after talking with Ed.
> > > 
> > > We don't want the expire logic to live in beaker, we want the tags there so
> > > that we can query the recipesets that we want to delete.
> > 
> > Are we at least going to have a script in beaker which will handle the deletion
> > process once given a list of expired jobs?
> 
> beaker will need to provide an interface for deleting the job data files, and
> possibly deleting the job metadata.  If not actually deleting the DB records
> then setting a hidden flag to keep them from showing by default.
> 
> bkr job-delete --logs-only RS:232
> 
> I would support deleting a whole job as well if asked.
> bkr job-delete J:231
> 
> > 
> > > 
> > > From a basic standpoint we would have the following tags:
> > > 
> > > Scratch
> > > 60Days
> > > 120Days
> > > Active
> > > Audit
> > > 
> > 
> > Right.
> > 
> > 
> > > (They would live in a table with an admin interface to be able to create more)
> > > 
> > > We will then have to implement a way to query on these tags so that we can do
> > > the following:
> > > 
> > > #List all RecipeSets that are tagged Scratch and have a completion date older
> > > than 30 days
> > > bkr job-list --tag Scratch --complete 30Days
> > > 
> > > #List all RecipeSets that are from RedHatEnterpriseLinux2.1
> > > bkr job-list --family RedHatEnterpriseLinux2.1
> > > 
> > 
> > Sure. What are you actually envisioning returning here though?
> > If you've got two recipesets in a job, one with Scratch and one with Audit, how
> > does a command like this handle it? Will it be returning recipeSet ids?
> 
> RS:232
> 
> > (despite the command being 'job-list') 
> 
> The command could be called recipeset I suppose.  Its really just a distinction
> between jobs and other things you list like distros, tasks, etc..
> 

No, you're right; it's just a paradigm shift that needs to be made.

> 
> > 
> > > Then this data can be used to delete RecipeSets, that deletion should be
> > > recorded in History and the last RecipeSet to be deleted from a job should
> > > delete the job.
> > > 
> > > 
> > > We can decide later if we just want to delete the data files and leave the DB
> > > records.  This is what Brew does.  I would like to do it this way if possible. 
> > > If we do keep the DB records we might want to put a hidden field in which we
> > > flip when deleting so that we don't see the records by default.
> > > 
> > > 
> > > The point is at this point we only need a way to store the tags so we can start
> > > laying the metadata groundwork.
> > > 
> > > 
> > > Ed or Kevin speak up if I have mis-understood anything here. :-)

Comment 8 Raymond Mancy 2010-09-09 07:34:07 UTC
I'm trying to keep the implementation generic enough that we can use these tags throughout beaker. This would mean that we could use tags on systems, tasks, or anything else we want to.

Comment 9 Raymond Mancy 2010-09-14 00:21:37 UTC
Still need to implement the server-side process for deletion and the client-side script, as well as changing the tag after job submission.

Comment 10 Raymond Mancy 2010-09-22 02:27:44 UTC
(In reply to comment #4)

> 
> #List all RecipeSets that are from RedHatEnterpriseLinux2.1
> bkr job-list --family RedHatEnterpriseLinux2.1

Isn't the family at the Recipe level? 
recipe.distro.osversion

Comment 11 Bill Peck 2010-09-22 13:22:16 UTC
You're right.

I don't think it makes much sense to keep part of a recipeset, which is why I was thinking of deleting the recipeSets. But as you point out, there isn't much data to query on from the recipeset.

Should we allow deleting of the recipes without deleting the recipeset?

Comment 12 Raymond Mancy 2010-09-26 01:11:00 UTC
Seeing as there isn't much point in keeping half a recipeset, and there isn't much data in the recipeset, for the family we could always query on the recipe and then delete the whole containing recipeset.
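
A sketch of that idea, using a hypothetical Recipe tuple rather than Beaker's model: match on the recipe's family, then collect the containing recipeSets so each one is acted on as a whole.

```python
from collections import namedtuple

# Hypothetical record: each recipe knows its containing recipeSet and distro family.
Recipe = namedtuple("Recipe", "id recipeset_id family")


def recipesets_for_family(recipes, family):
    """Return the ids of every recipeSet containing at least one recipe of `family`."""
    return sorted(set(r.recipeset_id for r in recipes if r.family == family))


recipes = [
    Recipe(1, 232, "RedHatEnterpriseLinux2.1"),
    Recipe(2, 232, "RedHatEnterpriseLinux5"),
    Recipe(3, 233, "RedHatEnterpriseLinux5"),
    Recipe(4, 234, "RedHatEnterpriseLinux2.1"),
]
print(recipesets_for_family(recipes, "RedHatEnterpriseLinux2.1"))
```

Note that recipeSet 232 is selected even though only one of its two recipes matches, which follows from deleting the whole containing recipeset.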

Comment 13 Raymond Mancy 2010-09-27 01:04:38 UTC
http://git.fedorahosted.org/git/?p=beaker.git;a=commit;h=e591d682ccd20aeae8ab2e06e90b8e317e252390

Adding new Jobs will automatically assign the default tag if none is specified in the RecipeSet.

Via the Job page, a RecipeSet's tag can be changed in the same way as priorities. I've made a generic master-slave JS which the tags use, and I'll move the priorities over to it as well at a later date. I've also tweaked some of the CSS around the priorities.

Also added a 'job-list' command to beaker; it can use --family, --tag, and --completeDays to return a list of RecipeSets that match.