Bug 730993

Summary: only purge unused drift files that are older than a certain time
Product: [Other] RHQ Project
Reporter: John Mazzitelli <mazz>
Component: drift
Assignee: John Mazzitelli <mazz>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: medium
Docs Contact:
Priority: medium    
Version: 4.0.1   
Target Milestone: ---   
Target Release: JON 3.0.0, RHQ 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-02-07 19:21:58 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 707225    
Attachments:
Description Flags
730993.diff none

Description John Mazzitelli 2011-08-16 13:49:57 UTC
I checked in code that extends the data purge job that runs hourly. We now purge drift files if no drifts reference them. Every hour, when the data purge job emits its log messages, you will see messages about purging drift files.

This means we can clean up and reclaim space for drift files that are no longer used (that is, referenced as either an old or new file from any drift entry).

This goes through the drift server plugin - if we are using the RHQ DB backend, we'll purge unused rows in RHQ_DRIFT_FILE. I left a TODO in the MongoDB plugin to do the purging when using MongoDB as the drift backend.

Jay then suggested the following:

"one caveat here that we may be able to take care of with a simple flag or expiration date or something. We actually may want drift files that are not (yet) associated with drifts. This is the whole idea behind seeding the db with files we expect to be reported from agents. For example, we know we're going to deploy bundle Foo to 100 machines. We may very well want to slurp that bundle into the drift backend and create drift files in advance, so that we never actually need to download them from an agent. They'll already be there."

To support that, we should add a system setting like the other purge ones - the ones like "purge alerts older than X days" or "purge events older than X days". For example, "purge orphaned (or unused) drift files older than X days".

We can add an AND clause to the DELETE SQL (see JPADriftFile):

DELETE FROM RHQ_DRIFT_FILE
    WHERE (HASH_ID NOT IN (SELECT OLD_DRIFT_FILE FROM RHQ_DRIFT))
      AND (HASH_ID NOT IN (SELECT NEW_DRIFT_FILE FROM RHQ_DRIFT))
      AND CTIME < ?

where ? is bound to some value in the past (epoch millis) that 
corresponds to how old an unused drift file is allowed to be without 
getting purged.
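
As a rough illustration of how the bound value could be derived from an "older than X days" setting, here is a minimal sketch in Python against an in-memory SQLite database. The table layout is a simplified assumption, not the real RHQ schema, and the function name purge_unused_drift_files and the max_age_days parameter are hypothetical. The subqueries also filter out NULLs, which is an addition to the statement above: in standard SQL, NOT IN against a subquery that returns a NULL never evaluates to true, so a drift with no old file would otherwise block every delete.

```python
import sqlite3
import time

DAY_MS = 24 * 60 * 60 * 1000

# Same shape as the DELETE in the report; the IS NOT NULL filters are an
# addition in this sketch so a NULL reference cannot defeat NOT IN.
PURGE_SQL = """
DELETE FROM RHQ_DRIFT_FILE
    WHERE (HASH_ID NOT IN (SELECT OLD_DRIFT_FILE FROM RHQ_DRIFT
                           WHERE OLD_DRIFT_FILE IS NOT NULL))
      AND (HASH_ID NOT IN (SELECT NEW_DRIFT_FILE FROM RHQ_DRIFT
                           WHERE NEW_DRIFT_FILE IS NOT NULL))
      AND CTIME < ?
"""


def purge_unused_drift_files(conn, max_age_days, now_ms=None):
    """Delete unreferenced drift files older than max_age_days (hypothetical helper)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    cutoff_ms = now_ms - max_age_days * DAY_MS  # the value bound to "?"
    cur = conn.execute(PURGE_SQL, (cutoff_ms,))
    conn.commit()
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in tables (assumption, not the real RHQ schema).
    conn.executescript("""
        CREATE TABLE RHQ_DRIFT_FILE (HASH_ID TEXT PRIMARY KEY, CTIME INTEGER NOT NULL);
        CREATE TABLE RHQ_DRIFT (ID INTEGER PRIMARY KEY,
                                OLD_DRIFT_FILE TEXT, NEW_DRIFT_FILE TEXT);
    """)
    now_ms = 100 * DAY_MS
    conn.executemany("INSERT INTO RHQ_DRIFT_FILE VALUES (?, ?)", [
        ("old-unused", now_ms - 40 * DAY_MS),      # orphaned and past the cutoff
        ("old-referenced", now_ms - 40 * DAY_MS),  # old, but a drift still uses it
        ("new-unused", now_ms - 1 * DAY_MS),       # orphaned, but too young to purge
    ])
    # A drift for a newly added file: no old file, so OLD_DRIFT_FILE is NULL.
    conn.execute(
        "INSERT INTO RHQ_DRIFT (OLD_DRIFT_FILE, NEW_DRIFT_FILE) "
        "VALUES (NULL, 'old-referenced')"
    )
    purged = purge_unused_drift_files(conn, 30, now_ms)
    remaining = sorted(r[0] for r in conn.execute("SELECT HASH_ID FROM RHQ_DRIFT_FILE"))
    return purged, remaining
```

In this sketch only the old, unreferenced row is removed. The young orphan survives, which is the behavior Jay's seeding scenario needs.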

We'd need something similar in any drift server plugin impl (like the mongo plugin).
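
For a non-SQL backend, the same rule (unreferenced AND older than the cutoff) could look roughly like the sketch below. It uses plain Python structures in place of a document store. DriftBackend, InMemoryDocumentBackend, and purge_orphaned_drift_files are hypothetical names for illustration, not the RHQ drift server plugin API or the MongoDB plugin's code.

```python
from abc import ABC, abstractmethod


class DriftBackend(ABC):
    """Hypothetical stand-in for a drift server plugin (not the RHQ API)."""

    @abstractmethod
    def purge_orphaned_drift_files(self, cutoff_ms):
        """Remove unreferenced drift files created before cutoff_ms; return the count."""


class InMemoryDocumentBackend(DriftBackend):
    """Toy document-style backend: files and drifts are kept in Python containers."""

    def __init__(self):
        self.files = {}   # hash_id -> ctime in epoch millis
        self.drifts = []  # dicts with "old_file" / "new_file" hash ids (or None)

    def purge_orphaned_drift_files(self, cutoff_ms):
        # Collect every hash id that some drift still points at.
        referenced = set()
        for drift in self.drifts:
            for key in ("old_file", "new_file"):
                hash_id = drift.get(key)
                if hash_id is not None:
                    referenced.add(hash_id)
        # Purge only files that are both unreferenced and older than the cutoff.
        doomed = [h for h, ctime in self.files.items()
                  if h not in referenced and ctime < cutoff_ms]
        for hash_id in doomed:
            del self.files[hash_id]
        return len(doomed)
```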

Comment 1 John Mazzitelli 2011-08-16 17:26:18 UTC
Created attachment 518539 [details]
730993.diff

I implemented this locally. In case we want a global setting that says how old drift files must be before they can be purged, see the attached patch.

Comment 2 John Mazzitelli 2011-11-01 19:30:20 UTC
this was previously committed on August 16, 2011
 - commit 5bdcd83af54cd10ff4b78bfba66db7330a11abd5

Comment 3 John Mazzitelli 2011-11-04 20:59:28 UTC
to test

1. first, get some drift 
2. confirm "select id from rhq_drift_file" - make sure you see the rows in there
   (you'll use this later)
3. I think you will have to then uninventory the resource that the drift was on
4. select id from rhq_drift_file - make sure the drifts are still there

then wait for the time to expire (the time defined in the system settings).
after that time, "select id from rhq_drift_file" should show you the rows are gone.

Comment 4 Mike Foley 2011-11-04 21:22:58 UTC
rows are gone.  verified.

Comment 5 John Mazzitelli 2011-11-04 22:44:57 UTC
> 2. confirm "select id from rhq_drift_file" - make sure you see the rows in
> there (you'll use this later)

just to correct this - "id" isn't a valid column - "hash_id" would work
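
if it helps, the before/after comparison from the test steps could be scripted roughly like this. it is only a sketch: it uses Python's sqlite3 with a made-up two-column table as a stand-in for the real database, and remaining_drift_files is a hypothetical helper name. it selects hash_id, per the correction above.

```python
import sqlite3


def remaining_drift_files(conn, recorded_hash_ids):
    """Return the previously recorded hash ids that are still in rhq_drift_file."""
    current = {row[0] for row in conn.execute("select hash_id from rhq_drift_file")}
    return sorted(set(recorded_hash_ids) & current)


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in table (assumption, not the real RHQ schema).
    conn.execute("create table rhq_drift_file (hash_id text primary key, ctime integer)")
    conn.executemany("insert into rhq_drift_file values (?, ?)", [("aaa", 1), ("bbb", 2)])
    # Step 2 of the test: record the rows you see.
    recorded = [row[0] for row in conn.execute("select hash_id from rhq_drift_file")]
    # Simulate the purge job removing one of them.
    conn.execute("delete from rhq_drift_file where hash_id = 'aaa'")
    return remaining_drift_files(conn, recorded)
```

an empty result after the purge age has passed would mean all the recorded rows are gone.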

Comment 6 Mike Foley 2012-02-07 19:21:58 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE