Bug 747338

Summary: log-delete leaves empty directories
Product: [Retired] Beaker
Reporter: Matt Brodeur <mbrodeur>
Component: scheduler
Assignee: Raymond Mancy <rmancy>
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Version: 0.7
CC: bpeck, dcallagh, ebaak, mcsontos, mschick, rmancy, stl
Doc Type: Bug Fix
Last Closed: 2012-07-31 22:07:12 UTC

Description Matt Brodeur 2011-10-19 14:33:37 UTC
After running log-delete to remove expired test logs, we're left with empty directories.  The deletion process should detect when everything in a directory has been removed and trim the directory as well.

Comment 1 Bill Peck 2012-01-11 17:43:02 UTC
moving to 0.8.2.

rmancy,
  Looking at this bit of code:

        for job_id in job_ids:
            job = Job.by_id(job_id)
            logs = job.get_log_dirs()
            if logs:
                logs = cls._remove_descendants(logs)
            yield (job, logs)


Can we simply change _remove_descendants so that it trims each path back to the component that is the job id? That is, go from

http://FQDN/beaker-logs/2011/12/1739/173922/364531/

to 

http://FQDN/beaker-logs/2011/12/1739/173922

This would still leave the intermediate directory "1739".

To do anything more would mean making HTTP calls to see if the directory is actually empty. If it's local, we can use the rmdir command, which will only delete empty dirs.
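
For the local case, the rmdir idea could be sketched roughly as follows. This is only an illustration, not Beaker code: prune_empty_parents and its stop_at argument are hypothetical names, and it assumes the log tree is reachable as a local path. It relies on os.rmdir refusing to remove a non-empty directory, so shared intermediate directories such as "1739" are only removed once nothing else lives in them.

```python
import os


def prune_empty_parents(leaf, stop_at):
    """Remove `leaf` if empty, then each ancestor that has become empty.

    Never removes `stop_at` itself or anything outside it.
    Returns the list of directories that were removed.
    """
    removed = []
    stop_at = os.path.abspath(stop_at)
    current = os.path.abspath(leaf)
    # Only touch directories strictly below stop_at.
    while current != stop_at and current.startswith(stop_at + os.sep):
        try:
            os.rmdir(current)  # raises OSError if the directory is not empty
        except OSError:
            break
        removed.append(current)
        current = os.path.dirname(current)
    return removed
```

Called with the directory that log-delete just emptied and the top of the log tree, this walks upward and stops at the first directory that still has other content.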

Comment 2 Raymond Mancy 2012-01-11 23:39:46 UTC
(In reply to comment #1)
> moving to 0.8.2.
> 
> rmancy,
>   Looking at this bit of code:
> 
>         for job_id in job_ids:
>             job = Job.by_id(job_id)
>             logs = job.get_log_dirs()
>             if logs:
>                 logs = cls._remove_descendants(logs)
>             yield (job, logs)
> 
> 
> Can we simply change _remove_descendants and replace it with the part that is
> the job id?
> 
> http://FQDN/beaker-logs/2011/12/1739/173922/364531/
> 
> to 
> 
> http://FQDN/beaker-logs/2011/12/1739/173922
> 
> This would still leave the intermediate directory "1739".
> 

That example would work on the archive server, but I don't think it would if the logs were still on the lab controller.

Moreover, I'm not sure it's wise to make assumptions about where on the filesystem the logs live when deleting them, as it could cause problems if that layout changes.

> To do anything more would mean making http calls to see if the directory is
> actually empty.   If its local we can do a rmdir command which will only delete
> empty dirs.

Comment 3 Bill Peck 2012-03-26 18:51:56 UTC
We still don't have a good solution for this.  Because of shared top-level directories, it's not simple for us to know if the directory is empty.

Can we make calls over WebDAV to determine if a directory is empty?

Comment 4 Raymond Mancy 2012-04-17 04:33:31 UTC
dcallagh has been giving me some hints about PROPFIND, so maybe we can use that.

Comment 6 Raymond Mancy 2012-07-26 06:07:40 UTC
We can use PROPFIND to find empty dirs, but I'm not quite sure of the best way to do it. All PROPFIND does is return the properties (type, href, ctime, etc.) of an entity given by a URL. For folders, you can also pass a depth level. Unfortunately, it won't tell us directly whether a folder is empty.
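
One possible way to infer emptiness from PROPFIND, sketched under assumptions: a request with "Depth: 1" returns a multistatus body with one response element for the collection itself plus one per direct member, so a body with a single response element suggests an empty folder. The function names below are hypothetical, the host and path are placeholders, and this assumes the server allows Depth: 1 requests and answers with status 207.

```python
import http.client
import xml.etree.ElementTree as ET

DAV_NS = "{DAV:}"  # XML namespace used by WebDAV elements


def fetch_listing(host, path):
    """Issue a Depth: 1 PROPFIND and return the raw multistatus body."""
    conn = http.client.HTTPConnection(host)
    try:
        # An empty request body is treated by WebDAV as an allprop request.
        conn.request("PROPFIND", path, headers={"Depth": "1"})
        resp = conn.getresponse()
        body = resp.read()
        if resp.status != 207:
            raise RuntimeError("unexpected PROPFIND status: %d" % resp.status)
        return body
    finally:
        conn.close()


def collection_is_empty(multistatus_xml):
    """Return True if a Depth: 1 listing contains no members.

    The listing always includes the collection itself, so at most one
    response element means there is nothing inside it.
    """
    root = ET.fromstring(multistatus_xml)
    responses = root.findall(DAV_NS + "response")
    return len(responses) <= 1
```

This still costs one HTTP round trip per candidate directory, which is part of why it is not as simple as a local rmdir.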

I can think of a couple of ways to achieve this, but none of them are simple. I'd suggest that the simplest way to achieve this is a cron job running 'find' or something similar.

Comment 8 Raymond Mancy 2012-07-31 22:07:12 UTC
I'd suggest running a cron job, perhaps with something like the following:

  find /var/www/html/beaker-logs -type d -empty -print | xargs rmdir
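
If a script is preferred over the one-liner, a rough Python equivalent might look like the sketch below. It is an illustration only: remove_empty_dirs is a hypothetical name and the log root is whatever path the cron job would be given. Two differences from the pipeline above are worth noting. It walks bottom-up, so a parent that becomes empty once its children are removed is cleaned up in the same run rather than the next one. It also never removes the root itself, and it is not affected by whitespace in directory names.

```python
import os


def remove_empty_dirs(root):
    """Remove every empty directory below `root`, deepest first.

    `root` itself is always kept. Returns the removed paths.
    """
    removed = []
    # topdown=False yields children before their parents.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        try:
            os.rmdir(dirpath)  # only succeeds when the directory is empty
        except OSError:
            continue
        removed.append(dirpath)
    return removed
```

A cron entry would then call this with the log directory, for example /var/www/html/beaker-logs as in the find command above.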