Bug 786613 - Excessive file scanning in drift detection when using includes filters
Summary: Excessive file scanning in drift detection when using includes filters
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: drift
Version: 4.2
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: JON 3.0.1
Assignee: Jay Shaughnessy
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On: 760289
Blocks: jon30-sprint10, rhq43-sprint10, jon310-sprint11, rhq44-sprint11
 
Reported: 2012-02-01 22:13 UTC by Charles Crouch
Modified: 2015-02-01 23:27 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 760289
Environment:
Last Closed: 2013-09-03 15:11:02 UTC
Embargoed:



Description Charles Crouch 2012-02-01 22:13:19 UTC
+++ This bug was initially created as a clone of Bug #760289 +++

I may be wrong, I'm still looking at this, but it looks to me like
the drift detector processes all files, recursively, under the base
directory, looking for files that match the filters.  That doesn't mean
we digest them all but it does grab each file in order to see whether
it matches the filters.

This means that if you use a broad base directory, like c:/ on Windows, and
your includes filters are subdir1 and subdir2, we'll actually scan the
entire file system looking for files. It should, I would think, only look in
the subdir1 and subdir2 directories, recursively.
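
To make the suspected behavior concrete, here is a minimal Python sketch. It
is not RHQ's drift detector code; the function names and the tiny directory
tree are hypothetical. It walks everything under the base directory and only
then applies the includes filters, so the number of files visited depends on
the size of the base directory, not on the size of the included subdirs.

```python
import os
import tempfile


def make_tree(base, rel_paths):
    # Create empty files at the given '/'-separated relative paths.
    for rel in rel_paths:
        path = os.path.join(base, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


def scan_then_filter(base_dir, include_dirs):
    # Hypothetical model of the reported behavior: visit every file under
    # base_dir and test each one against the includes afterwards.
    includes = [os.path.normpath(d) for d in include_dirs]
    matched, visited = [], 0
    for root, _dirs, files in os.walk(base_dir):
        for name in files:
            visited += 1
            rel = os.path.relpath(os.path.join(root, name), base_dir)
            if any(rel.startswith(inc + os.sep) for inc in includes):
                matched.append(rel)
    return sorted(matched), visited


with tempfile.TemporaryDirectory() as base:
    make_tree(base, ["subdir1/a.txt", "subdir2/b.txt",
                     "other/c.txt", "other/deep/d.txt"])
    matched, visited = scan_then_filter(base, ["subdir1", "subdir2"])

# Only 2 files match, but all 4 files in the tree were visited.
print(len(matched), "matched,", visited, "visited")
```

With a base directory like the file system root, the visited count is every
file on the machine, which is the cost described above.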

This can most likely hang up an agent.  Researching more now...

--- Additional comment from jshaughn on 2011-12-05 14:19:26 EST ---


Still looking at this but assuming it is as described above, the workaround
would be to delete the offending definition and create multiple more
specific definitions.  So, instead of:

Dir
  filterSubDir1
  filterSubDir2

use two defs not using includes filters:

  Dir/subDir1
  Dir/subDir2
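
As an illustration of the workaround only, the rewrite from one filtered
definition to several narrower ones can be sketched in Python. The dict
layout below is a hypothetical stand-in, not RHQ's drift definition format.

```python
import posixpath


def split_definition(base_dir, include_paths):
    # One narrower definition per includes path, each with no includes
    # filters of its own. The dict keys are illustrative assumptions.
    return [
        {"basedir": posixpath.join(base_dir, inc), "includes": []}
        for inc in include_paths
    ]


defs = split_definition("Dir", ["subDir1", "subDir2"])
for d in defs:
    print(d["basedir"])
```

Each resulting definition scans only its own base directory, so nothing
outside Dir/subDir1 and Dir/subDir2 is visited.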


Note that you can always perform a pattern-based filter on the basedir
by using an includes filter with "." as the path.

--- Additional comment from jshaughn on 2011-12-05 16:05:02 EST ---


master commit 6f3d99d160c4910bffe16ada89b625fe251bea44

Now, when includes file paths are used, directory scanning is limited to
only those included directories.

Note that using a "." as an includes path basically translates to using
the base directory, in which case the scan will be as it is now.
A future enhancement may be to analyze the pattern and decide
whether a recursive scan is necessary.  Currently, using includes
patterns to just look for certain files in the base directory will
expose you to the full scan.
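
The idea behind the change can be sketched in Python as follows. This is an
illustration under stated assumptions, not the code from the commit: the
function names and the sample tree are hypothetical. The walk starts at each
included directory rather than at the base directory, and an includes path of
"." resolves to the base directory itself, so it still triggers the full scan.

```python
import os
import tempfile


def make_tree(base, rel_paths):
    # Create empty files at the given '/'-separated relative paths.
    for rel in rel_paths:
        path = os.path.join(base, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


def scan_included_dirs(base_dir, include_paths):
    # Walk only base_dir/<include path> for each includes path.
    # "." normalizes to base_dir, which means a full recursive scan.
    found, visited = set(), 0
    for inc in include_paths:
        start = os.path.normpath(os.path.join(base_dir, inc))
        for root, _dirs, files in os.walk(start):
            for name in files:
                visited += 1
                found.add(os.path.relpath(os.path.join(root, name), base_dir))
    return sorted(found), visited


with tempfile.TemporaryDirectory() as base:
    make_tree(base, ["subdir1/a.txt", "subdir2/b.txt",
                     "other/c.txt", "other/deep/d.txt"])
    narrow_files, narrow_visited = scan_included_dirs(base, ["subdir1", "subdir2"])
    dot_files, dot_visited = scan_included_dirs(base, ["."])

print("subdir includes visited:", narrow_visited)
print('"." include visited:', dot_visited)
```

In this sketch the two subdir includes visit 2 files, while the "." include
visits all 4, which mirrors the caveat about pattern-only filters on the base
directory.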


Test Notes:
This is not obvious to test as it's mainly a performance fix. But,
prior to the fix, creating a drift definition with a basedir of the
file system root, with an includes subdir, would take a very long
time to complete for a sizeable file system, with a lot of
disk/cpu activity.  It should now complete very quickly, assuming the
included subdir has a reasonable number of total files.

--- Additional comment from mfoley on 2011-12-12 13:15:00 EST ---

verified by testing positive use-case around filters.  RHQ 3.  master

Comment 3 Simeon Pinder 2012-02-09 19:55:31 UTC
Moving this to ON_QA as a new RC3 binary is available here:
https://brewweb.devel.redhat.com//buildinfo?buildID=198086

Comment 4 Mike Foley 2012-02-09 20:43:22 UTC
verified as follows:
1) smoke test of drift
2) Test Notes:
This is not obvious to test as it's mainly a performance fix. But,
prior to the fix, creating a drift definition with a basedir of the
file system root, with an includes subdir, would take a very long
time to complete for a sizeable file system, with a lot of
disk/cpu activity.  It should now complete very quickly, assuming the
included subdir has a reasonable number of total files.

Comment 5 Heiko W. Rupp 2013-09-03 15:11:02 UTC
Bulk closing of old issues in VERIFIED state.

