Bug 756100 - RFE: use timestamp and file size during drift detection scans
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: drift
Version: 4.2
Hardware: Unspecified OS: Unspecified
Priority: high Severity: urgent
Target Milestone: ---
Target Release: RHQ 4.3.0
Assigned To: Jay Shaughnessy
QA Contact: Mike Foley
Keywords: FutureFeature, Improvement
Depends On:
Blocks: 707225 jon30-sprint10/rhq43-sprint10
Reported: 2011-11-22 12:14 EST by John Sanda
Modified: 2013-08-31 06:16 EDT
Fixed In Version: 4.3
Doc Type: Enhancement
Last Closed: 2013-08-31 06:16:35 EDT
Description John Sanda 2011-11-22 12:14:06 EST
Description of problem:
Currently, drift detection always generates SHAs of files and compares them against the SHAs recorded in a snapshot file. Calculating a digest is a CPU-intensive operation, and the time required depends on file size and density. Always doing the digest calculation could significantly increase the agent's footprint in terms of both CPU utilization and IO overhead.

One fairly safe way to avoid always doing the digest calculation while still detecting changes is to look at file timestamps and sizes. If neither a file's timestamp nor its size has changed, we can skip recalculating its digest.

We still need to calculate and store SHAs when generating the initial snapshot, or any time the file timestamp or size is not otherwise available. This enhancement could significantly reduce the time for drift detection runs and the overall agent footprint.
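
A minimal sketch of the check described above, for illustration only. The class, field, and method names are hypothetical and are not taken from the RHQ code base; SHA-256 is assumed as the digest, and -1 is assumed as the marker for an unknown timestamp or size. Removed or unreadable files are not handled here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: class, field, and method names are illustrative, not RHQ's.
final class DriftCheckSketch {

    /** One recorded snapshot entry; lastModified and size are -1 when unknown. */
    static final class Entry {
        final String sha256;
        final long lastModified; // epoch millis
        final long size;         // bytes

        Entry(String sha256, long lastModified, long size) {
            this.sha256 = sha256;
            this.lastModified = lastModified;
            this.size = size;
        }

        boolean hasMetadata() {
            return lastModified >= 0 && size >= 0;
        }
    }

    /** Streams the file through SHA-256 (assumed digest) and returns lowercase hex. */
    static String sha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Cheap check: true only when recorded timestamp and size both match the file. */
    static boolean canSkipDigest(Entry recorded, Path file) throws IOException {
        return recorded.hasMetadata()
            && Files.getLastModifiedTime(file).toMillis() == recorded.lastModified
            && Files.size(file) == recorded.size;
    }

    /** Returns true when the file's content no longer matches the recorded entry. */
    static boolean hasDrifted(Entry recorded, Path file) throws IOException {
        if (canSkipDigest(recorded, file)) {
            return false; // timestamp and size unchanged: assume content unchanged
        }
        // Metadata missing or changed: fall back to the full digest comparison.
        return !sha256(file).equals(recorded.sha256);
    }
}
```

The trade-off in this sketch is the one noted above: a file rewritten with the same size and the same timestamp would go undetected, which is why the approach is "fairly safe" rather than exact.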

Comment 1 Jay Shaughnessy 2011-12-15 14:07:58 EST
master commit e60b0694356357d6c73b04128606afe6d75e3041

Now using timestamp and filesize info to avoid SHA digest
generation when possible.  Jsanda added most of the support, working
the new info into the changeset files. Jshaughn added support for 
handling situations where the current changeset is supplied by the
server, due to pinning or agent sync. When supplied by the server no
timestamp info is available, so the non-timestamped changesets
must be replaced with timestamped versions as soon as possible.
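
One way the replacement of non-timestamped entries could look, as a rough sketch only and not the code from the commit. All names here are hypothetical, SHA-256 is assumed as the digest, and -1 is assumed as the "unknown" marker. The idea is that an entry only receives a timestamp and size after its file's digest has been confirmed against the server-supplied SHA, so that later scans do not skip a file that has already drifted.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and the -1 "unknown" marker are assumptions, not RHQ's.
final class ChangeSetBackfillSketch {
    static final long UNKNOWN = -1L;

    static final class Entry {
        final String path;       // relative to the base directory
        final String sha256;
        final long lastModified; // epoch millis, or UNKNOWN
        final long size;         // bytes, or UNKNOWN

        Entry(String path, String sha256, long lastModified, long size) {
            this.path = path;
            this.sha256 = sha256;
            this.lastModified = lastModified;
            this.size = size;
        }

        boolean hasMetadata() {
            return lastModified >= 0 && size >= 0;
        }
    }

    /** Hashes the whole file in memory; acceptable for a sketch, not for large files. */
    static String sha256(Path file) throws IOException {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
            return String.format("%064x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a copy of the entries with timestamp and size added where that is safe. */
    static List<Entry> backfill(List<Entry> serverEntries, Path baseDir) throws IOException {
        List<Entry> result = new ArrayList<>();
        for (Entry e : serverEntries) {
            Path file = baseDir.resolve(e.path);
            if (e.hasMetadata() || !Files.isReadable(file)) {
                result.add(e); // already timestamped, or missing/unreadable: leave as-is
                continue;
            }
            // Read metadata before hashing so a write during hashing shows up on the next scan.
            long mtime = Files.getLastModifiedTime(file).toMillis();
            long size = Files.size(file);
            if (sha256(file).equals(e.sha256)) {
                result.add(new Entry(e.path, e.sha256, mtime, size));
            } else {
                result.add(e); // content differs: keep untimestamped so drift is still reported
            }
        }
        return result;
    }
}
```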

Test Notes
Although the benefit is performance oriented and not testable in
a standard fashion, there are still many scenarios that can be exercised
to ensure that the added support does not cause problems in the
various workflows.  In particular, exercise non-pinned definitions and
multiple snapshots with all varieties of drift, including, if the OS
allows it, making monitored files non-readable by the agent process.
(These files should be treated like removed files but follow some different
internal code paths.) Also exercise pinned defs, moving them in and out
of compliance with all types of drift.  Finally, start the agent with
--clean (or --purgedata) so that drift sync must be executed.
Comment 2 Mike Foley 2011-12-16 09:25:26 EST
Documenting verification with Drift TCMS test case execution runs 

https://tcms.engineering.redhat.com/plan/4174/#testruns
Comment 3 Heiko W. Rupp 2013-08-31 06:16:35 EDT
Bulk close of old bugs in VERIFIED state.
