Bug 789454 - Drift detection fails for whole directory if special-file is present
Summary: Drift detection fails for whole directory if special-file is present
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: drift
Version: 4.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Target Release: JON 3.0.1
Assignee: Jay Shaughnessy
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 728579 790598
 
Reported: 2012-02-10 19:47 UTC by Heiko W. Rupp
Modified: 2013-09-03 15:05 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 790598 (view as bug list)
Environment:
Last Closed: 2013-09-03 15:05:25 UTC
Embargoed:


Attachments
Agent log documenting failed verification (2.08 MB, text/x-log)
2012-02-23 16:10 UTC, Mike Foley

Description Heiko W. Rupp 2012-02-10 19:47:39 UTC
DriftDetection fails if the directory to be monitored contains, for example, a Unix domain socket (see stack trace below).

org.rhq.core.pc.drift.DriftDetector#doDirectoryScan

basically does:

        for (File dir : getScanDirectories(basedir, includes)) {
            forEachFile(dir, new FilterFileVisitor(basedir, includes, excludes, new FileVisitor() {
                @Override
                public void visit(File file) {
                    try {
                        // ... work_on_file ...   <<--- fails for socket, see below
                    } catch (IOException e) {
                        throw new DriftDetectionException(
                            "An error occurred while generating a coverage change set for " + schedule, e);
                    }
                }
            }));
        }

So if the work_on_file part throws an IOException, as it does for a socket, the catch block rethrows it as a DriftDetectionException. That exception propagates out of the whole set of loops, so all files that are still to be processed are ignored.
Because the postgres socket name starts with a dot, it sorts early in the list, so it breaks drift detection for the directory completely.

While /tmp may be a corner case for drift detection, Unix domain sockets (and other kinds of special files) can show up in any directory.
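
The failure mode described above can be avoided by handling the exception per file instead of letting it escape the loop. The following is only a minimal sketch of that idea, not the RHQ code: the class ResilientScan, its Result type, and the countBytes stand-in for the real digest step are hypothetical names introduced for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not RHQ code: one unreadable file must not abort the scan. */
final class ResilientScan {

    /** Outcome of a scan: files that were read, and files that were skipped. */
    static final class Result {
        final List<File> processed = new ArrayList<File>();
        final List<File> skipped = new ArrayList<File>();
    }

    /** Stand-in for the per-file work (the real code computes a digest). */
    static long countBytes(File file) throws IOException {
        // The constructor throws FileNotFoundException for missing files
        // and, as the stack trace in this report shows, for sockets.
        InputStream in = new FileInputStream(file);
        try {
            long total = 0;
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
            return total;
        } finally {
            in.close();
        }
    }

    /** Visits every file; a failure on one file is recorded and the loop continues. */
    static Result scan(List<File> files) {
        Result result = new Result();
        for (File file : files) {
            try {
                countBytes(file);
                result.processed.add(file);
            } catch (IOException e) {
                // Record and continue instead of rethrowing out of the loop.
                result.skipped.add(file);
            }
        }
        return result;
    }
}
```

With this shape, a problematic entry that sorts first (such as a dot-file socket) only lands in the skipped list, and the remaining files are still processed.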

2012-02-10 20:36:59,963 ERROR [pool-3-thread-1] (rhq.core.pc.drift.DriftDetector)- Drift detection failed: An error occurred while generating a coverage change set for DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10001, driftDefinitionName: fosdem]
org.rhq.core.pc.drift.DriftDetectionException: An error occurred while generating a coverage change set for DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10001, driftDefinitionName: fosdem]
	at org.rhq.core.pc.drift.DriftDetector$2.visit(DriftDetector.java:667)
	at org.rhq.core.pc.drift.FilterFileVisitor.visit(FilterFileVisitor.java:136)
	at org.rhq.core.util.file.FileUtil.forEachFile(FileUtil.java:434)
	at org.rhq.core.pc.drift.DriftDetector.doDirectoryScan(DriftDetector.java:649)
	at org.rhq.core.pc.drift.DriftDetector.generateSnapshot(DriftDetector.java:627)
	at org.rhq.core.pc.drift.DriftDetector.run(DriftDetector.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.FileNotFoundException: /tmp/.s.PGSQL.5432 (Operation not supported on socket)
	at java.io.FileInputStream.open(Native Method)

Comment 1 Mike Foley 2012-02-10 19:57:29 UTC
when verifying ... please also try edge cases such as:

links
named pipes
sockets
block devices

see section 3-1 in this document:
http://tldp.org/LDP/intro-linux/html/sect_03_01.html

Comment 2 Mike Foley 2012-02-13 16:51:33 UTC
when verifying, please also re-verify the file permission test cases, particularly files which the agent does not have permission for

Comment 3 Heiko W. Rupp 2012-02-14 16:17:12 UTC
missing lines of the stack trace

Caused by: java.io.FileNotFoundException: /tmp/.s.PGSQL.5432 (Operation not supported on socket)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:120)
	at org.rhq.core.util.MessageDigestGenerator.calcDigest(MessageDigestGenerator.java:255)
	at org.rhq.core.util.MessageDigestGenerator.calcDigestString(MessageDigestGenerator.java:291)
	at org.rhq.core.pc.drift.DriftDetector.sha256(DriftDetector.java:689)
	at org.rhq.core.pc.drift.DriftDetector.access$200(DriftDetector.java:57)
	at org.rhq.core.pc.drift.DriftDetector$2.visit(DriftDetector.java:662)

Comment 4 Jay Shaughnessy 2012-02-14 22:08:31 UTC
release/3.0.x commit: f55c68ee2db419cbbe2daef2f7721a7c68279039

 File.canRead() returning true means that the process has read access to
 a file; it is a security check. It does not mean that the file can be
 parsed as a Stream.  In particular, FileInputStream(File) can throw
 FileNotFoundException on a file it can't actually handle, like a
 socket file.

 I added more protection for this situation, for files on which drift
 monitoring cannot and should not be performed.  They will be skipped (and
 logged at debug level).  Moreover, taking Heiko's suggestion, don't fail
 drift detection due to a problematic file, even if the problem is unexpected.
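
 The distinction drawn above between "readable" and "can be streamed" can be sketched as a guard in front of the digest step. This is an illustration only and not the content of the commit: GuardedDigest, isHashable and sha256OrNull are hypothetical names, and the sketch assumes that File.isFile() reports false for sockets and other special files on the platform in use (it is documented only as testing for a "normal file").

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch, not the RHQ fix: skip files that cannot be hashed. */
final class GuardedDigest {

    /** canRead() alone is only a permission check, so also require a normal file. */
    static boolean isHashable(File file) {
        return file.isFile() && file.canRead();
    }

    /** Returns the hex SHA-256 of the file, or null if the file was skipped. */
    static String sha256OrNull(File file) {
        if (!isHashable(file)) {
            return null; // directory, missing file, or (assumed) special file
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            InputStream in = new FileInputStream(file);
            try {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    md.update(buf, 0, n);
                }
            } finally {
                in.close();
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            // Unexpected problem with this one file: skip it, do not fail the run.
            return null;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

 A caller would treat a null result as "skip and log", which matches the skip-and-continue behaviour described in this comment.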

Comment 5 Simeon Pinder 2012-02-17 05:33:53 UTC
Moving to ON_QA for testing with JON 3.0.1.GA RC5 or better:
https://brewweb.devel.redhat.com//buildinfo?buildID=199114

Comment 6 Mike Foley 2012-02-23 16:09:57 UTC
I am marking this failed to verify in JON 3.0.1 RC5.

I created a Drift configuration named "mytmp" on my /tmp directory, which contained both special files and files without permissions.  Snapshot #0 is never created.

Attached is the agent log file documenting this test.  DEBUG was turned on.

Notice:
1)  at 10:47 ... Drift Detection is requested.

2012-02-23 10:47:19,117 DEBUG [WorkerThread#0[192.168.0.111:56452]] (rhq.core.pc.drift.DriftManager)- DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10012, driftDefinitionName: mytmp] has been added to ScheduleQueue[DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10012, driftDefinitionName: mytmp], DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10001, driftDefinitionName: mydrift]]

2) at 10:48 ...Drift Detection has begun

2012-02-23 10:48:14,847 DEBUG [pool-3-thread-1] (rhq.core.pc.drift.DriftDetector)- Adding /tmp/imageio8311409316851006636.tmp to coverage change set for DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 0, driftDefinitionName: mytmp]

3) at 10:48 ...Drift Detection correctly skips over some files 

2012-02-23 10:48:18,293 DEBUG [pool-3-thread-1] (rhq.core.pc.drift.DriftDetector)- Skipping /tmp/orbit-mfoley/linc-9ffa-0-662d6c6de3691 since it is missing or is not a physically readable file.

4) ...by 10:49 ...Drift Detection has stopped 

[Nothing in logs relating to Drift]

[The UI does not show Snapshot #0 ]


5)  at 10:58 (which is 10 minutes later) ... I request Drift Detection again 

2012-02-23 10:58:25,542 DEBUG [WorkerThread#0[192.168.0.111:59770]] (rhq.core.pc.drift.DriftManager)- DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10012, driftDefinitionName: mytmp] has been added to ScheduleQueue[DriftDetectionSchedule[resourceId: 10001, driftDefinitionId: 10012,

6) at 10:58+ ... there is no more Drift Detection activity in the agent log
7) at 10:58+ ... the UI is not showing me snapshot #0 
8) at 10:59 .... I create a file in /tmp


echo hello > mynewfile3.txt


9) at 11:07 .... nothing more in the agent logs, still no snapshot 0

Comment 7 Mike Foley 2012-02-23 16:10:42 UTC
Created attachment 565326 [details]
Agent log documenting failed verification

Comment 8 Mike Foley 2012-02-23 16:25:25 UTC
just adding onto comment #6 ...

30 minutes later ... still no snapshot #0 in the UI.  no more drift activity in agent logs.  drift is blocked or hung.

Comment 9 Mike Foley 2012-02-27 18:42:57 UTC
observed by jay

analysis...  root cause is special file permissions ... 'prw' ...fifo ...or named pipe 

esoteric edge case ... but it does hang drift if it occurs.

[root@foleymonsterbox1 icedteaplugin-mfoley]# ls -al
<mfoley> prw-------.  1 mfoley mfoley     0 Jul 12  2011 24966-icedteanp-appletviewer-to-plugin
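
A named pipe like the one listed above can be told apart from a regular file without opening it, which matters because opening a FIFO for reading can block until a writer appears. The sketch below is one possible way to do that and is not taken from RHQ: it assumes Java 7 or later (java.nio.file), which may not match the JVM the agent ran on, and FileKind and classify are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical sketch: classify a path from its attributes, never opening it. */
final class FileKind {

    /** Returns "symlink", "regular", "directory" or "other". */
    static String classify(Path path) throws IOException {
        // NOFOLLOW_LINKS reports on the link itself rather than its target.
        BasicFileAttributes attrs =
            Files.readAttributes(path, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        if (attrs.isSymbolicLink()) {
            return "symlink";
        }
        if (attrs.isRegularFile()) {
            return "regular";
        }
        if (attrs.isDirectory()) {
            return "directory";
        }
        // Neither regular, directory nor symlink: named pipes, sockets
        // and device files are expected to end up here.
        return "other";
    }
}
```

A scanner built on this would only open paths classified as "regular" and would skip and log everything else.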

Comment 10 Jay Shaughnessy 2012-02-27 19:36:20 UTC
I found that the problem here is not related to the files being handled
by this BZ.  A new BZ will be written up about named pipe files.

As for this, I created a drift def using mfoley's system, which included
unreadable (non pipe) files as well as regular text files and my test
passed fine.  I used filters to avoid the pipe files.

I recommend this be moved to VERIFIED.

Comment 11 Mike Foley 2012-02-27 19:39:45 UTC
added https://bugzilla.redhat.com/show_bug.cgi?id=798006 per comment #10.

Comment 12 Mike Foley 2012-02-27 19:57:41 UTC
agree with comment #10.

additionally, i expressly verified that drift definitions do not hang on socket files (the original issue).

Comment 13 Heiko W. Rupp 2013-09-03 15:05:25 UTC
Bulk closing of old issues in VERIFIED state.

