Bug 752981 - [Platform Plug-in] FileSystemComponent returns invalid diskQueue metric due to calling refresh method of FileSystemInfo twice when gathering metrics
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Plugins
Version: 4.2
Hardware: Unspecified Unspecified
Priority: high Severity: high
Target Milestone: ---
Target Release: RHQ 4.3.0
Assigned To: Charles Crouch
QA Contact: Mike Foley
Depends On:
Blocks: 753904 783879
Reported: 2011-11-10 17:33 EST by Larry O'Leary
Modified: 2015-02-01 18:27 EST (History)

See Also:
Fixed In Version: 4.3
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 769936 783879
Environment:
Last Closed: 2013-09-01 06:16:23 EDT


Attachments
image of verification (fail to verify) (35.33 KB, image/png)
2012-02-03 16:57 EST, Mike Foley

Description Larry O'Leary 2011-11-10 17:33:14 EST
When void org.rhq.plugins.platform.FileSystemComponent.getValues(MeasurementReport report, Set<MeasurementScheduleRequest> metrics) throws Exception is executed, it first creates an instance of FileSystemInfo on line 73[1] and then calls the FileSystemInfo refresh method on line 74. The call on line 74 duplicates the refresh already performed when the FileSystemInfo object is created, because the constructor itself calls refresh[2].


Although this is minor, it does have a performance impact when many metrics are enabled, collected frequently, and there are multiple file systems on the platform being monitored. 

I am not certain which of the two calls should remain, but the duplicate should probably be cleaned up.


[1]: http://git.fedorahosted.org/git?p=rhq/rhq.git;a=blob;f=modules/plugins/platform/src/main/java/org/rhq/plugins/platform/FileSystemComponent.java;h=9d68c239f3bd981cfeee41f6b40e9ca5b28c6b4c;hb=HEAD#l73
[2]: http://git.fedorahosted.org/git?p=rhq/rhq.git;a=blob;f=modules/core/native-system/src/main/java/org/rhq/core/system/FileSystemInfo.java;h=2bde8ee772e6e1cb9af65b75dc5f4713c7c38c52;hb=HEAD#l46
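
For illustration only, the pattern described above can be sketched with a simplified stand-in class. InfoSketch and both collect methods are hypothetical names invented for this sketch; they are not the real FileSystemInfo or FileSystemComponent code, and the counter merely stands in for the call into the native library. The second method shows one possible way to avoid the duplicate, not necessarily the change that was committed.

```java
public class DoubleRefreshSketch {

    /** Hypothetical stand-in for FileSystemInfo; NOT the real RHQ class. */
    static class InfoSketch {
        private int refreshCount = 0;

        InfoSketch() {
            refresh(); // the constructor already gathers fresh data
        }

        void refresh() {
            refreshCount++; // the real class would query the native library here
        }

        int getRefreshCount() {
            return refreshCount;
        }
    }

    /** Pattern described in the report: construct, then refresh again. */
    static int collectWithExplicitRefresh() {
        InfoSketch info = new InfoSketch();
        info.refresh(); // duplicate of the refresh done in the constructor
        return info.getRefreshCount();
    }

    /** Same collection, relying only on the refresh done by the constructor. */
    static int collectWithConstructorRefreshOnly() {
        InfoSketch info = new InfoSketch();
        return info.getRefreshCount();
    }

    public static void main(String[] args) {
        System.out.println("explicit refresh: " + collectWithExplicitRefresh() + " refreshes");
        System.out.println("constructor only: " + collectWithConstructorRefreshOnly() + " refresh");
    }
}
```

With the explicit call, every metric collection gathers the data twice per file system; without it, once.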
Comment 1 Ian Springer 2011-11-11 10:43:53 EST
Nice find.

Fixed in master - commit 68be8c6:

http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=68be8c6
Comment 2 Larry O'Leary 2011-11-11 10:54:24 EST
Upon further investigation, the multiple executions actually result in invalid data coming back from the native library. The statistics that are gathered are time-based with a resolution of 1 second, and some of the values are calculated relative to the last "gather" time. If the refresh method is called twice in succession with less than a second between the calls, the disk stats end up being calculated as:

 0.0 / 0 = NaN
 (any value > 0.0) / 0 = Infinity

Which of the two results occurs depends on the execution time.

Disk Queue test with delay of 0 returned 100 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 1 returned 100 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 5 returned 68 NaN results and 32 infinity results out of a total tests of 100
Disk Queue test with delay of 10 returned 70 NaN results and 29 infinity results out of a total tests of 100
Disk Queue test with delay of 50 returned 64 NaN results and 31 infinity results out of a total tests of 100
Disk Queue test with delay of 100 returned 51 NaN results and 39 infinity results out of a total tests of 100
Disk Queue test with delay of 500 returned 34 NaN results and 16 infinity results out of a total tests of 100
Disk Queue test with delay of 1000 returned 0 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 1001 returned 0 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 1010 returned 0 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 1050 returned 0 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 1100 returned 0 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 1500 returned 0 NaN results and 0 infinity results out of a total tests of 100
Disk Queue test with delay of 2000 returned 0 NaN results and 0 infinity results out of a total tests of 100
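
The arithmetic behind these results can be reproduced in plain Java. This is only a sketch of the failure mode: the real calculation happens inside the native library, and ratePerSecond and isReportable are hypothetical helpers written for this example. It assumes the elapsed time is truncated to whole seconds, so two gathers inside the same second give an elapsed time of 0.

```java
public class DiskQueueArithmetic {

    /**
     * Hypothetical rate calculation: change in a counter divided by the
     * elapsed time, in whole seconds, since the previous gather.
     */
    static double ratePerSecond(double counterDelta, long elapsedWholeSeconds) {
        // Floating-point division by zero does not throw in Java;
        // it yields NaN or Infinity, following IEEE 754.
        return counterDelta / elapsedWholeSeconds;
    }

    /** True when the value is a finite number that is safe to report. */
    static boolean isReportable(double value) {
        return !Double.isNaN(value) && !Double.isInfinite(value);
    }

    public static void main(String[] args) {
        double idle = ratePerSecond(0.0, 0); // no change, no elapsed time
        double busy = ratePerSecond(3.0, 0); // some change, no elapsed time
        double ok = ratePerSecond(3.0, 2);   // some change over two seconds

        System.out.println(idle + " reportable=" + isReportable(idle)); // NaN reportable=false
        System.out.println(busy + " reportable=" + isReportable(busy)); // Infinity reportable=false
        System.out.println(ok + " reportable=" + isReportable(ok));     // 1.5 reportable=true
    }
}
```

A check such as isReportable only hides the symptom. The underlying cause is the second refresh arriving within the same one-second window as the first.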
Comment 3 Larry O'Leary 2011-11-14 14:25:39 EST
Committed to release-3.0.1 as 96b03ecc9c77bd80e72792fd996b3a0ad6592229 - http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=96b03ecc9c77bd80e72792fd996b3a0ad6592229
Comment 4 Lukas Krejci 2011-12-21 15:41:03 EST
Larry, would you be able to provide repro steps for this issue?
Comment 5 Larry O'Leary 2011-12-21 23:43:19 EST
1. From a Linux Platform resource, expand and select File System -> /boot
2. Select the Monitor > Tables sub-tab
3. The Last value for Disk Queue should contain a number

Actual Result:
<no value / i.e. it is blank>

Expected Result:
0 or a positive number

Please note that by default this metric is collected once every 20 minutes. So, if you are testing a new installation, it is best to reduce the collection interval for this metric to 1 minute so you do not have as long to wait.
Comment 6 Simeon Pinder 2012-02-02 22:16:56 EST
Available to test in 3.0.1.GA RC2 available here:
https://brewweb.devel.redhat.com//buildinfo?buildID=197202
Comment 7 Mike Foley 2012-02-03 16:57:29 EST
Created attachment 559375 [details]
image of verification (fail to verify)

Image documenting my observations. Failed to verify at step #3 of the repro steps.
Comment 8 Mike Foley 2012-02-03 16:58:22 EST
Failed to verify. The attached image documents the failure at step #3 of the repro steps: I don't see Disk Queue.
Comment 9 Charles Crouch 2012-02-12 21:52:40 EST
Setting back to ON-QA for the master branch and RHQ4.3 based on https://bugzilla.redhat.com/show_bug.cgi?id=783879#c8
Comment 11 Heiko W. Rupp 2013-09-01 06:16:23 EDT
Bulk closing of items that are on_qa and in old RHQ releases, which are out for a long time and where the issue has not been re-opened since.
