Bug 879583 - Monitoring : Platform plugin "Process" service reports wrong availability (SIGAR)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Agent
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHQ 4.6
Assignee: Thomas Segismont
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On:
Blocks: 879639
Reported: 2012-11-23 11:38 UTC by Thomas Segismont
Modified: 2013-09-03 14:41 UTC (History)

Fixed In Version:
Clone Of:
: 879639 (view as bug list)
Environment:
Last Closed: 2013-09-03 14:41:46 UTC
Embargoed:


Description Thomas Segismont 2012-11-23 11:38:37 UTC
Description of problem:
When monitoring a process, the reported availability may be wrong for an arbitrary number of availability checks.


Version-Release number of selected component (if applicable):
4.6.0-SNAPSHOT

How reproducible:
Always

Steps to Reproduce:
1. Start a test process on a monitored machine (e.g. LibreOffice).
2. For the monitored machine, import a new child resource in RHQ with type "Process" (e.g. with the PIQL query process|basename|match=^soffice.*).
3. When the process availability shown is "UP", close or kill the test process.
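
To illustrate which processes the PIQL query in step 2 is expected to select, here is a minimal sketch using plain java.util.regex. It assumes that "basename" means the last path segment of the executable and that "match" applies the regular expression to that name; the class and method names are hypothetical and are not part of RHQ or SIGAR.

```java
import java.util.regex.Pattern;

public class BasenameMatchSketch {

    // Regular expression taken from the PIQL query in the steps above.
    static final Pattern SOFFICE = Pattern.compile("^soffice.*");

    /** Returns the last path segment of an executable path (assumed meaning of "basename"). */
    static String basename(String path) {
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        return path.substring(slash + 1);
    }

    /** Returns true when the executable's base name matches the expression. */
    static boolean matches(String executablePath) {
        return SOFFICE.matcher(basename(executablePath)).matches();
    }

    public static void main(String[] args) {
        // The paths below are illustrative only.
        System.out.println(matches("/usr/lib/libreoffice/program/soffice.bin"));
        System.out.println(matches("/usr/bin/java"));
    }
}
```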

  
Actual results:
The availability status is still "UP" for a time longer than the availability check interval.

Expected results:
The availability status should be "DOWN" as soon as the availability check interval has elapsed.


Additional info:
In the ProcessInfo class, the method isRunning uses the SIGAR class ProcState. If the process has been killed or shut down, the instance of ProcState contains stale data.
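
The stale-data effect can be reproduced without SIGAR. The following sketch is an assumption-laden model, not the RHQ code: FakeProcessTable stands in for the operating system, and CachedProcessInfo stands in for an object that captures the process state once and answers isRunning from that cached copy. Both names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class StaleStateSketch {

    /** Hypothetical stand-in for the operating system's process table. */
    static class FakeProcessTable {
        private final Set<Long> livePids = new HashSet<>();

        void start(long pid) { livePids.add(pid); }

        void kill(long pid) { livePids.remove(pid); }

        boolean isAlive(long pid) { return livePids.contains(pid); }
    }

    /** Hypothetical stand-in for a process info object that caches its state. */
    static class CachedProcessInfo {
        private final FakeProcessTable table;
        private final long pid;
        private boolean cachedRunning;

        CachedProcessInfo(FakeProcessTable table, long pid) {
            this.table = table;
            this.pid = pid;
            refresh(); // state is captured once, at construction
        }

        /** Re-reads the state from the process table. */
        void refresh() { cachedRunning = table.isAlive(pid); }

        /** Answers from the cached state only; never queries the table. */
        boolean isRunning() { return cachedRunning; }
    }

    public static void main(String[] args) {
        FakeProcessTable table = new FakeProcessTable();
        table.start(42L);
        CachedProcessInfo info = new CachedProcessInfo(table, 42L);

        table.kill(42L);
        System.out.println("before refresh: " + info.isRunning());
        info.refresh();
        System.out.println("after refresh: " + info.isRunning());
    }
}
```

Until refresh is called, isRunning keeps answering from the state captured before the kill, which is the behaviour described above.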

Comment 1 Thomas Segismont 2012-11-23 13:30:39 UTC
In the ProcessComponent class, the ProcessInfo instance is refreshed each time a metric collection is made.

So after a metric collection, the next availability check has fresh data to process.

This could explain why, after some time, the closed/killed process is eventually reported "DOWN".
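
A small worked example of that timing, under assumed values: availability checks run at multiples of one interval, metric collections (the only refresh points) at multiples of another, and a collection that coincides with a check is assumed to run first. The intervals and the method name staleChecks are hypothetical and chosen only for illustration.

```java
public class StaleWindowSketch {

    /**
     * Counts the availability checks that run after the kill but before the
     * next metric collection refreshes the data. Those checks still see "UP".
     */
    static int staleChecks(long killTimeSec, long availIntervalSec, long metricIntervalSec) {
        // First metric collection at or after the kill: the refresh happens here.
        long nextRefresh = ((killTimeSec + metricIntervalSec - 1) / metricIntervalSec) * metricIntervalSec;
        // First availability check strictly after the kill.
        long firstCheck = (killTimeSec / availIntervalSec + 1) * availIntervalSec;

        int stale = 0;
        for (long t = firstCheck; t < nextRefresh; t += availIntervalSec) {
            stale++;
        }
        return stale;
    }

    public static void main(String[] args) {
        // Assumed intervals: availability every 60 s, metrics every 600 s.
        System.out.println(staleChecks(70, 60, 600));
        System.out.println(staleChecks(550, 60, 600));
        System.out.println(staleChecks(610, 60, 600));
    }
}
```

With these assumed intervals, a kill at 70 s leaves 8 stale checks, a kill at 550 s leaves none, and a kill at 610 s leaves 9, so the number of wrong reports depends on when the process dies relative to the metric schedule.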

Comment 2 Thomas Segismont 2012-11-26 10:06:46 UTC
The ProcessInfo instance is now refreshed on every availability check.

master 5c4217e

Comment 3 Lukas Krejci 2012-11-26 15:17:19 UTC
I think the fix is not enough for pid file based processes, because the pid file is never re-read until the agent / plugin container is restarted and the component re-started.

Comment 4 Thomas Segismont 2012-11-27 16:30:52 UTC
Lukas,

As discussed with you on IRC, the problem is not the discovery type of the process component.

The problem is that with the first fix, once the refresh has happened, it is too late to tell whether the process has not yet been restarted.

So I reworked the fix.

master 2ec8d54
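
The commit itself is not reproduced in this report. As a rough illustration of the restart concern only, one possible shape for an availability check is to read fresh state for the tracked process and, if it is gone, look the process up again in case it came back under a new PID. Everything below (FakeProcessTable, ProcessMonitor, findByName) is hypothetical and is not taken from the RHQ sources.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class RestartAwareCheckSketch {

    /** Hypothetical process table mapping PID to process base name. */
    static class FakeProcessTable {
        private final Map<Long, String> processes = new HashMap<>();

        void start(long pid, String name) { processes.put(pid, name); }

        void kill(long pid) { processes.remove(pid); }

        boolean isAlive(long pid) { return processes.containsKey(pid); }

        Optional<Long> findByName(String name) {
            return processes.entrySet().stream()
                .filter(e -> e.getValue().equals(name))
                .map(Map.Entry::getKey)
                .findFirst();
        }
    }

    /** Hypothetical component that tracks one process by name. */
    static class ProcessMonitor {
        private final FakeProcessTable table;
        private final String name;
        private Long trackedPid; // null when no matching process is known

        ProcessMonitor(FakeProcessTable table, String name) {
            this.table = table;
            this.name = name;
            this.trackedPid = table.findByName(name).orElse(null);
        }

        /** Returns true for "UP" and false for "DOWN". */
        boolean isAvailable() {
            // Read fresh state for the tracked PID first.
            if (trackedPid != null && table.isAlive(trackedPid)) {
                return true;
            }
            // The tracked process is gone: look it up again in case it was restarted.
            trackedPid = table.findByName(name).orElse(null);
            return trackedPid != null;
        }

        Long trackedPid() { return trackedPid; }
    }
}
```

In this sketch a killed process is reported as down on the very next check, and a process restarted under a new PID is picked up again without restarting the monitor.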

Comment 5 Lukas Krejci 2012-11-27 17:01:33 UTC
This looks correct to me.

Comment 6 Heiko W. Rupp 2013-09-03 14:41:46 UTC
Bulk closing of issues in old RHQ releases that are in production for a while now.

Please open a new issue when running into an issue.

