Bug 879639

Summary: Monitoring : process availability gives wrong status (SIGAR)
Product: [JBoss] JBoss Operations Network
Reporter: Thomas Segismont <tsegismo>
Component: Agent
Assignee: Thomas Segismont <tsegismo>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Priority: unspecified
Version: JON 3.1.1
CC: ahovsepy, hrupp, myarboro
Target Milestone: ER01
Target Release: JON 3.2.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Clone Of: 879583
Type: Bug
Bug Depends On: 879583
Attachments:
active.png (flags: none)
inactive.png (flags: none)

Description Thomas Segismont 2012-11-23 14:25:52 UTC
+++ This bug was initially created as a clone of Bug #879583 +++

Description of problem:
When monitoring a process, the reported availability status may be wrong for an arbitrary number of availability checks.


Version-Release number of selected component (if applicable):
4.6.0-SNAPSHOT

How reproducible:
Always

Steps to Reproduce:
1. Start a test process on a monitored machine (e.g. LibreOffice).
2. For the monitored machine, import a new child resource of type process in RHQ (e.g. with the PIQL query process|basename|match=^soffice.*).
3. Once the process availability is shown as "UP", close or kill the test process.
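The PIQL query in step 2 selects processes by basename. As a rough illustration of what that condition selects, here is a minimal sketch that splits the query into its three parts and applies the regular expression to a candidate basename. The class PiqlBasenameSketch and its method are hypothetical and are not part of RHQ or SIGAR; the sketch also assumes that "match" tests the whole basename against the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; this is not the RHQ PIQL implementation.
final class PiqlBasenameSketch {
    private PiqlBasenameSketch() {}

    // Evaluates a query of the form process|basename|match=<regex> against one basename.
    static boolean matches(String piql, String basename) {
        String[] parts = piql.split("\\|", 3);
        if (parts.length != 3 || !"process".equals(parts[0]) || !"basename".equals(parts[1])) {
            throw new IllegalArgumentException("Unsupported query: " + piql);
        }
        int eq = parts[2].indexOf('=');
        if (eq < 0 || !"match".equals(parts[2].substring(0, eq))) {
            throw new IllegalArgumentException("Unsupported operator in: " + piql);
        }
        String regex = parts[2].substring(eq + 1);
        // Assumption: the pattern must match the entire basename.
        return Pattern.compile(regex).matcher(basename).matches();
    }
}
```

With the query from step 2, a basename such as soffice.bin is selected, while java is not.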

  
Actual results:
The availability status is still "UP" for a time longer than the availability check interval.

Expected results:
The availability status should be "DOWN" as soon as the availability check interval has elapsed.


Additional info:
In the ProcessInfo class, the isRunning method uses the SIGAR class ProcState. If the process has been killed or shut down, the ProcState instance contains stale data.
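The stale-data effect can be reproduced without SIGAR. The following is a minimal sketch, assuming a snapshot object that caches the process state at refresh time; CachedProcessSnapshot and its methods are hypothetical names and do not reflect the actual ProcessInfo code or the applied fix.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a cached process snapshot; not the RHQ ProcessInfo class.
final class CachedProcessSnapshot {
    private final BooleanSupplier liveProbe; // asks the (simulated) OS whether the process exists
    private boolean running;                 // value captured at the last refresh

    CachedProcessSnapshot(BooleanSupplier liveProbe) {
        this.liveProbe = liveProbe;
        refresh();
    }

    // Re-reads the process state from the live probe.
    void refresh() {
        running = liveProbe.getAsBoolean();
    }

    // Answers from the cached value: stays true after the process dies until refresh() runs.
    boolean isRunningStale() {
        return running;
    }

    // Refreshes first, so the answer reflects the current process state.
    boolean isRunningFresh() {
        refresh();
        return running;
    }
}
```

If the simulated process stops after the snapshot is created, isRunningStale() keeps returning true, while isRunningFresh() returns false on the next call.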

--- Additional comment from Thomas SEGISMONT on 2012-11-23 14:30:39 CET ---

In the ProcessComponent class, the ProcessInfo instance is refreshed each time a metric collection runs.

So after a metric collection, the next availability check has fresh data to process.

This could explain why, after some time, the closed/killed process is eventually reported "DOWN".
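The delay described above can be estimated with a small timing model. This is a sketch under stated assumptions: availability checks and metric collections run at fixed multiples of their intervals, and a refresh scheduled at the same instant as a check runs first. The class AvailabilityDelayModel and the interval values in the usage note are hypothetical and are not RHQ defaults.

```java
// Hypothetical timing model for illustration; not taken from the RHQ code base.
final class AvailabilityDelayModel {
    private AvailabilityDelayModel() {}

    // Smallest multiple of step that is >= t (seconds; step > 0, t >= 0).
    static long nextTick(long t, long step) {
        return ((t + step - 1) / step) * step;
    }

    // Cached snapshot: DOWN is first reported by the availability check that
    // follows the first metric collection (refresh) after the kill.
    static long firstDownWithStaleSnapshot(long killTime, long availInterval, long metricInterval) {
        long firstRefresh = nextTick(killTime, metricInterval);
        return nextTick(firstRefresh, availInterval);
    }

    // Snapshot refreshed on every check: DOWN is reported at the next availability check.
    static long firstDownWithFreshCheck(long killTime, long availInterval) {
        return nextTick(killTime, availInterval);
    }
}
```

For example, with a kill at 45 s, a 30 s availability interval and a 120 s metric interval, the stale model reports DOWN at 120 s, whereas a check that refreshes its data reports DOWN at 60 s.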

Comment 1 Thomas Segismont 2012-12-10 08:58:25 UTC
Fix applied on release/jon3.1.x branch:
34d2887 (cherry-picked from 5c4217e)
ea84700 (cherry-picked from 2ec8d54)

Comment 2 Larry O'Leary 2013-09-06 14:30:42 UTC
As this is MODIFIED or ON_QA, setting milestone to ER1.

Comment 3 Armine Hovsepyan 2013-09-30 11:19:27 UTC
Verified.
Steps:
1. Started a LibreOffice process.
2. Set the availability collection interval to 33 seconds in the schedules.
3. Stopped the LibreOffice process.
4. Within 33 seconds, the process moved to the not-available state.
Screenshots attached.

Comment 4 Armine Hovsepyan 2013-09-30 11:20:15 UTC
Created attachment 805134: active.png

Comment 5 Armine Hovsepyan 2013-09-30 11:20:58 UTC
Created attachment 805135: inactive.png