Bug 1091134 - Platform process service CPU Percentage metric returns values that are inconsistent with actual CPU load measurements
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Plugin -- Other
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER01
Target Release: JON 3.2.3
Assignee: Thomas Segismont
QA Contact: Jeeva Kandasamy
URL:
Whiteboard:
Depends On: 1109439 1127875
Blocks: 1127876
Reported: 2014-04-25 01:20 UTC by Larry O'Leary
Modified: 2018-12-05 18:19 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1127876 (view as bug list)
Environment:
Last Closed: 2014-09-05 15:40:38 UTC
Type: Bug


Attachments
Screenshot showing 6699.8% CPU usage (179.75 KB, image/png)
2014-04-25 01:20 UTC, Larry O'Leary
Screenshot showing 17051.8% CPU usage less than 2 hours later (104.65 KB, image/png)
2014-04-25 01:22 UTC, Larry O'Leary
Byteman rule and helper jar (3.49 KB, application/zip)
2014-05-21 13:19 UTC, Thomas Segismont
Byteman trace output showing single resource over 1 day (47.73 KB, application/x-gzip)
2014-05-29 19:43 UTC, Larry O'Leary
CPU-percentage greater than 200% with two cpu (167.20 KB, image/jpeg)
2014-08-22 11:57 UTC, Jeeva Kandasamy


Links
System ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 795963 None None None Never
Red Hat Bugzilla 1100609 None None None Never

Internal Links: 1100609

Description Larry O'Leary 2014-04-25 01:20:35 UTC
Created attachment 889497 [details]
Screenshot showing 6699.8% CPU usage

Description of problem:
After importing a platform process service resource into inventory, metric values reported for its *CPU Percentage* measurement are invalid.

In the reported case, the maximum CPU Percentage value was 6699.8% (as seen in the metric table on the monitoring page), while the average was reported as 352.5%. These values cover an 8-hour period, which also included a minimum of 0.4% and a live value of 1.3%.

Not even two hours later, the same metric for the same resource reported a value of 17051.8%. This is for a Java process and the process service's plug-in configuration property *Full Process Tree* is set to *Yes*.

The target machine only has 8 CPU cores.
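
For scale, the reported values can be compared against the largest value the machine could produce. The sketch below is illustrative only and is not part of the plug-in: it assumes the metric counts 100% per fully used core, so an 8-core machine tops out at 800%. The function names are hypothetical.

```python
def max_plausible_cpu_percent(cores: int) -> float:
    """Upper bound for a per-core CPU percentage (assumed: 100% per core)."""
    return 100.0 * cores


def is_plausible(cpu_percent: float, cores: int) -> bool:
    """True if the value fits between 0% and the per-core upper bound."""
    return 0.0 <= cpu_percent <= max_plausible_cpu_percent(cores)


# Values taken from this report; the machine has 8 cores.
for value in (6699.8, 17051.8, 352.5, 0.4, 1.3):
    verdict = "plausible" if is_plausible(value, 8) else "impossible"
    print(f"{value}% on 8 cores: {verdict}")
```

Under that assumption, the average of 352.5% is within range, but the two maximums are not.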



Version-Release number of selected component (if applicable):
3.2.0.GA

How reproducible:
Every few minutes

Additional info:
It is not clear at this time how this issue occurs. It is only clear that the reported values are impossible given the number of CPU cores and the process's actual load. CPU Percentage appears to be a metric returned by the native libraries and seems to relate to the actual CPU the process is running on. However, even that is not clear.

Comment 1 Larry O'Leary 2014-04-25 01:22:23 UTC
Created attachment 889498 [details]
Screenshot showing 17051.8% CPU usage less than 2 hours later

Comment 2 Thomas Segismont 2014-05-21 13:19:21 UTC
Created attachment 897969 [details]
Byteman rule and helper jar

Comment 5 Larry O'Leary 2014-05-29 19:43:57 UTC
Created attachment 900502 [details]
Byteman trace output showing single resource over 1 day

Byteman trace log excerpt showing a process identified as domain5server1 over a 24 hour period. During this period the invalid and high CPU % was reported:

 - At 2014-05-23 22:57:01.926-0500 the CPU percent was returned at 7730.6 % -- cpuPercent=77.30594216184681;
   + This continued every minute until 2014-05-23 23:08:03.007-0500 at which time its value was cpuPercent=0.010536456138238304;
 - At 2014-05-23 23:08:03.005-0500 the process ID changed from pid=[23236] to pid=[32765];
 
The important takeaway here is that the invalid values appear to coincide with a restart of the process. Perhaps the process with ID 23236 was already gone during the period in which the invalid values were returned?
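
To illustrate why a PID change could matter, here is a minimal sketch of a delta-based CPU calculation. It is not the plug-in's or the native library's code, and it is not a confirmed root cause; it only assumes that usage is derived from the difference between two cumulative CPU-time samples. The `Sample` type, the function names, and the CPU-time numbers are all made up for illustration. Only the PIDs and the `cpuPercent` value come from the trace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    pid: int
    cpu_time_s: float  # cumulative CPU seconds consumed by this pid (hypothetical)
    wall_s: float      # wall-clock time of the sample, in seconds (hypothetical)


def naive_cpu_fraction(prev: Sample, cur: Sample) -> float:
    """Delta of CPU time over delta of wall time, ignoring which pid was sampled."""
    return (cur.cpu_time_s - prev.cpu_time_s) / (cur.wall_s - prev.wall_s)


def guarded_cpu_fraction(prev: Optional[Sample], cur: Sample) -> Optional[float]:
    """Same ratio, but None when the baseline belongs to another pid or is unusable."""
    if prev is None or prev.pid != cur.pid:
        return None
    elapsed = cur.wall_s - prev.wall_s
    used = cur.cpu_time_s - prev.cpu_time_s
    if elapsed <= 0 or used < 0:
        return None
    return used / elapsed


def to_percent(fraction: float) -> float:
    """Convert a fraction such as cpuPercent=77.3059... to a displayed percentage."""
    return round(fraction * 100, 1)


# Baseline from the old pid, next sample from the new pid (CPU times are invented).
old = Sample(pid=23236, cpu_time_s=5.0, wall_s=0.0)
new = Sample(pid=32765, cpu_time_s=125.0, wall_s=60.0)

print(to_percent(naive_cpu_fraction(old, new)))  # mixes counters of two processes
print(guarded_cpu_fraction(old, new))            # no valid baseline yet
print(to_percent(77.30594216184681))             # value from the trace, as displayed
```

In this sketch the naive version reports 200% because it subtracts the counter of one process from that of another, while the guarded version declines to report a value until it has two samples from the same PID.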

Comment 13 Larry O'Leary 2014-06-13 23:21:22 UTC
The fix for bug 1109439 also resolves this bug.

Comment 14 Libor Zoubek 2014-08-08 11:00:45 UTC
Setting to MODIFIED, as the fix for this BZ was implemented via Bug 1100609, which is already in the 3.3 branch.

Comment 15 Simeon Pinder 2014-08-15 03:19:10 UTC
Moving to ON_QA as this is available for test in JON 3.2.3 ER01 build:

http://jon01.mw.lab.eng.bos.redhat.com:8042/dist/release/jon/3.2.3.GA/8-14-14/

Comment 16 Thomas Segismont 2014-08-19 10:19:25 UTC
(In reply to Libor Zoubek from comment #14)
> Setting to MODIFIED, as fix of this BZ was implemented via  Bug 1100609 -
> which is already in 3.3 branch

Wrong reference:

The fix has been applied via Bug 1109439 in 3.2.x branch.

See https://bugzilla.redhat.com/show_bug.cgi?id=1109439#c2

Comment 17 Jeeva Kandasamy 2014-08-22 11:57:57 UTC
Created attachment 929562 [details]
CPU-percentage greater than 200% with two cpu

I imported a Java process. On the first check it reported a usage of 206.5%, although I have only two CPUs. Hence I'm reopening this issue.

Steps I followed:
1. Imported a process resource that was down.
2. After the import succeeded, started the Java process so that it was up and running.
3. The very first CPU usage value was 206.5%, although I have only 2 CPUs.
4. I kept it running for an hour and saw the problem only with the very first value.
5. I restarted the Java process service. Again, the very first CPU usage value after the restart was wrong (570.2%).

Version: 
JBoss Operations Network
Version : 3.2.0.GA Update 03
Build Number : bca1bc8:e19c43d
GWT Version : 2.5.0
SmartGWT Version : 3.0p


Screen shot is attached.

Comment 22 Jeeva Kandasamy 2014-08-27 13:54:43 UTC
To track this edge case I opened another BZ https://bugzilla.redhat.com/show_bug.cgi?id=1134437

