Created attachment 889497 [details]
Screenshot showing 6699.8% CPU usage
Description of problem:
After importing a platform process service resource into inventory, metric values reported for its *CPU Percentage* measurement are invalid.
In the reported case, the maximum CPU Percentage value was 6699.8% -- as seen in the metric table on the monitoring page -- while the average was 352.5%. This was observed over an 8-hour period that also included a minimum of 0.4% and a live value of 1.3%.
Not even two hours later, the same metric for the same resource reported a value of 17051.8%. This is for a Java process and the process service's plug-in configuration property *Full Process Tree* is set to *Yes*.
The target machine only has 8 CPU cores.
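As a rough plausibility check, and assuming the metric follows the common convention where 100% means one fully busy core (an assumption; this report does not confirm how the agent normalizes the value), an 8-core machine could show at most 800%. The helper names below are hypothetical and only illustrate the arithmetic:

```python
def max_plausible_cpu_percent(cores: int) -> float:
    """Upper bound for a process's CPU percentage, ASSUMING 100% == one busy core."""
    return 100.0 * cores


def is_plausible(cpu_percent: float, cores: int) -> bool:
    """True if the value lies between 0 and the per-core upper bound."""
    return 0.0 <= cpu_percent <= max_plausible_cpu_percent(cores)


# Values quoted in this report; the target machine has 8 cores.
reported = {
    "maximum": 6699.8,
    "average": 352.5,
    "minimum": 0.4,
    "live": 1.3,
    "maximum two hours later": 17051.8,
}

for label, value in reported.items():
    print(f"{label}: {value}% plausible={is_plausible(value, 8)}")
```

Under the other convention, where 100% means the whole machine, the bound would be 100% and the reported maxima would be even further out of range.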
Version-Release number of selected component (if applicable):
How reproducible:
Every few minutes
It is not clear at this time how this issue occurs. It is only clear that the reported values are impossible given the CPU and the actual process' load. It appears that CPU Percentage is a metric returned by the native libraries and seems to relate to the actual CPU the process is running on. However, even that is not clear.
Created attachment 889498 [details]
Screenshot showing 17051.8% CPU usage less than 2 hours later
Created attachment 897969 [details]
Byteman rule and helper jar
Created attachment 900502 [details]
Byteman trace output showing single resource over 1 day
Byteman trace log excerpt showing a process identified as domain5server1 over a 24 hour period. During this period the invalid and high CPU % was reported:
- At 2014-05-23 22:57:01.926-0500 the CPU percent was returned at 7730.6 % -- cpuPercent=77.30594216184681;
+ This continued every minute until 2014-05-23 23:08:03.007-0500 at which time its value was cpuPercent=0.010536456138238304;
- At 2014-05-23 23:08:03.005-0500 the process ID changed from pid= to pid=;
The important takeaway here is that the invalid value appears to coincide with what looks like a restart of the process. Perhaps the process with ID 23236 was already gone during the period in which the invalid values were returned?
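The relationship between the raw cpuPercent values in the trace and the percentages shown in the UI looks like a simple multiplication by 100. A minimal sketch of that conversion, assuming one-decimal rounding for display (the function name is hypothetical and is not part of the agent or the native library):

```python
def to_display_percent(cpu_fraction: float) -> str:
    """Format a raw cpuPercent fraction as a percentage with one decimal place."""
    return f"{cpu_fraction * 100:.1f}%"


# Raw values copied from the Byteman trace excerpt above.
print(to_display_percent(77.30594216184681))     # 7730.6%
print(to_display_percent(0.010536456138238304))  # 1.1%
```

Read this way, a raw value above 1.0 already means more than one full core, so 77.3 would correspond to roughly 77 busy cores.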
The fix for bug 1109439 also resolves this bug.
Setting to MODIFIED, as fix of this BZ was implemented via Bug 1100609 - which is already in 3.3 branch
Moving to ON_QA as this is available for test in JON 3.2.3 ER01 build:
(In reply to Libor Zoubek from comment #14)
> Setting to MODIFIED, as fix of this BZ was implemented via Bug 1100609 -
> which is already in 3.3 branch
The fix has been applied via Bug 1109439 in 3.2.x branch.
Created attachment 929562 [details]
CPU percentage greater than 200% with two CPUs
I imported a Java process; on the first check it reported usage of 206.5%, although I have only two CPUs. Hence I'm reopening this issue.
Steps I followed:
1. Imported a process resource that was down.
2. After a successful import, started the (Java) process so that it was up and running.
3. The very first CPU usage value was 206.5%, although I have only 2 CPUs.
4. I kept it running for an hour; I see the problem only with the very first value.
5. I restarted the Java process service; again the very first CPU usage value after the restart was wrong (570.2%).
JBoss Operations Network
Version : 3.2.0.GA Update 03
Build Number : bca1bc8:e19c43d
GWT Version : 2.5.0
SmartGWT Version : 3.0p
A screenshot is attached.
To track this edge case I opened another BZ https://bugzilla.redhat.com/show_bug.cgi?id=1134437