Bug 591519

Summary: platform plugin: ArrayIndexOutOfBoundsException in Sigar.getCpuPercList() when number of CPUs has decreased since last time getCpuPercList() was called
Product: [Other] RHQ Project
Reporter: Ian Springer <ian.springer>
Component: Plugins
Assignee: Ian Springer <ian.springer>
Status: CLOSED CURRENTRELEASE
QA Contact: Corey Welton <cwelton>
Severity: medium
Docs Contact:
Priority: urgent
Version: 1.3.1
CC: ccrouch, loleary
Target Milestone: ---
Target Release: ---
Hardware: All
OS: All
Whiteboard:
Fixed In Version: 2.4
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-08-12 16:49:39 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 591531, 595917

Description Ian Springer 2010-05-12 13:44:18 UTC
I'm fairly sure this is due to the following SIGAR bug:

http://jira.hyperic.com/browse/SIGAR-216

Comment 1 Larry O'Leary 2010-05-12 16:01:18 UTC
This issue can be reproduced with the Sigar test shell:

 - Using a Linux machine with multiple CPUs
 - Execute: java -jar lib/sigar-1.6.3.82.jar 
 - sigar> cpuinfo
   ... output shows 2 CPUs
 - From another terminal: echo 0 > /sys/devices/system/cpu/cpu1/online
 - sigar> cpuinfo
   ... output shows 1 CPU
 - From another terminal: echo 1 > /sys/devices/system/cpu/cpu1/online
 - sigar> cpuinfo
   Unexpected exception processing command 'cpuinfo': java.lang.ArrayIndexOutOfBoundsException: 1
   java.lang.ArrayIndexOutOfBoundsException: 1
	   at org.hyperic.sigar.Sigar.getCpuPercList(Sigar.java:379)
	   at org.hyperic.sigar.cmd.CpuInfo.output(CpuInfo.java:64)
	   at org.hyperic.sigar.cmd.SigarCommandBase.processCommand(SigarCommandBase.java:188)
	   at org.hyperic.sigar.shell.ShellBase.processCommand(ShellBase.java:397)
	   at org.hyperic.sigar.cmd.Shell.processCommand(Shell.java:122)
	   at org.hyperic.sigar.shell.ShellBase.handleCommand(ShellBase.java:364)
	   at org.hyperic.sigar.shell.ShellBase.handleCommand(ShellBase.java:310)
	   at org.hyperic.sigar.shell.ShellBase.run(ShellBase.java:289)
	   at org.hyperic.sigar.cmd.Shell.main(Shell.java:222)
	   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	   at java.lang.reflect.Method.invoke(Method.java:616)
	   at org.hyperic.sigar.cmd.Runner.main(Runner.java:214)
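
The trace is consistent with code that compares a cached per-CPU snapshot against a fresh one and indexes both with the same counter. The following is only a sketch of that failure mode and of one possible guard. It does not reproduce SIGAR's actual implementation, and the class name, method names, and tick values are hypothetical.

```java
import java.util.Arrays;

// Hypothetical stand-in for per-CPU tick counters; this is NOT SIGAR's code.
final class CpuDeltaSketch {

    // Naive version: sizes the result by the current snapshot and indexes the
    // previous snapshot with the same index, so it throws as soon as the
    // previous snapshot holds fewer CPUs than the current one.
    static long[] naiveDeltas(long[] previous, long[] current) {
        long[] deltas = new long[current.length];
        for (int i = 0; i < current.length; i++) {
            deltas[i] = current[i] - previous[i];
        }
        return deltas;
    }

    // Guarded version: only compares CPUs that are present in both snapshots.
    static long[] safeDeltas(long[] previous, long[] current) {
        int n = Math.min(previous.length, current.length);
        long[] deltas = new long[n];
        for (int i = 0; i < n; i++) {
            deltas[i] = current[i] - previous[i];
        }
        return deltas;
    }

    public static void main(String[] args) {
        long[] oneCpu = {100};      // snapshot taken while cpu1 was offline
        long[] twoCpus = {150, 40}; // snapshot taken after cpu1 came back online

        System.out.println(Arrays.toString(safeDeltas(oneCpu, twoCpus))); // prints [50]

        try {
            naiveDeltas(oneCpu, twoCpus);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("naive version failed: " + e.getMessage());
        }
    }
}
```

In this sketch the guard simply drops CPUs that are missing from either snapshot. Whether the upstream fix takes the same approach is not stated in this report.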

Comment 3 Larry O'Leary 2010-05-19 17:02:48 UTC
Any word on getting a new 1.6 version of Sigar with the fix indicated in the upstream JIRA?

Comment 4 Ian Springer 2010-05-19 19:56:01 UTC
If you are comfortable that the fix works, I will go ahead and ask Doug from Hyperic to cut a 1.6.5 release for us. Let me know.

Comment 5 Larry O'Leary 2010-05-19 20:15:44 UTC
(In reply to comment #4)
Yes. The fix works and looks correct, so I think it is safe to proceed.

I have not looked at the commit history since 1.6.3, so I am not sure what else may be affected. You may want to confirm before we look into that.

Comment 6 Ian Springer 2010-05-19 20:27:34 UTC
We actually already upgraded RHQ to SIGAR 1.6.4 last week in order to obtain several other bug fixes. The regression test suite passed after the upgrade, so it's looking promising. I don't think 1.6.5 is going to add much besides this ArrayIndexOutOfBoundsException fix, since 1.6.4 was just released a couple of weeks ago and most new development is happening on 1.7.0 now.

I'll go ahead and ask Doug for a 1.6.5.

Comment 7 Ian Springer 2010-05-24 22:25:25 UTC
Upgraded to SIGAR 1.6.5 (git commit d6abc10) - pushed to master branch. 

TEST STEPS
==========
1) Install RHQ Agent on a Linux machine with multiple CPUs.
2) Verify that CPU Resources (one for each of the CPUs) are in RHQ inventory and are green.
3) On the CPU Resources, enable all metrics and change the collection interval to 1 minute. Wait a couple minutes and verify that all the metrics are being collected successfully.
4) From a terminal on the Agent box, disable CPU 1:
   echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online
5) Wait a couple minutes and verify that all the metrics are still being collected successfully. Grep the Agent log for ArrayIndexOutOfBoundsException (the stack trace in Comment 1 above) and verify you don't see any occurrences.
6) From a terminal on the Agent box, re-enable CPU 1:
   echo 1 | sudo tee /sys/devices/system/cpu/cpu1/online
7) Wait a couple minutes and verify that all the metrics are still being collected successfully. Grep the Agent log for ArrayIndexOutOfBoundsException (the stack trace in Comment 1 above) and verify you don't see any occurrences.
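
The log check in steps 5 and 7 could also be scripted. The following is a minimal sketch, assuming the Agent log path is passed as the first argument; the class and method names are hypothetical and not part of RHQ.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper for steps 5 and 7; not part of RHQ.
final class AgentLogCheck {

    // Counts the lines that contain the given text.
    static long countMatches(List<String> lines, String needle) {
        return lines.stream().filter(line -> line.contains(needle)).count();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AgentLogCheck <path-to-agent-log>");
            System.exit(2);
        }
        // ISO-8859-1 decodes any byte sequence, so stray bytes in the log
        // do not abort the scan.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        long hits = countMatches(lines, "ArrayIndexOutOfBoundsException");
        System.out.println("occurrences: " + hits);
        // A non-zero exit status signals that the exception is still logged.
        System.exit(hits == 0 ? 0 : 1);
    }
}
```

Running it once after step 5 and again after step 7 should report zero occurrences if the fix is effective.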

Comment 8 Corey Welton 2010-05-27 16:22:51 UTC
QA Verified.

Comment 9 Corey Welton 2010-08-12 16:49:39 UTC
Mass-closure of verified bugs against JON.