Bug 837510

Summary: NPE in AS5 plugin
Product: [Other] RHQ Project
Reporter: Heiko W. Rupp <hrupp>
Component: Plugins
Assignee: Stefan Negrea <snegrea>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 4.4
CC: ahovsepy, hrupp, jsanda
Target Milestone: ---   
Target Release: JON 3.1.1   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=893217
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 846846
Environment:
Last Closed: 2013-09-03 11:02:17 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 846846    

Description Heiko W. Rupp 2012-07-04 03:15:55 EDT
From agent.log

2012-07-04 09:14:09,225 WARN  [ResourceContainer.invoker.daemon-2] 
(org.rhq.plugins.jbossas5.WebApplicationContextComponent)- Failed to determine 
whether the web app context localhost is clustered or not.
java.lang.RuntimeException: Failed to load [ComponentType{type=MBean, 
subtype=WebApplicationManager}] ManagedComponent 
	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.NullPointerException
	... 10 more
Comment 1 Heiko W. Rupp 2012-07-04 03:24:26 EDT
To add: AS5 was down at this time, so it is a bit of a corner case, but the plugin should nevertheless not throw an NPE.
Comment 2 Stefan Negrea 2012-08-08 16:57:39 EDT
This is clearly an edge case because it happens only if the AS5 server is inventoried but the managing component is not started yet. A possible way to get into this state is to inventory the AS5 server, stop the AS5 server, and then restart the RHQ server. 

I inspected the code very carefully. The errors are only written to the log to show that something is not right; they do not propagate any further. From that perspective the code is fine, because it just logs the unusual circumstance and does not prevent the component from starting.

However, there is one scenario that the existing code does not handle correctly. If an application is clustered and the scenario above occurs, then when the AS5 server comes back online the plugin code does not recheck the clustered property, leaving it at its default value (not clustered). The clustered flag is used only for a trait and not by any other external code. Also, the value is never refreshed. The operation to check for the clustered flag is very expensive, as it involves calls to the profile service.

To fix this, the code that checks whether an application is clustered or not will be relocated outside of the start method. It will run on the first metrics collection (done only on started components that are available). The getter method that returns the cluster setting will return false until such time. The getter method will be marked as deprecated and left in the code to avoid potential problems with external plugins that rely on the content system. Also, the trait will not be reported by the agent until the property is retrieved successfully from the AS5 server.
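
For illustration only, the approach described above could be sketched roughly as follows. This is not the actual plugin code: the class name LazyClusteredFlag, the method collectClusteredTrait, and the Supplier standing in for the profile service call are all hypothetical, and the real component uses the RHQ plugin API rather than plain Java types.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical stand-in for the web app context component; not the real RHQ class. */
class LazyClusteredFlag {
    // null means "not yet determined"
    private Boolean clustered;

    // Assumption: wraps the expensive profile service lookup, which may
    // throw while the AS5 server is down.
    private final Supplier<Boolean> profileServiceLookup;

    LazyClusteredFlag(Supplier<Boolean> profileServiceLookup) {
        this.profileServiceLookup = profileServiceLookup;
    }

    /** start() no longer contacts the profile service at all. */
    void start() {
        // intentionally empty in this sketch
    }

    /**
     * Called during metric collection. Returns an empty Optional until the
     * lookup has succeeded once, so no trait value is reported before then.
     */
    Optional<Boolean> collectClusteredTrait() {
        if (clustered == null) {
            try {
                clustered = profileServiceLookup.get();
            } catch (RuntimeException e) {
                // Server not reachable yet: report nothing and retry on the
                // next collection instead of caching a wrong default.
                return Optional.empty();
            }
        }
        return Optional.ofNullable(clustered);
    }

    /** Kept for external callers; returns false until the flag is known. */
    @Deprecated
    boolean isClustered() {
        return clustered != null && clustered;
    }
}
```

Once the lookup succeeds, the value is cached, so the expensive call is made at most once per successful determination rather than on every collection.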
Comment 4 John Sanda 2012-08-13 22:16:31 EDT
Moving to ON_QA since JON 3.1.1 ER2 build is available - https://brewweb.devel.redhat.com/buildinfo?buildID=228250
Comment 5 Armine Hovsepyan 2012-09-06 12:03:32 EDT
Verified with the scenario Stefan mentioned.
Comment 6 Heiko W. Rupp 2013-09-03 11:02:17 EDT
Bulk closing of old issues in VERIFIED state.