Description of problem:
I've been developing plugins and noticed the following in the logs:

2012-02-05 20:47:36,072 INFO [org.rhq.enterprise.server.discovery.DiscoveryServerServiceImpl] Error processing availability report from [vg61l01ad-hadoop002]: javax.ejb.EJBException:java.lang.NullPointerException -> java.lang.NullPointerException:null

I traced the problem down to something being null in AvailabilityManagerBean.mergeAvailabilityReport(). It's not clear whether the issue is caused by a bad plugin. My plugin would fail to return from its start() method, and I suppose this might have prevented some database process from occurring. The resources do appear in the tree; however, they are all unavailable.

Version-Release number of selected component (if applicable):
RHQ 4.1, but the code does not seem to have changed recently.

How reproducible:
Removing the resource (server) and discovering it again doesn't seem to clear the error.

Steps to Reproduce:
1. Create a bad(?) plugin and deploy it.
2. Discovery fails (or never completes) because the start() method fails.
3. Notice the logs.

Expected results:
The server should guard against bad data and provide some indication of what was bad about the data.

Additional info:
I will provide more information as I find it.
I have seen something similar with the very first availability report of a new resource. The problem is in org.rhq.enterprise.server.measurement.AvailabilityManagerBean#updateResourceAvailability, where the variable 'currentAvailability' is null when no ResourceAvailability has been set yet, as this was the first report to come in.
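To illustrate the kind of guard being discussed, here is a minimal sketch of handling a missing "current" value on the first report. It is not the RHQ code and not the actual fix: AvailabilityState, AvailabilityStore, and merge() are hypothetical stand-ins, and the in-memory map only plays the role of whatever the server persists.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an availability value; NOT an RHQ class.
enum AvailabilityState { UP, DOWN }

// Hypothetical stand-in for the server-side store; NOT an RHQ class.
final class AvailabilityStore {
    // Resource id -> last known state. No entry means no report has been merged yet.
    private final Map<Integer, AvailabilityState> current = new HashMap<>();

    /**
     * Merges one reported state for a resource.
     * Returns true if the stored state changed, false if it stayed the same.
     */
    boolean merge(int resourceId, AvailabilityState reported) {
        if (reported == null) {
            // Reject bad data with a message that says what was wrong,
            // instead of failing later with a bare NullPointerException.
            throw new IllegalArgumentException(
                "Availability report for resource " + resourceId + " has no state");
        }
        AvailabilityState previous = current.get(resourceId); // null on the first report
        if (previous == null) {
            // First report for this resource: nothing to compare against, just record it.
            current.put(resourceId, reported);
            return true;
        }
        if (previous == reported) {
            return false; // no change
        }
        current.put(resourceId, reported);
        return true;
    }

    // Returns the last recorded state, or null if none has been recorded.
    AvailabilityState get(int resourceId) {
        return current.get(resourceId);
    }
}
```

The point of the sketch is only that the "no previous value" case is treated as a normal first report, and that a malformed report produces an error naming the resource.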
Fixed in master: 34ed3853dda
I'll test and verify. However, wouldn't this case happen very frequently? Looking at the commit, I'd also feel better about this if a unit test case were added as well. Is there such a test for your EJB components?
Bulk closing of items that are on_qa and in old RHQ releases, which are out for a long time and where the issue has not been re-opened since.