Bug 726714

Summary: Metric template creation generates NPE (null pointer exception)
Product: [Other] RHQ Project
Reporter: Jay Shaughnessy <jshaughn>
Component: Core Server
Assignee: Jay Shaughnessy <jshaughn>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact:
Priority: medium
Version: 3.0.1
CC: hrupp, skondkar
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: 4.1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 728641 (view as bug list) Environment:
Last Closed: 2012-02-07 19:20:08 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 728641    
Attachments:
Description Flags
the full stack none

Description Jay Shaughnessy 2011-07-29 15:16:40 UTC
Created attachment 515898
the full stack

Making a metric template change that is also applied to the existing inventory can cause Server NPEs and a failed update.

To recreate:
1) Import RHQ Agent resource
2) Uninventory RHQ Agent resource
3) Re-Import RHQ Agent resource
4) Try to make a metric template change

For example, try setting the Agent-Server Clock Difference schedule to 29 minutes.

This will generate:

10:45:01,362 ERROR [MeasurementScheduleManagerBean] Error updating measurement definitions:
java.lang.NullPointerException
        at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.sendUpdatedSchedulesToAgent(MeasurementScheduleManagerBean.java:830)
        at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.modifyDefaultCollectionIntervalForMeasurementDefinitions(MeasurementScheduleManagerBean.java:513)
        at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.modifyDefaultCollectionIntervalForMeasurementDefinitions(MeasurementScheduleManagerBean.java:374)
        at org.rhq.enterprise.server.measurement.MeasurementScheduleManagerBean.updateDefaultCollectionIntervalForMeasurementDefinitions(MeasurementScheduleManagerBean.java:341)

(full stack attached)

Comment 1 Jay Shaughnessy 2011-07-29 16:26:54 UTC
This happens when a resource still exists behind the scenes in either an uninventoried or deleted state.

For uninventoried resources the problem should go away when the 
out-of-band uninventorying is actually performed. 

*** !!! Test Note : Recreation Step update !!! ***

With the recreation steps, the problem can only be recreated during a
potentially brief period between the uninventory request from the
GUI and the moment the actual server-side uninventory is performed.

Step 3 is actually unnecessary. And you must execute step 4 quickly.



There may be other scenarios but I haven't been able to identify them.
I thought a deleted (delete, not uninventory) resource might cause the
same problem in a more persistent way, but that did not pan out. It
might also happen if agents are down. Regardless, the fix should take
care of any situation like this.
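
For illustration only, here is a minimal sketch of the kind of failure described above and one defensive way to avoid it. It is not the code from the fix commit. It assumes that a resource awaiting server-side uninventory no longer has an agent attached, and that the schedule update groups resources by agent before sending. The class, field, and method names (ScheduleUpdateSketch, Agent, ResourceRef, groupResourceIdsByAgent) are hypothetical and do not come from the RHQ source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ScheduleUpdateSketch {

    // Hypothetical stand-in for an agent record.
    static final class Agent {
        final String name;
        Agent(String name) { this.name = name; }
    }

    // Hypothetical stand-in for a resource; agent is assumed to be null
    // while the resource is waiting for the out-of-band uninventory.
    static final class ResourceRef {
        final int id;
        final Agent agent;
        ResourceRef(int id, Agent agent) { this.id = id; this.agent = agent; }
    }

    // Groups resource ids by agent name so each agent gets one update.
    // Resources without an agent are skipped; dereferencing r.agent.name
    // without this check would throw a NullPointerException.
    static Map<String, List<Integer>> groupResourceIdsByAgent(List<ResourceRef> resources) {
        Map<String, List<Integer>> byAgent = new HashMap<>();
        for (ResourceRef r : resources) {
            if (r.agent == null) {
                continue; // nothing to notify for an agentless resource
            }
            byAgent.computeIfAbsent(r.agent.name, k -> new ArrayList<>()).add(r.id);
        }
        return byAgent;
    }

    public static void main(String[] args) {
        Agent agent = new Agent("agent-1");
        List<ResourceRef> resources = Arrays.asList(
            new ResourceRef(10001, agent),
            new ResourceRef(10002, null), // uninventory requested, not yet purged
            new ResourceRef(10003, agent));
        System.out.println(groupResourceIdsByAgent(resources));
    }
}
```

Skipping agentless resources keeps the template update from failing as a whole; whether the real fix skips them, filters them in the query, or handles them another way is not stated in this report.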

Comment 2 Jay Shaughnessy 2011-07-29 17:42:34 UTC
master commit e4e340d418889e7a8972c421fe002a84ff2aa187

Comment 3 Sunil Kondkar 2011-08-01 10:02:22 UTC
Verified on build#231 (Version: 4.1.0-SNAPSHOT Build Number: c4a82bc)

Uninventoried the RHQ Agent resource and set the Agent-Server Clock Difference schedule to 29 minutes. No exception was observed.
Also verified this when the RHQ Agent resource is down.

Marking as verified.

Comment 4 Mike Foley 2012-02-07 19:20:08 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE