Bug 535173 (RHQ-1897)

Summary: NPE when agent is connecting types
Product: [Other] RHQ Project
Reporter: John Mazzitelli <mazz>
Component: Agent
Assignee: RHQ Project Maintainer <rhq-maint>
Status: CLOSED WORKSFORME
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: unspecified
CC: cwelton, sdharane
Target Milestone: ---
Keywords: SubBug
Target Release: ---
Hardware: All
OS: All
URL: http://jira.rhq-project.org/browse/RHQ-1897
Whiteboard:
Fixed In Version: 1.4
Doc Type: Bug Fix
Last Closed: 2010-10-19 06:38:36 UTC
Type: ---
Bug Blocks: 565628    

Description John Mazzitelli 2009-04-01 02:38:00 UTC
I have no idea what this means, whether it is bad, or whether it's recoverable. I saw it on an agentspawn deployment; many agents spit this out:

22:32:34,003 ERROR [Starting Agent 33007] (rhq.core.pc.inventory.InventoryManager)- Could not load inventory data from disk
org.rhq.core.clientapi.agent.PluginContainerException: Cannot load inventory file: /root/baker/perf/perftest/spawn/data/33007/inventory.dat
        at org.rhq.core.pc.inventory.InventoryFile.loadInventory(InventoryFile.java:107)
        at org.rhq.core.pc.inventory.InventoryManager.loadFromDisk(InventoryManager.java:1191)
        at org.rhq.core.pc.inventory.InventoryManager.initialize(InventoryManager.java:184)
        at org.rhq.core.pc.PluginContainer.startContainerService(PluginContainer.java:329)
        at org.rhq.core.pc.PluginContainer.initialize(PluginContainer.java:232)
        at org.rhq.enterprise.agent.AgentMain.startPluginContainer(AgentMain.java:1726)
        at org.rhq.enterprise.agent.AgentMain.start(AgentMain.java:622)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.rhq.perftest.AgentSpawn$1.run(AgentSpawn.java:170)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.lang.NullPointerException
        at org.rhq.core.clientapi.agent.metadata.PluginMetadataManager.getType(PluginMetadataManager.java:179)
        at org.rhq.core.pc.inventory.InventoryFile.connectTypes(InventoryFile.java:113)
        at org.rhq.core.pc.inventory.InventoryFile.loadInventory(InventoryFile.java:105)
        ... 12 more


Comment 1 John Mazzitelli 2009-04-01 02:50:53 UTC
svn rev 3577 avoids an NPE if a resource's type is null. This only avoids the NPE; it doesn't address the underlying problem: WHY is this type null in the first place? Somehow the type appears to be null in inventory.dat when clearly it should not be.

I think it might be because I killed the agent before its first clean shutdown. If a clean agent (one without inventory.dat) starts but gets killed without shutting down nicely the first time, the inventory.dat left on disk still contains the unsynced types. I think we need to make sure the agent shuts down cleanly the first time so that it writes out inventory.dat with the full types. This is just a guess. This might be related to RHQ-979.
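
For illustration, the kind of null guard described for svn rev 3577 might look roughly like the sketch below. This is not the actual RHQ change: `TypeLookupSketch`, `TypeRef`, `register` and the map-based registry are hypothetical stand-ins, and only the idea of returning early when the persisted type is null comes from this report.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-in for a type registry; NOT RHQ's PluginMetadataManager. */
public class TypeLookupSketch {

    /** Hypothetical minimal resource type, identified by plugin and name. */
    public static final class TypeRef {
        final String plugin;
        final String name;

        public TypeRef(String plugin, String name) {
            this.plugin = plugin;
            this.name = name;
        }
    }

    // Registered types keyed by "plugin:name" (an assumption made for this sketch).
    private final Map<String, TypeRef> known = new HashMap<>();

    public void register(TypeRef type) {
        known.put(type.plugin + ":" + type.name, type);
    }

    /**
     * Null-safe lookup: a null persisted type yields null instead of
     * throwing a NullPointerException while building the lookup key.
     */
    public TypeRef getType(TypeRef persisted) {
        if (persisted == null) {
            return null; // guard: nothing to resolve
        }
        return known.get(persisted.plugin + ":" + persisted.name);
    }
}
```

As noted above, a guard like this only hides the symptom; the caller still has to decide what to do with a resource whose type could not be resolved.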

Comment 2 Joseph Marques 2009-08-06 16:31:13 UTC
Workaround suggestion: better error handling, such as continuing as if the inventory.dat file didn't exist and renaming the broken inventory.dat file on disk (as opposed to blindly removing it), which gives a chance to analyze it.
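
A minimal sketch of that suggestion, assuming the caller catches the load failure and then proceeds with an empty inventory. The class name `InventoryQuarantineSketch`, the method `quarantine` and the `.broken-<timestamp>` suffix are hypothetical and not part of RHQ; the file operations use the standard `java.nio.file` API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative helper; the names here are assumptions, not RHQ code. */
public class InventoryQuarantineSketch {

    /**
     * Moves an unreadable inventory file aside so that the caller can
     * continue as if no inventory file existed, while keeping the broken
     * file on disk for later analysis.
     *
     * @return the path of the renamed file, or null if there was no file
     */
    public static Path quarantine(Path inventoryFile) throws IOException {
        if (!Files.exists(inventoryFile)) {
            return null; // nothing to set aside
        }
        // Timestamp suffix (hypothetical naming scheme) keeps repeated failures apart.
        String brokenName = inventoryFile.getFileName() + ".broken-" + System.currentTimeMillis();
        Path target = inventoryFile.resolveSibling(brokenName);
        return Files.move(inventoryFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A caller could invoke `quarantine(...)` from the exception handler around the inventory load, log the returned path, and then start with a fresh inventory.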

Comment 3 Corey Welton 2009-08-06 17:12:32 UTC
Pushed to 1.4

Comment 4 Red Hat Bugzilla 2009-11-10 20:49:00 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1897


Comment 5 wes hayutin 2010-02-16 16:56:11 UTC
Temporarily adding the keyword "SubBug" so we can be sure we have accounted for all the bugs.

keyword:
new = Tracking + FutureFeature + SubBug

Comment 6 wes hayutin 2010-02-16 17:01:19 UTC
making sure we're not missing any bugs in rhq_triage

Comment 7 John Mazzitelli 2010-08-30 17:56:15 UTC
I know of no other time when someone reported this. Might be caused by an anomaly during perf testing using agentspawn (which is not a true production environment situation).

We have a new inventory --sync mechanism that we can use to force the agent to resync its inventory (and thus rebuild inventory.dat) in case it ever gets corrupted.

Since no one has seen this again in over a year, I say close as "cannot reproduce".

Comment 8 Sudhir D 2010-10-19 06:38:36 UTC
Closing this as WorksForMe. I wasn't able to reproduce it with the latest rhq-server-4.0.0-SNAPSHOT build.

If anyone sees this again, please reopen.