Description of problem:
The issue concerns resources that have an inventory status of DELETED, which is not the same as being uninventoried. Some resource types, like EARs and WARs, support deletion. This means you can physically delete the resource through RHQ. Deleting a resource does not, however, remove it from inventory. While a resource with an inventory status of DELETED is not visible in the UI, it is not purged from the database, as it would be if the resource were uninventoried. The reason for this is that if the resource is later added back (e.g., by redeploying a newer version of a WAR file), we simply change the inventory status back to COMMITTED while preserving important information like audit trail history and metrics.
If you have an agent already in inventory and restart it with its local inventory purged, the agent receives the platform resource from the server during the initial inventory sync. The agent subsequently sends a request to the server to fetch the resource tree of the platform. The method on the server, ResourceManagerBean.getResourceTree, does not filter out DELETED resources. This means that the agent can end up with a DELETED resource in its local inventory.
This is problematic because it causes the server to reject inventory reports from the agent. When the agent sends an inventory report to the server, the server does some validation on the report which occurs in DiscoveryBossBean.validateInventoryReport. One of those checks is that the inventory status of each resource in the report is not DELETED.
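The kind of check described above can be sketched as follows. This is a minimal illustration, not the actual code in DiscoveryBossBean.validateInventoryReport: the ReportValidationSketch class, its nested Resource and InventoryStatus types, and the isAcceptable method are hypothetical stand-ins, and returning a boolean (rather than whatever the real method does on failure) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the RHQ domain types.
final class ReportValidationSketch {
    enum InventoryStatus { NEW, COMMITTED, DELETED }

    static final class Resource {
        final String name;
        final InventoryStatus status;
        final List<Resource> children = new ArrayList<>();

        Resource(String name, InventoryStatus status) {
            this.name = name;
            this.status = status;
        }
    }

    // Returns true only if no resource in the reported tree is DELETED.
    static boolean isAcceptable(Resource root) {
        if (root.status == InventoryStatus.DELETED) {
            return false;
        }
        for (Resource child : root.children) {
            if (!isAcceptable(child)) {
                return false;
            }
        }
        return true;
    }
}
```

With a check of this shape, a single DELETED WAR anywhere under the platform is enough to make the whole report fail validation.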
On the one hand, a DELETED resource might be harmless in terms of how the server processes the report. On the other hand, there could be more problems on the agent side above and beyond inventory reports getting rejected. The bottom line is, the agent should never have DELETED resources in its local inventory. The agent's local inventory should reflect what is actually on the managed platform, and by definition, a DELETED resource has been physically deleted and no longer exists on the managed platform.
Version-Release number of selected component (if applicable):
any time you have a deleted resource
Steps to Reproduce:
1. Import a Tomcat server (with a deployed WAR) into inventory.
2. In the resource tree in the UI, expand Tomcat Virtual Hosts and then select localhost.
3. Go to the Inventory tab and click on the Child Resources subtab.
4. Select a WAR file and click the Delete button in the page footer.
5. After the WAR file has been deleted from the webapps directory, restart your agent with the -u option to purge the local inventory.
6. Wait a couple of minutes for the agent to initialize, and then from the agent prompt run: inventory -xe inventory.xml
Look in the inventory.xml file and you will see the DELETED resource.
The agent's local inventory should not contain any resources with an inventory status of DELETED. ResourceManagerBean.getResourceTree needs to filter out DELETED resources.
This needs further review:
-what are the concrete user-facing results of this issue?
-what are the risks around changing the current behaviour, how complex is the fix?
After further investigation, this issue is more severe than I initially thought. It manifests itself in InventoryManager.initialize when the plugin container initializes. The method InventoryManager.mergeUnknownResources is where the call to the server is made to fetch the resource tree (and where the DELETED resources are fetched along with it). That method gets called during initialization from the call to activateAndUpgradeResources around line 232-ish. The initialization of the discovery and availability jobs does not occur until after the call to activateAndUpgradeResources. The net effect is that this bug prevents InventoryManager from fully initializing, which in turn prevents availability and discovery scans from running.
The fix is pretty straightforward. ResourceManagerBean.getResourceTree should be updated to filter out resources with an inventory status of DELETED. The only caller of this method is the PC/InventoryManager via DiscoveryServerServiceImpl.getResources; so, the scope of the change is narrow. getResourceTree is not part of the remote APIs.
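To illustrate the shape of the proposed fix, here is a minimal sketch of pruning DELETED resources from a resource tree before it is returned. It is not the actual ResourceManagerBean.getResourceTree code: the DeletedFilterSketch class, its Node and InventoryStatus types, and the withoutDeleted method are hypothetical names, and dropping the descendants of a DELETED resource along with it is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the RHQ domain types.
final class DeletedFilterSketch {
    enum InventoryStatus { NEW, COMMITTED, DELETED }

    static final class Node {
        final String name;
        final InventoryStatus status;
        final List<Node> children = new ArrayList<>();

        Node(String name, InventoryStatus status) {
            this.name = name;
            this.status = status;
        }
    }

    // Returns a copy of the tree with DELETED resources (and, by assumption,
    // everything beneath them) removed. Returns null if the node itself is DELETED.
    static Node withoutDeleted(Node node) {
        if (node.status == InventoryStatus.DELETED) {
            return null;
        }
        Node copy = new Node(node.name, node.status);
        for (Node child : node.children) {
            Node kept = withoutDeleted(child);
            if (kept != null) {
                copy.children.add(kept);
            }
        }
        return copy;
    }
}
```

The sketch copies the tree rather than mutating it, so the server-side data is left untouched and only the tree sent to the agent is pruned.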
The overall risk is low in my opinion since the issue can easily be reproduced and is easily testable.
This needs two sets of eyes on it before doing a fix. I'm just paranoid about things which impact InventoryManager :-)
I wonder how a DELETED resource ended up in the list of resources processed by mergeUnknownResources().
The synchInventory() method (which is the only caller of mergeUnknownResources()) sorts out the resources into several buckets prior to calling handlers of these "resource buckets" (one of which is the mergeUnknownResources() method). The method that sorts out the resources is called processSyncInfo() and it puts any resource with the status DELETED into a different bucket than the one handled using mergeUnknownResources().
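The bucket sorting described above can be pictured roughly like this. It is only a sketch of the idea, not the real processSyncInfo(): the SyncBucketSketch class, the SyncInfo and Buckets types, and the rule that "unknown" means "not DELETED and not present locally" are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of sorting sync info into buckets.
final class SyncBucketSketch {
    enum InventoryStatus { NEW, COMMITTED, DELETED }

    static final class SyncInfo {
        final int id;
        final InventoryStatus status;

        SyncInfo(int id, InventoryStatus status) {
            this.id = id;
            this.status = status;
        }
    }

    static final class Buckets {
        final Set<Integer> unknownIds = new LinkedHashSet<>();
        final Set<Integer> deletedIds = new LinkedHashSet<>();
    }

    // DELETED resources go into their own bucket; resources the agent has
    // never seen (assumed: ID not in the local inventory) go into "unknown".
    static Buckets sort(List<SyncInfo> fromServer, Set<Integer> localIds) {
        Buckets buckets = new Buckets();
        for (SyncInfo info : fromServer) {
            if (info.status == InventoryStatus.DELETED) {
                buckets.deletedIds.add(info.id);
            } else if (!localIds.contains(info.id)) {
                buckets.unknownIds.add(info.id);
            }
        }
        return buckets;
    }
}
```

Under these assumptions, a DELETED resource never appears directly in the unknown bucket, which is what prompts the question of how one reaches mergeUnknownResources() at all.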
I have updated the code to filter out DELETED resources in the resource tree sent to the agent.
master commit hash: 997263c268dac476fbdf6b6e76cb0988b69a9748
Lukas, the problem is that in mergeUnknownResources we have,
Set<Resource> unknownResources =
getResources delegates to ResourceManagerBean.getResourceTree, which does not filter out DELETED resources. After that server call returns, InventoryManager iterates over each resource, calling merge, which effectively adds it to the local inventory.
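A rough sketch of why the merge lets a DELETED resource through, assuming the server returns whole subtrees for the unknown resources and the agent adds every node it receives. The MergeUnknownSketch class, its Resource type, and the map used as a "local inventory" are hypothetical simplifications, not the real InventoryManager code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the agent-side merge.
final class MergeUnknownSketch {
    enum InventoryStatus { NEW, COMMITTED, DELETED }

    static final class Resource {
        final int id;
        final InventoryStatus status;
        final List<Resource> children = new ArrayList<>();

        Resource(int id, InventoryStatus status) {
            this.id = id;
            this.status = status;
        }
    }

    // Adds the resource and all of its descendants to the local inventory
    // without looking at the inventory status.
    static void merge(Resource resource, Map<Integer, InventoryStatus> localInventory) {
        localInventory.put(resource.id, resource.status);
        for (Resource child : resource.children) {
            merge(child, localInventory);
        }
    }

    // Convenience helper: merge a list of trees into a fresh local inventory.
    static Map<Integer, InventoryStatus> mergeAll(List<Resource> treesFromServer) {
        Map<Integer, InventoryStatus> localInventory = new LinkedHashMap<>();
        for (Resource root : treesFromServer) {
            merge(root, localInventory);
        }
        return localInventory;
    }
}
```

Even if only the platform itself is requested as an unknown resource, an unfiltered tree carries its DELETED descendants along, and a status-blind merge stores them locally.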
Created attachment 577818
CLI script to uninventory deleted resources