Description of problem:
After a server upgrade, some agents (not all) got stuck in a loop, syncing inventory almost continuously. The root cause isn't clear, but the following was observed in the agent log:

2013-10-05 01:33:26,275 INFO [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...
2013-10-05 01:33:26,279 INFO [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Executing runtime discovery scan rooted at [platform]...
...
2013-10-05 01:33:27,333 INFO [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Scanned platform and 0 server(s) and discovered 0 new descendant Resource(s).
2013-10-05 01:33:27,333 INFO [InventoryManager.discovery-1] (InventoryManager)- Sending [runtime] inventory report to Server...
2013-10-05 01:33:27,860 INFO [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...

As you can see, about 1 second later the same sync occurred again.

This appears to be a loop in InventoryManager:

    private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory) {

calls

    public boolean handleReport(InventoryReport report) {

which calls synchInventory(...) in a new thread, by way of

    this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
        configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
    ^^ the delay here is only 5 seconds...

and the cycle repeats itself.

I added additional pooled EJB instances and increased the communication concurrency setting, and this did *not* fix the problem. The problem went away on its own when I stopped most of the other agents.

Version-Release number of selected component (if applicable):
4.9

How reproducible:

Steps to Reproduce:
1. Happens when the server appears busy. I'm guessing there's just not enough time for the proper sync to happen.
2. There may be changes in resources/plugins before/after upgrade.
3.

Additional info:
This approach may fix the symptom but probably not the cause.

diff --git a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
index 2e4d52a..d2b1604 100644
--- a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
+++ b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
@@ -41,6 +41,7 @@
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executor;
 import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledFuture;
 import java.util.concurrent.ScheduledThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
@@ -216,6 +217,11 @@
      */
     private ResourceUpgradeDelegate resourceUpgradeDelegate = new ResourceUpgradeDelegate(this);
 
+    /**
+     * Prevent service scans from looping.
+     */
+    private volatile ScheduledFuture<? extends Object> serviceScan;
+
     public InventoryManager() {
         super(DiscoveryAgentService.class);
     }
@@ -1271,8 +1277,11 @@ private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory)
                     // requestAvailabilityCheck on each unknown or modified resource.
                     requestFullAvailabilityReport();
 
-                    this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
-                        configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                    // Don't schedule yet another scan if one is already pending
+                    if (serviceScan == null || serviceScan.isDone()) {
+                        this.serviceScan = this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
+                            configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                    }
                 }
             }
         } catch (Throwable t) {
             log.warn("Failed to synchronize local inventory with Server inventory for Resource [" + syncInfo.getId()
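
The intent of the patch (keep at most one service scan pending at a time) can be tried outside the agent. The following is only a minimal sketch under stated assumptions, not the InventoryManager code: the class ServiceScanGuardSketch, its methods, and the counter standing in for the real scan are all hypothetical names made up for illustration. Only ScheduledThreadPoolExecutor, ScheduledFuture and TimeUnit are real JDK types. It also makes requestScan() synchronized, which the patch does not do, so that the check and the schedule happen as one step.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the scheduling part of InventoryManager. */
class ServiceScanGuardSketch {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final long delaySeconds;
    // Counts how many (fake) scans actually ran; the real scan is not modeled here.
    private final AtomicInteger scansRun = new AtomicInteger();
    // Last scheduled scan, if any. Guarded by "this".
    private ScheduledFuture<?> pendingScan;

    ServiceScanGuardSketch(long delaySeconds) {
        this.delaySeconds = delaySeconds;
    }

    /** Returns true if a new scan was scheduled, false if one is still pending. */
    synchronized boolean requestScan() {
        if (pendingScan != null && !pendingScan.isDone()) {
            return false; // a scan is already queued or running, do not pile up another
        }
        Runnable scan = () -> scansRun.incrementAndGet();
        pendingScan = executor.schedule(scan, delaySeconds, TimeUnit.SECONDS);
        return true;
    }

    /** Number of scans waiting in the executor's queue. */
    int queuedScans() {
        return executor.getQueue().size();
    }

    /** Cancels anything still queued and stops the worker thread. */
    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        ServiceScanGuardSketch guard = new ServiceScanGuardSketch(5);
        // Simulate three inventory syncs arriving well within the 5 second delay.
        for (int i = 0; i < 3; i++) {
            System.out.println("sync " + i + " scheduled a scan: " + guard.requestScan());
        }
        System.out.println("queued scans: " + guard.queuedScans());
        guard.shutdown();
    }
}
```

With the guard in place, repeated syncs inside the delay window leave a single scan in the queue. As noted above, this only limits how many scans get queued; it does not explain why the syncs keep coming.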