Clone of RHQ Bugzilla 1015734 (https://bugzilla.redhat.com/show_bug.cgi?id=1015734)

*******************************************************************************
Elias Ross 2013-10-04 23:27:02 EDT

Description of problem:

After doing a server upgrade, I experienced a looping issue where some agents (not all) were doing inventory syncing fairly continuously. The root cause isn't clear, but the following was observed:

2013-10-05 01:33:26,275 INFO [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...
2013-10-05 01:33:26,279 INFO [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Executing runtime discovery scan rooted at [platform]...
...
2013-10-05 01:33:27,333 INFO [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Scanned platform and 0 server(s) and discovered 0 new descendant Resource(s).
2013-10-05 01:33:27,333 INFO [InventoryManager.discovery-1] (InventoryManager)- Sending [runtime] inventory report to Server...
2013-10-05 01:33:27,860 INFO [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...

As you can see, about 1 second later the same sync occurred again. This appears to be a loop, as follows:

InventoryManager.
    private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory) {
calls
    public boolean handleReport(InventoryReport report) {
calls synchInventory(...) in a new thread... by
    this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
        configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
^^ the delay here is only 5 seconds... and the cycle repeats itself.

I added additional pooled EJB instances and increased the communication concurrency setting, and this did *not* fix the problem. The problem fixed itself when I stopped most of the other agents.

Version-Release number of selected component (if applicable):
4.9

How reproducible:

Steps to Reproduce:
1.
In cases where the server appears busy. I'm guessing there's just not enough time for the proper sync to happen.
2. There may be changes in resources/plugins before/after upgrade.
3.

Additional info:

*******************************************************************************
Elias Ross 2013-10-06 16:33:58 EDT

This approach may fix the symptom but probably not the cause.

diff --git a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
index 2e4d52a..d2b1604 100644
--- a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
+++ b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
@@ -41,6 +41,7 @@
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executor;
 import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledFuture;
 import java.util.concurrent.ScheduledThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
@@ -216,6 +217,11 @@
      */
     private ResourceUpgradeDelegate resourceUpgradeDelegate = new ResourceUpgradeDelegate(this);
 
+    /**
+     * Prevent service scans from looping.
+     */
+    private volatile ScheduledFuture<? extends Object> serviceScan;
+
     public InventoryManager() {
         super(DiscoveryAgentService.class);
     }
@@ -1271,8 +1277,11 @@ private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory)
                 // requestAvailabilityCheck on each unknown or modified resource.
                 requestFullAvailabilityReport();
 
-                this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
-                    configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                // Don't schedule yet another scan we already did so
+                if (serviceScan != null && !serviceScan.isDone()) {
+                    this.serviceScan = this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
+                        configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                }
             }
         } catch (Throwable t) {
             log.warn("Failed to synchronize local inventory with Server inventory for Resource [" + syncInfo.getId()
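
The guard this patch is aiming for can be sketched in isolation. The following is a minimal, hypothetical sketch and not RHQ code: GuardedScanScheduler, requestScan and isScanPending are made-up names, and only ScheduledThreadPoolExecutor.schedule and ScheduledFuture.isDone are real JDK calls. It schedules a scan only when no earlier one is still pending. Note that this is the opposite of the condition as written in the diff above, which, as far as the excerpt shows, would never schedule anything because serviceScan starts out null.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the scheduling part of InventoryManager. */
class GuardedScanScheduler {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final Callable<Void> scan;
    private final long delayMillis;

    /** The most recently scheduled scan, or null if none was scheduled yet. */
    private ScheduledFuture<Void> pending;

    GuardedScanScheduler(Callable<Void> scan, long delayMillis) {
        this.scan = scan;
        this.delayMillis = delayMillis;
    }

    /**
     * Schedules a scan unless an earlier one is still waiting or running.
     * Returns true if a new scan was scheduled, false if the request was dropped.
     */
    synchronized boolean requestScan() {
        if (pending != null && !pending.isDone()) {
            return false; // a scan is already queued; do not pile up another one
        }
        pending = executor.schedule(scan, delayMillis, TimeUnit.MILLISECONDS);
        return true;
    }

    /** True while a scheduled scan has not finished yet. */
    synchronized boolean isScanPending() {
        return pending != null && !pending.isDone();
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        // Assumed 5 second delay, mirroring the delay mentioned in the description.
        GuardedScanScheduler scheduler = new GuardedScanScheduler(() -> null, 5000);
        System.out.println("first request scheduled: " + scheduler.requestScan());
        System.out.println("second request scheduled: " + scheduler.requestScan());
        scheduler.shutdown();
    }
}
```

The synchronized methods make the check and the assignment one atomic step. A volatile field alone, as in the diff, would still let two threads pass the check at the same time and both schedule a scan.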
Probably fixed by BZ 1073201 (at least the linked customer case seems to be caused by that bug)
(In reply to Michael Burman from comment #1)
> Probably fixed by BZ 1073201 (at least the linked customer case seems to be
> caused by that bug)

Yes. It appears so. After further review, the inventory report received by the agent during inventory sync includes an unknown resource. It is this that is causing the agent to repeat the discovery scan. This issue was fixed in upstream bug 1073201 and that fix was released in JBoss ON 3.3.