Bug 1170525 - Disabling agent plug-in or resource type for already discovered resource puts agent into discovery loop
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Agent
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: JON 3.3.0
Assignee: John Mazzitelli
QA Contact: Mike Foley
URL:
Whiteboard:
Depends On: 1073201
Blocks:
Reported: 2014-12-04 08:58 UTC by bkramer
Modified: 2019-02-15 13:54 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-01-15 20:31:34 UTC
Type: Bug
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1015734 0 unspecified NEW RuntimeDiscoveryExecutor can execute discovery scans in a loop 2022-03-31 04:27:58 UTC
Red Hat Knowledge Base (Solution) 1274143 0 None None None Never

Internal Links: 1015734

Description bkramer 2014-12-04 08:58:00 UTC
Clone of RHQ Bugzilla 1015734 (https://bugzilla.redhat.com/show_bug.cgi?id=1015734)
*******************************************************************************

Elias Ross 2013-10-04 23:27:02 EDT 

Description of problem:

After doing a server upgrade, I experienced a looping issue where some agents (not all) were doing inventory syncing fairly continuously. The root cause isn't clear, but the following was observed:

2013-10-05 01:33:26,275 INFO  [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...
2013-10-05 01:33:26,279 INFO  [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Executing runtime discovery scan rooted at [platform]...
...
2013-10-05 01:33:27,333 INFO  [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Scanned platform and 0 server(s) and discovered 0 new descendant Resource(s).
2013-10-05 01:33:27,333 INFO  [InventoryManager.discovery-1] (InventoryManager)- Sending [runtime] inventory report to Server...
2013-10-05 01:33:27,860 INFO  [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...

As you can see, the same sync occurred again about 1 second later. This appears to be a loop caused by the following call chain:

InventoryManager.
    private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory) {

calls 
    public boolean handleReport(InventoryReport report) {

calls
    synchInventory(...) in a new thread... by

                this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
                    configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);

^^ the delay here is only 5 seconds...

and the cycle repeats itself.

I added additional pooled EJB instances and increased the communication concurrency settings, and this did *not* fix the problem.

The problem fixed itself when I stopped most of the other agents.


Version-Release number of selected component (if applicable): 4.9


How reproducible: 

Steps to Reproduce:
1. Occurs in cases where the server appears busy. I'm guessing there's just not enough time for the proper sync to happen.
2. There may be changes in resources/plugins before/after upgrade.
3.

Additional info:

*******************************************************************************

Elias Ross 2013-10-06 16:33:58 EDT

This approach may fix the symptom but probably not the cause.

diff --git a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
index 2e4d52a..d2b1604 100644
--- a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
+++ b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
@@ -41,6 +41,7 @@
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executor;
 import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledFuture;
 import java.util.concurrent.ScheduledThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
@@ -216,6 +217,11 @@
      */
     private ResourceUpgradeDelegate resourceUpgradeDelegate = new ResourceUpgradeDelegate(this);
 
+    /**
+     * Prevent service scans from looping.
+     */
+    private volatile ScheduledFuture<? extends Object> serviceScan;
+
     public InventoryManager() {
         super(DiscoveryAgentService.class);
     }
@@ -1271,8 +1277,11 @@ private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory)
                 // requestAvailabilityCheck on each unknown or modified resource.
                 requestFullAvailabilityReport();
 
-                this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
-                    configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                // Don't schedule yet another scan if one is already pending
+                if (serviceScan == null || serviceScan.isDone()) {
+                    this.serviceScan = this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
+                        configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                }
             }
         } catch (Throwable t) {
             log.warn("Failed to synchronize local inventory with Server inventory for Resource [" + syncInfo.getId()
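
For illustration only, here is a minimal stand-alone sketch of what the patch above is trying to do: skip scheduling a new service scan while a previously scheduled one is still pending. It is not RHQ code. The class ServiceScanGuard and its methods requestScan and awaitScan are hypothetical names. The sketch also assumes a single-threaded scheduler and a millisecond delay to keep the example short.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a "schedule at most one pending scan" guard. */
final class ServiceScanGuard {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final Runnable scan;
    private final long delayMillis;
    private ScheduledFuture<?> pending; // guarded by "this"

    ServiceScanGuard(Runnable scan, long delayMillis) {
        this.scan = scan;
        this.delayMillis = delayMillis;
    }

    /** Schedules a scan unless one is already pending or running; returns true if scheduled. */
    synchronized boolean requestScan() {
        if (pending != null && !pending.isDone()) {
            return false; // a scan is already queued, so do not pile up another
        }
        pending = executor.schedule(scan, delayMillis, TimeUnit.MILLISECONDS);
        return true;
    }

    /** Waits for the most recently scheduled scan; returns false on timeout, failure or interrupt. */
    boolean awaitScan(long timeoutMillis) {
        ScheduledFuture<?> current;
        synchronized (this) {
            current = pending;
        }
        if (current == null) {
            return true; // nothing was ever scheduled
        }
        try {
            current.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        AtomicInteger runs = new AtomicInteger();
        ServiceScanGuard guard = new ServiceScanGuard(() -> runs.incrementAndGet(), 200);
        System.out.println("first request scheduled: " + guard.requestScan());
        System.out.println("second request scheduled: " + guard.requestScan());
        guard.awaitScan(2000);
        System.out.println("scans executed: " + runs.get());
        guard.shutdown();
    }
}
```

A guard like this only limits how many scans are queued at once. As noted above, it addresses the symptom and does not explain why the sync keeps being triggered.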

Comment 1 Michael Burman 2015-01-15 08:42:45 UTC
Probably fixed by BZ 1073201 (at least the linked customer case seems to be caused by that bug)

Comment 2 Larry O'Leary 2015-01-15 20:31:34 UTC
(In reply to Michael Burman from comment #1)
> Probably fixed by BZ 1073201 (at least the linked customer case seems to be
> caused by that bug)

Yes. It appears so. After further review, the inventory report received by the agent during inventory sync includes an unknown resource. It is this that is causing the agent to repeat the discovery scan. This issue was fixed in upstream bug 1073201 and that fix was released in JBoss ON 3.3.

