Bug 1170525

Summary: Disabling agent plug-in or resource type for already discovered resource puts agent into discovery loop
Product: [JBoss] JBoss Operations Network
Reporter: bkramer <bkramer>
Component: Agent
Assignee: John Mazzitelli <mazz>
Status: CLOSED NEXTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Docs Contact:
Priority: high
Version: JON 3.2
CC: fbrychta, loleary, miburman
Target Milestone: ---   
Target Release: JON 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-01-15 20:31:34 UTC
Type: Bug
Bug Depends On: 1073201    
Bug Blocks:    

Description bkramer 2014-12-04 08:58:00 UTC
Clone of RHQ Bugzilla 1015734 (https://bugzilla.redhat.com/show_bug.cgi?id=1015734)
*******************************************************************************

Elias Ross 2013-10-04 23:27:02 EDT 

Description of problem:

After doing a server upgrade, I experienced a looping issue where some agents (not all) were doing inventory syncing fairly continuously. The root cause isn't clear, but the following was observed:

2013-10-05 01:33:26,275 INFO  [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...
2013-10-05 01:33:26,279 INFO  [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Executing runtime discovery scan rooted at [platform]...
...
2013-10-05 01:33:27,333 INFO  [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Scanned platform and 0 server(s) and discovered 0 new descendant Resource(s).
2013-10-05 01:33:27,333 INFO  [InventoryManager.discovery-1] (InventoryManager)- Sending [runtime] inventory report to Server...
2013-10-05 01:33:27,860 INFO  [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...

As you can see, about 1 second later the same sync occurred again. This appears to be a loop caused by the following call chain.

InventoryManager.
    private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory) {

calls 
    public boolean handleReport(InventoryReport report) {

calls
    synchInventory(...) in a new thread... by

                this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
                    configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);

^^ the delay here is only 5 seconds...

and the cycle repeats itself.
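
To make the suspected cycle concrete, here is a minimal single-threaded model of one reading of the chain above: a sync schedules a service scan, the scan sends a report, and handling the report triggers another sync. This is a sketch, not RHQ code. The class DiscoveryLoopSketch, the flag stateNeverConverges, and the step limit are hypothetical; only the method names synchInventory and handleReport are borrowed from the description.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of the sync -> scan -> report -> sync cycle; not RHQ code. */
class DiscoveryLoopSketch {
    // Stands in for the agent's scheduled executor; tasks run in FIFO order.
    private final Deque<Runnable> scheduled = new ArrayDeque<>();
    // Assumption: true means every sync still finds something to reconcile.
    private final boolean stateNeverConverges;
    private int syncCount = 0;

    DiscoveryLoopSketch(boolean stateNeverConverges) {
        this.stateNeverConverges = stateNeverConverges;
    }

    // Models synchInventory: counts the sync and, if needed, schedules a scan.
    void synchInventory() {
        syncCount++;
        if (stateNeverConverges) {
            scheduled.add(this::serviceScan);
        }
    }

    // Models the service scan: it discovers nothing new but still sends a report.
    void serviceScan() {
        handleReport();
    }

    // Models handleReport: the server's response leads to another sync.
    void handleReport() {
        scheduled.add(this::synchInventory);
    }

    /** Runs at most maxSteps queued tasks and returns how many syncs happened. */
    public static int run(boolean stateNeverConverges, int maxSteps) {
        DiscoveryLoopSketch agent = new DiscoveryLoopSketch(stateNeverConverges);
        agent.synchInventory();
        int steps = 0;
        while (!agent.scheduled.isEmpty() && steps < maxSteps) {
            agent.scheduled.poll().run();
            steps++;
        }
        return agent.syncCount;
    }

    public static void main(String[] args) {
        System.out.println("syncs, state never converges: " + run(true, 20));
        System.out.println("syncs, state converges: " + run(false, 20));
    }
}
```

In this model the loop only ends when a sync finds nothing left to reconcile. As long as that condition holds, the number of syncs grows with the step limit, which matches the repeating log lines above.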

I added additional pooled EJB instances and increased the communication concurrency setting, but this did *not* fix the problem.

The problem fixed itself when I stopped most of the other agents.


Version-Release number of selected component (if applicable): 4.9


How reproducible: 

Steps to Reproduce:
1. Seen in cases where the server appears busy. I'm guessing there's just not enough time for the proper sync to happen.
2. There may be changes in resources/plugins before/after upgrade.

Additional info:

*******************************************************************************

Elias Ross 2013-10-06 16:33:58 EDT

This approach may fix the symptom but probably not the cause.

diff --git a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
index 2e4d52a..d2b1604 100644
--- a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
+++ b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
@@ -41,6 +41,7 @@
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executor;
 import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledFuture;
 import java.util.concurrent.ScheduledThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
@@ -216,6 +217,11 @@
      */
     private ResourceUpgradeDelegate resourceUpgradeDelegate = new ResourceUpgradeDelegate(this);
 
+    /**
+     * Prevent service scans from looping.
+     */
+    private volatile ScheduledFuture<? extends Object> serviceScan;
+
     public InventoryManager() {
         super(DiscoveryAgentService.class);
     }
@@ -1271,8 +1277,11 @@ private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory)
                 // requestAvailabilityCheck on each unknown or modified resource.
                 requestFullAvailabilityReport();
 
-                this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
-                    configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                // Don't schedule yet another scan if one is already pending
+                if (serviceScan == null || serviceScan.isDone()) {
+                    this.serviceScan = this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
+                        configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+                }
             }
         } catch (Throwable t) {
             log.warn("Failed to synchronize local inventory with Server inventory for Resource [" + syncInfo.getId()
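
The intent of the guard in the patch above is to schedule a new service scan only when no earlier one is still pending. For the guard to allow the very first scan, its condition has to be true when the field is still null or when the previous future is done. A minimal, self-contained sketch of that idea follows. The class ScanGuardSketch, the method requestScan, and the delays are hypothetical and not part of RHQ; only the java.util.concurrent calls are real.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a "skip if already pending" scheduling guard. */
class ScanGuardSketch {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final AtomicInteger scansRun = new AtomicInteger();
    // Handle to the most recently scheduled scan; null until the first request.
    private volatile ScheduledFuture<?> serviceScan;

    /** Schedules a scan unless one is still pending; returns true if scheduled. */
    synchronized boolean requestScan(long delayMillis) {
        ScheduledFuture<?> pending = serviceScan;
        if (pending == null || pending.isDone()) {
            Callable<Integer> scan = scansRun::incrementAndGet; // stand-in for the real scan
            serviceScan = executor.schedule(scan, delayMillis, TimeUnit.MILLISECONDS);
            return true;
        }
        return false;
    }

    /** Returns {requests accepted in a burst, scans run, 1 if a later request was accepted}. */
    public static int[] demo() {
        ScanGuardSketch guard = new ScanGuardSketch();
        try {
            int accepted = 0;
            for (int i = 0; i < 5; i++) {
                if (guard.requestScan(500)) {
                    accepted++;
                }
            }
            guard.serviceScan.get(); // wait for the single pending scan to finish
            boolean acceptedAfterDone = guard.requestScan(0);
            guard.serviceScan.get();
            return new int[] { accepted, guard.scansRun.get(), acceptedAfterDone ? 1 : 0 };
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            guard.executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("accepted in burst=" + r[0] + ", scans run=" + r[1]
            + ", accepted after completion=" + r[2]);
    }
}
```

As noted in the comment above the patch, a guard like this limits how often scans are scheduled but does not address why the sync keeps asking for one.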

Comment 1 Michael Burman 2015-01-15 08:42:45 UTC
Probably fixed by BZ 1073201 (at least the linked customer case seems to be caused by that bug)

Comment 2 Larry O'Leary 2015-01-15 20:31:34 UTC
(In reply to Michael Burman from comment #1)
> Probably fixed by BZ 1073201 (at least the linked customer case seems to be
> caused by that bug)

Yes. It appears so. After further review, the inventory report received by the agent during inventory sync includes an unknown resource. It is this that is causing the agent to repeat the discovery scan. This issue was fixed in upstream bug 1073201 and that fix was released in JBoss ON 3.3.