Bug 1015734
| Summary: | RuntimeDiscoveryExecutor can execute discovery scans in a loop | | |
|---|---|---|---|
| Product: | [Other] RHQ Project | Reporter: | Elias Ross <genman> |
| Component: | Agent | Assignee: | Nobody <nobody> |
| Status: | NEW | CC: | bkramer, hrupp |
| Severity: | unspecified | Priority: | unspecified |
| Version: | 4.9 | Type: | Bug |
| Hardware: | Unspecified | OS: | Unspecified |
| Doc Type: | Bug Fix | | |
This approach may fix the symptom but probably not the cause.
diff --git a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
index 2e4d52a..d2b1604 100644
--- a/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
+++ b/modules/core/plugin-container/src/main/java/org/rhq/core/pc/inventory/InventoryManager.java
@@ -41,6 +41,7 @@
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.Future;
+import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
@@ -216,6 +217,11 @@
*/
private ResourceUpgradeDelegate resourceUpgradeDelegate = new ResourceUpgradeDelegate(this);
+ /**
+ * Prevent service scans from looping.
+ */
+ private volatile ScheduledFuture<? extends Object> serviceScan;
+
public InventoryManager() {
super(DiscoveryAgentService.class);
}
@@ -1271,8 +1277,11 @@ private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory)
// requestAvailabilityCheck on each unknown or modified resource.
requestFullAvailabilityReport();
- this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
- configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+ // Only schedule a scan if none is pending (never scheduled, or the last one finished)
+ if (serviceScan == null || serviceScan.isDone()) {
+ this.serviceScan = this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor,
+ configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);
+ }
}
} catch (Throwable t) {
log.warn("Failed to synchronize local inventory with Server inventory for Resource [" + syncInfo.getId()
Description of problem:

After doing a server upgrade, I experienced a looping issue where some agents (not all) were syncing inventory almost continuously. The root cause isn't clear, but the following was observed:

2013-10-05 01:33:26,275 INFO [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...
2013-10-05 01:33:26,279 INFO [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Executing runtime discovery scan rooted at [platform]...
...
2013-10-05 01:33:27,333 INFO [InventoryManager.discovery-1] (RuntimeDiscoveryExecutor)- Scanned platform and 0 server(s) and discovered 0 new descendant Resource(s).
2013-10-05 01:33:27,333 INFO [InventoryManager.discovery-1] (InventoryManager)- Sending [runtime] inventory report to Server...
2013-10-05 01:33:27,860 INFO [InventoryManager.discovery-1] (InventoryManager)- Syncing local inventory with Server inventory...

As you can see, about 1 second later the same sync occurred again. This appears to be a loop in InventoryManager, as follows:

private void synchInventory(ResourceSyncInfo syncInfo, boolean partialInventory) {

calls

public boolean handleReport(InventoryReport report) {

which calls synchInventory(...) in a new thread, by

this.inventoryThreadPoolExecutor.schedule((Callable<? extends Object>) this.serviceScanExecutor, configuration.getChildResourceDiscoveryDelay(), TimeUnit.SECONDS);

^^ the delay here is only 5 seconds...

and the cycle repeats itself.

I added additional pooled EJB instances and increased the communication concurrency setting, and this did *not* fix the problem. The problem fixed itself when I stopped most of the other agents.

Version-Release number of selected component (if applicable):
4.9

How reproducible:

Steps to Reproduce:
1. In cases where the server appears busy. I'm guessing there's just not enough time for the proper sync to happen.
2. There may be changes in resources/plugins before/after upgrade.
3.

Additional info:
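For illustration, the guard the patch is aiming for (skip scheduling while a previously scheduled scan has not finished) can be sketched in isolation. This is only a minimal standalone sketch, not RHQ code: the class `ServiceScanGuard`, its methods, and the millisecond delay are hypothetical stand-ins, and the "scan" is just a counter increment. It uses only `ScheduledThreadPoolExecutor.schedule(Callable, long, TimeUnit)` and `Future.isDone()`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for InventoryManager; only the scheduling guard is modelled.
class ServiceScanGuard {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final AtomicInteger scansRun = new AtomicInteger();
    private final long delayMillis;

    // Handle to the most recently scheduled scan, or null if none was scheduled yet.
    private ScheduledFuture<?> serviceScan;

    ServiceScanGuard(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    // Schedules a scan unless one is still queued or running.
    // Returns true if a new scan was scheduled, false if the request was skipped.
    synchronized boolean requestScan() {
        if (serviceScan != null && !serviceScan.isDone()) {
            return false;
        }
        // Placeholder for the real discovery work: just count executions.
        Callable<Integer> scan = scansRun::incrementAndGet;
        serviceScan = executor.schedule(scan, delayMillis, TimeUnit.MILLISECONDS);
        return true;
    }

    // Blocks until the most recently scheduled scan (if any) has completed.
    void awaitPendingScan() {
        ScheduledFuture<?> pending;
        synchronized (this) {
            pending = serviceScan;
        }
        if (pending == null) {
            return;
        }
        try {
            pending.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    int scansRun() {
        return scansRun.get();
    }

    void shutdown() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        ServiceScanGuard guard = new ServiceScanGuard(200);
        System.out.println("first request scheduled: " + guard.requestScan());
        System.out.println("second request scheduled: " + guard.requestScan());
        guard.awaitPendingScan();
        System.out.println("scans run: " + guard.scansRun());
        guard.shutdown();
    }
}
```

The check and the assignment sit in one `synchronized` method so that two concurrent sync threads cannot both see "no pending scan" and schedule twice; a `volatile` field alone would not rule that out. As the comment on the patch says, this only limits the symptom and does not explain why the syncs are triggered so often.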