Bug 534385 (RHQ-1187)
Summary: | improve performance of resource uninventory | ||
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | John Mazzitelli <mazz> |
Component: | Performance | Assignee: | Joseph Marques <jmarques> |
Status: | CLOSED NEXTRELEASE | QA Contact: | Pavel Kralik <pkralik> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | unspecified | CC: | mvecera |
Target Milestone: | --- | Keywords: | Improvement |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
URL: | http://jira.rhq-project.org/browse/RHQ-1187 | ||
Whiteboard: | |||
Fixed In Version: | 1.3 | Doc Type: | Enhancement |
Bug Depends On: | 534390, 534391, 535428 | ||
Bug Blocks: |
Description
John Mazzitelli
2008-11-26 17:18:00 UTC
We need to think about how this affects deleting agents too.

When we start doing this, we should also allow ignoring individual services. Those should not be uninventoried, but marked as ignored so that they don't directly show up again after a discovery. The GUI should perhaps change the 'uninventory' button for those to 'ignore'. The AD portlet (or the parent's inventory?) should allow bringing them back in.

We *need* this implemented. (BTW: in addition to changing the status, we need to change the resource key to something like "~~dummy~~".)

My recent thoughts:

* need a new UNINVENTORIED type for InventoryStatus to distinguish from deleted resources (delete button from the inventory tab of the parent resource)

For the in-band work:

* change all statuses to UNINVENTORIED in one fell swoop - there are queries in ResourceManager that show how to do this to a resource tree with at most 6 levels of depth in a single query
* modify the resource key to junk so that the discovery mechanisms on the agent think that the entire resource tree has been successfully deleted
* set the agent references on all uninventoried resources to null - this will prevent the synchronization job from seeing two platforms (the old one with the junk resource key, and the new one after the next discovery run)
* after the first three steps are complete, notify the agent of the uninventory action for the given resources - this should trigger a new auto-discovery, which works because the ResourceSyncInfo tree the server returns to the agent will be the empty set

For the out-of-band work:

* a quartz job can come along, look to see if any resources have the UNINVENTORIED status, and take its sweet time cleaning up after them - no need to batch resources in sizes of 200, the size can be 1 to keep the transaction small, especially considering that resources which have been in inventory for a long time can conceivably have lots of audited data scattered across all parts of the domain

Other thoughts:

* use the HibernatePerformanceMonitor utility to profile uninventorying a single resource today (one which has a reasonable amount of history across various parts of the domain) to see where our bottlenecks are. Consider reworking the slowest points
* what happens if there is an error during the async uninventory? Will that resource forever remain in a loop trying to uninventory itself and always failing? Do we ever mark a resource as "bad" and stop trying to uninventory it? How would this error even be reported back to one or more registered users of the system now that it's done completely in the background?

rev4160 - this commit adds asynchronous uninventory and fixes several spots of row contention:

* [RHQ-1187][RHQ-1191][RHQ-1192] - asynchronous uninventory by setting uninventoried resources' agent references to null to stop the majority of agent-side sync, setting the parent to null to take it out of the object graph, using a special UNINVENTORY inventoryStatus so it doesn't conflict with existing semantics around any other state, and using a dummy resourceKey so that the next discovery doesn't collide
* [RHQ-1324] - failures caused by specific timings of uninventory followed by reinventory are no longer possible because uninventory of the entire resource tree occurs atomically in one bulk update statement, then the agent is notified if successful
* [RHQ-2124][RHQ-1656][RHQ-1221] - removed hot spots and various other points of contention by shortening transaction times or using indexes as available for: a) uninventory work, b) cloud manager job, c) check for suspect agent job, d) dynagroup recalculation job, e) alerts cache in-band agent and server status bit setting, f) isAgentBackfilled checking

Asynchronous uninventory verified. r4181

This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1187
This bug is related to RHQ-914
This bug is related to RHQ-1324
This bug relates to RHQ-2218
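
For illustration only, the in-band and out-of-band steps described above can be sketched in plain Java. This is a minimal in-memory model, not the RHQ implementation: the class `UninventorySketch`, its `Resource` type, and the methods `markUninventoried` and `purgeOne` are hypothetical names, and a list stands in for the database. In the real system the marking would be a single bulk update and each purge would run in its own short transaction from a scheduled job.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class UninventorySketch {

    enum InventoryStatus { COMMITTED, UNINVENTORIED }

    /** Hypothetical in-memory stand-in for a resource row. */
    static final class Resource {
        final int id;
        String resourceKey;
        Integer agentId; // null once detached from its agent
        Resource parent;
        InventoryStatus status = InventoryStatus.COMMITTED;
        final List<Resource> children = new ArrayList<>();

        Resource(int id, String resourceKey, Integer agentId, Resource parent) {
            this.id = id;
            this.resourceKey = resourceKey;
            this.agentId = agentId;
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }
    }

    /**
     * In-band step: mark the whole subtree under root in one pass.
     * Status becomes UNINVENTORIED, the key becomes junk, and the agent
     * and parent references are cleared. Returns the marked resources.
     */
    static List<Resource> markUninventoried(Resource root) {
        if (root.parent != null) {
            root.parent.children.remove(root); // take the subtree out of the graph
        }
        List<Resource> marked = new ArrayList<>();
        Deque<Resource> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Resource r = pending.pop();
            r.status = InventoryStatus.UNINVENTORIED;
            r.resourceKey = "~~dummy~~"; // junk key so rediscovery does not collide
            r.agentId = null;            // hides the resource from agent sync
            r.parent = null;
            marked.add(r);
            for (Resource child : r.children) {
                pending.push(child);
            }
        }
        return marked;
    }

    /**
     * Out-of-band step: remove at most one UNINVENTORIED resource per call
     * (batch size 1). Returns false when nothing is left to clean up.
     */
    static boolean purgeOne(List<Resource> inventory) {
        Iterator<Resource> it = inventory.iterator();
        while (it.hasNext()) {
            if (it.next().status == InventoryStatus.UNINVENTORIED) {
                it.remove();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Resource platform = new Resource(1, "platform-key", 10, null);
        Resource server = new Resource(2, "server-key", 10, platform);
        Resource service = new Resource(3, "service-key", 10, server);
        List<Resource> inventory =
                new ArrayList<>(Arrays.asList(platform, server, service));

        int marked = markUninventoried(server).size();
        int purged = 0;
        while (purgeOne(inventory)) { // a scheduled job would make one call per run
            purged++;
        }
        System.out.println("marked=" + marked + " purged=" + purged
                + " remaining=" + inventory.size());
    }
}
```

Separating the cheap marking pass from the slow cleanup is what lets the user-facing action return quickly; the sketch does not address the open question above about resources whose cleanup fails repeatedly.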