Bug 1023451
| Summary: | [perf] Large retained heap during inventory report merge (causing OOMs) | |||
|---|---|---|---|---|
| Product: | [Other] RHQ Project | Reporter: | Lukas Krejci <lkrejci> | |
| Component: | Core Server | Assignee: | Jay Shaughnessy <jshaughn> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 4.9 | CC: | hrupp, jsanda, jshaughn, loleary, mazz, myarboro, rhatlapa, vnguyen | |
| Target Milestone: | --- | |||
| Target Release: | RHQ 4.10 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | | Doc Type: | Bug Fix | |
| Doc Text: | | Story Points: | --- | |
| Clone Of: | ||||
| : | 1029688 (view as bug list) | Environment: | ||
| Last Closed: | 2014-04-23 12:31:30 UTC | Type: | Bug | |
| Bug Blocks: | 1024397, 1029688 | |||
Description
Lukas Krejci
2013-10-25 13:03:11 UTC
There was a lot of work done to improve the performance (speed) of merging inventory. The optimization in place efficiently handles large hierarchies and varying child cardinalities. Changes here must not compromise throughput, so testing must measure throughput as well as memory consumption. A second HA server would possibly mitigate the issue and may not be unreasonable for such a configuration. I too am not sure about the Resource Type issue; it seems that a caching mechanism here would have been sufficient, but maybe it has issues. Also, I do remember that there is code in place that already tries to help out GC, but perhaps we can do more. Maybe this doesn't need to be said, but in short, it's very important to have a holistic understanding of this mechanism, including the transactioning, before making further optimizations.

To recap the symptoms of this bug:

1) We limit the number of concurrently processed inventory reports; by default we process at most 5 of them.
2) The heap dump I have at hand is consistent with this finding:
   a) It contains 2 threads with abnormal (relative to the rest) memory usage of > 50MB. The stack traces of these 2 threads show that they are executing inside DiscoveryBossBean.mergeInventory().
   b) There is a great number of threads with memory usage of approx. 20MB that seem to be in the process of deserializing the requests for mergeInventory (judged by the presence of an InventoryReport instance in their heaps).
   c) There are altogether 45 threads handling HTTP input that account for the majority of the consumed memory.

So the issue here, I think, is twofold:

1) We consume large chunks of memory while handling inventory reports.
2) Large inventories are actually quite expensive to transfer as command requests. Even before we can decide that a given inventory merge request is to be ignored (because of concurrency limits), we can consume as much as 20MB of memory.
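To make the twofold problem concrete, here is a minimal hypothetical sketch (not RHQ code) of a `@LimitedConcurrency`-style gate. The key point from the heap-dump analysis is that the gate only runs *after* the InventoryReport has been deserialized, so even a rejected request has already paid the ~20MB cost:

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of a limited-concurrency gate. All names are
// illustrative; the point is the ordering of deserialization vs. the check.
public class ConcurrencyGateSketch {
    private static final Semaphore PERMITS = new Semaphore(5); // default: 5 concurrent merges

    // Stand-in for the real InventoryReport, which is large once deserialized.
    static final class InventoryReport {}

    static boolean tryMerge(byte[] wirePayload) {
        // The expensive step happens unconditionally: the payload becomes a
        // full object graph before the concurrency check ever runs.
        InventoryReport report = deserialize(wirePayload);
        if (!PERMITS.tryAcquire()) {
            return false; // request discarded, but the memory was already consumed
        }
        try {
            // ... merge 'report' into inventory ...
            return true;
        } finally {
            PERMITS.release();
        }
    }

    static InventoryReport deserialize(byte[] payload) {
        return new InventoryReport(); // stand-in for real Java deserialization
    }
}
```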
So it seems that rather than trying to minimize the memory usage while actually merging the inventories (this came through speed optimization efforts, which might have increased the memory usage but definitely increased throughput), we should concentrate on changing the remote invocations such that we can stop an invocation even before we need to deserialize its full set of parameters. One solution that comes to my mind is to make use of our remote streams to pass all the parameters (the open question being what to do with parameters that are streams themselves). We could then decide what to do with the invocation, and only after we determined that it can be invoked would we read its params from the remote stream. Mazz, what do you think about this?

After talking with Mazz I am trying out this simple approach: change the signature of DiscoveryServerService.mergeInventoryReport from:

MergeInventoryReportResults mergeInventoryReport(InventoryReport)

to:

MergeInventoryReportResults mergeInventoryReport(InputStream)

The InventoryReport object will travel to the server in serialized form in an input stream. This means that the communication infrastructure can process the @LimitedConcurrency and other annotations and proceed to actually executing the impl of the above method without the huge InventoryReport taking up (nearly) any memory. The DiscoveryServerServiceImpl will then deserialize the InventoryReport from the provided stream and pass it down to DiscoveryBossBean, which does the real work and remains unchanged. As such, the code changes are rather small and we gain the ability to deserialize the InventoryReport lazily and only when really needed.

In other words, we decided not to change the implementation of the inventory merge in any manner, as that is highly optimized and complex code, even if the memory consumption can be quite high during it.
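A minimal sketch of the proposed signature change might look like the following. The method names follow the comment above, but the types and stream handling here are simplified assumptions, not the actual RHQ implementation:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch: the server endpoint receives a stream and only deserializes the
// report once it has decided the call will actually run.
public class LazyMergeSketch {

    // Simplified stand-ins for the real RHQ types.
    static final class InventoryReport implements Serializable {
        final int resourceCount;
        InventoryReport(int resourceCount) { this.resourceCount = resourceCount; }
    }

    static final class MergeInventoryReportResults implements Serializable {
        final int merged;
        MergeInventoryReportResults(int merged) { this.merged = merged; }
    }

    // Old shape: mergeInventoryReport(InventoryReport)
    // New shape: the report travels serialized inside the stream, so the
    // concurrency check can run before any deserialization cost is paid.
    static MergeInventoryReportResults mergeInventoryReport(InputStream in)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            InventoryReport report = (InventoryReport) ois.readObject();
            return new MergeInventoryReportResults(report.resourceCount);
        }
    }

    // Agent side: serialize the report into the bytes it sends up.
    static byte[] serialize(InventoryReport report) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(report);
        }
        return baos.toByteArray();
    }
}
```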
Instead, we make sure that we consume no additional memory for merge inventory requests that are to be discarded anyway. In scenarios like the one described in this BZ, those discarded requests will greatly outnumber the actually processed ones.

I think this is going to need a more thorough look when we look at the agent-server comms as a whole.

The approach of turning the InventoryReport into an input stream (as suggested) doesn't work, because mergeInventoryReport is a synchronous call. When a synchronous call tries to call back to the agent (i.e. read the provided input stream), the communication deadlocks, because the synchronous call is still in progress and the agent processes calls in a single-threaded manner. I tried to overcome that by splitting the execution of the merge into the following sequence of new methods:

1) DiscoveryServerService: @Asynchronous @LimitedConcurrency void initiateInventoryReportMerge(String handle, Agent agent)
2) DiscoveryAgentService: InventoryReport getCollectedInventoryReport(String handle)
3) DiscoveryAgentService: void finishInventoryReportMerge(String handle, MergeInventoryReportResults results, Exception failure)

The agent generates its inventory report and, when it is ready to send it up to the server, first stores the report into temporary storage and generates a unique "handle" for it. It then initiates the merge by calling the server with the handle. This is an asynchronous call and thus returns immediately on the agent. On the server, during the initiateInventoryReportMerge method, we call getCollectedInventoryReport to obtain the inventory report from the agent, process it normally, and then report back to the agent about the results of the merge, both in case of success and failure. Because the agent needs to know the result of the merge, and originally the merge was synchronous, we need to somehow wait for the new workflow to finish (so that we don't have to reimplement large chunks of the agent code).
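The handle-based workflow, including the bounded agent-side wait, could be sketched as follows. Method names mirror the comment above; the in-memory store, the latch-based wait, and the `stageReport`/`awaitMerge` helpers are assumptions for illustration, not RHQ code:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the handle-based merge workflow.
public class HandleWorkflowSketch {

    static final class InventoryReport {}
    static final class MergeInventoryReportResults {}

    // Agent-side temporary storage of staged reports, keyed by handle.
    static final Map<String, InventoryReport> PENDING = new ConcurrentHashMap<>();
    // One latch per handle so the agent can wait for the server's callback.
    static final Map<String, CountDownLatch> WAITERS = new ConcurrentHashMap<>();

    // Agent: stage the report and get the unique handle to send to the server.
    static String stageReport(InventoryReport report) {
        String handle = UUID.randomUUID().toString();
        PENDING.put(handle, report);
        WAITERS.put(handle, new CountDownLatch(1));
        return handle;
    }

    // Server callback 2): getCollectedInventoryReport(handle)
    static InventoryReport getCollectedInventoryReport(String handle) {
        return PENDING.get(handle);
    }

    // Server callback 3): finishInventoryReportMerge(handle, results, failure)
    static void finishInventoryReportMerge(String handle,
                                           MergeInventoryReportResults results,
                                           Exception failure) {
        PENDING.remove(handle); // free the staged report on success or failure
        CountDownLatch latch = WAITERS.get(handle);
        if (latch != null) {
            latch.countDown();
        }
    }

    // Agent: bounded wait for the server to finish, so the agent is never
    // blocked indefinitely if the server becomes unavailable.
    static boolean awaitMerge(String handle, long timeout, TimeUnit unit)
            throws InterruptedException {
        CountDownLatch latch = WAITERS.get(handle);
        return latch != null && latch.await(timeout, unit);
    }
}
```

Note that the timeout in `awaitMerge` is exactly where the edge cases discussed in this BZ appear: a timeout cannot distinguish a dead server from a merely slow one.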
This can be done, but it has a number of edge cases that I feel introduce enough complexity that I am willing to stop trying to fix this problem until we come up with a more comprehensive solution to the problem of passing large arguments to remote methods (like streaming the arguments in all cases, or something). The edge cases revolve around one of the parties becoming unavailable during the merge sequence. In order to not block the agent indefinitely, we need to define a timeout on the agent for how long it waits for the server to process the inventory report (and report back by calling the finishInventoryReportMerge() method). What happens if the server processing just took longer than the timeout? How does the rest of the agent cope with the server's inability to merge the inventory "in time"? Wouldn't we potentially overload the server with merge requests that come quicker than the server can handle (due to too low a timeout on the agent)? This approach also opens the door to possibly processing multiple merge requests from a single agent on the server side concurrently; this can be avoided, but again it is a complication. So all in all, I'm in favor of not fixing this until we come up with a more scalable way of passing the arguments during synchronous calls.

But this is something that I would not try to do in the scope of JON 3.2.0. This would be a fundamental change to how we do comms and thus is very risky.

I am recommending removing the JON 3.2 GA Blocker and moving this to JON 3.2.1 for the following reasons:

1) Comment #4: the developer feels the change is very risky.
2) We are 3 weeks from GA, with all large topology testing behind us.
3) PM-402, the documented large topology requirement, is minimally met. (True, there are sporadic OOMs after the requirement is met; true, the agent heap size needs to be increased to 2GB; true, the agent needs a smaller memory footprint; and true, more optimizations and testing are needed for large topology scenarios. But this doesn't seem practical with the current timeline and scope for GA.)

Flagging for discussion by the triage team.

Mark: let's discuss this at the next triage call.

Removing JON 3.2 GA Blocker as this is RHQ. Cloned this into a JON BZ.

Just noting that I am doing some related work on lowering the footprint of inventory sync. It's not a complete redesign and it doesn't completely change our comm approach. It focuses on 1) breaking a full platform sync into pieces to lower the memory footprint and 2) moving the sync data away from a tree structure, to reduce the inefficient Hibernate-eager-fetch approach to building the tree and to lighten the objects passed across the wire. The work is in a branch; I'll update the BZ with commit hashes when it moves to master.
I've merged the branch into master. Perhaps this will be sufficient to handle the needs of this BZ? I'm moving to ON_QA and asking Heiko to perform any further triage...
There are several commits involved, from most recent:
commit 5ef600890aaedcc3a040f5496d18c3c808d73795
Date: Fri Jan 3 16:50:39 2014 -0500
fix merge issue, this class somehow disappeared
commit d0ab1fc5e0163804c1db9179f598e4b8d791e56b
Date: Fri Jan 3 16:05:16 2014 -0500
Some tweaks to test code
commit bd89436f6e6b5db574a32ffd6636e13e1d702d54
Date: Thu Dec 19 14:03:24 2013 -0500
Add a little more logging to get a better start/end of a sync.
commit ad9e7671b663f622cee1dac8e3e0113fe838743f
Date: Wed Dec 18 09:28:48 2013 -0500
This commit builds on the "chunking" work introduced earlier. This work
replaces the tree structure, previously used to pass sync info between
agent and server, with flat collections. This allows us to replace the
costly Hibernate-based tree building approach with more comprehensive
queries that reduce the number of DB round trips dramatically.
Notes:
- This commit fixed an issue with the previous work regarding the handling
of top level services.
- The ResourceSyncInfo class is now lighter weight, parent-child info is
removed.
- PlatformSyncInfo is now a POJO.
- The handling of unknown resources (agent side) changed significantly
because it had depended on the previous tree structure.
- mocks again had to be updated
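The move from a tree to flat collections described in the commit above can be illustrated with a small sketch (not RHQ code; the field names are assumptions). The receiving side can rebuild parent-to-children lookups from the flat collection in a single pass, instead of eagerly fetching a Hibernate-managed tree:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of flat sync info replacing a tree structure.
public class FlatSyncSketch {

    // Lightweight per-resource sync record; no parent-child object references,
    // just ids, which keeps the objects passed across the wire small.
    static final class ResourceSyncInfo {
        final int id;
        final int parentId; // -1 for the platform root in this sketch
        ResourceSyncInfo(int id, int parentId) { this.id = id; this.parentId = parentId; }
    }

    // One pass over the flat collection yields the child index that the old
    // tree structure encoded implicitly.
    static Map<Integer, List<Integer>> childIndex(Collection<ResourceSyncInfo> flat) {
        Map<Integer, List<Integer>> children = new HashMap<>();
        for (ResourceSyncInfo r : flat) {
            children.computeIfAbsent(r.parentId, k -> new ArrayList<>()).add(r.id);
        }
        return children;
    }
}
```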
Bulk closing of 4.10 issues. If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.