Bug 1439320 - [scale lab] ODL is deployed with 2G heap size
Summary: [scale lab] ODL is deployed with 2G heap size
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-opendaylight
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Tim Rozet
QA Contact: Itzik Brown
URL:
Whiteboard: scale_lab
Depends On: 1451401 1512073
Blocks:
 
Reported: 2017-04-05 17:12 UTC by Sai Sindhur Malleni
Modified: 2018-02-19 12:39 UTC (History)
11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-02 14:58:40 UTC
Target Upstream Version:
Embargoed:



Description Sai Sindhur Malleni 2017-04-05 17:12:40 UTC
Description of problem: By default, the systemd unit file sets ODL up with a 2G heap size. In performance and scale testing we have seen ODL take 24G of memory in some cases. While a 24G heap size is in most cases neither feasible nor needed, it would be good to double the default to 4G as a starting point.
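For reference, the heap limits in an ODL Karaf install are normally picked up from Karaf's bin/setenv hook; a minimal sketch of bumping the default to 4G (path and values illustrative, not the exact puppet-opendaylight change):

```shell
# Illustrative override in /opt/opendaylight/bin/setenv (Karaf's standard
# environment hook). JAVA_MIN_MEM and JAVA_MAX_MEM are the variables the
# karaf launcher script reads to build -Xms and -Xmx.
export JAVA_MIN_MEM=2048m   # initial heap (-Xms)
export JAVA_MAX_MEM=4096m   # max heap (-Xmx), doubled from the 2G default
```

Whatever mechanism puppet-opendaylight ends up using, it should land in an equivalent of these two variables.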


Version-Release number of selected component (if applicable):
RHOP 10

How reproducible:
100%

Steps to Reproduce:
1. Deploy RHOP 10 with ODL with defaults
2. Create neutron resources at scale (several hundreds)
3.

Actual results:


Expected results:


Additional info:
Linked BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1436799

Comment 1 Nir Yechiel 2017-07-20 08:56:06 UTC
@Mike, was this fixed already? Should we target this for RHOSP 12, move to ON_QA and let the scale team test again?

Comment 2 Mike Kolesnik 2017-07-23 11:15:13 UTC
Not sure, perhaps Tim knows

Comment 3 Tim Rozet 2017-07-25 14:56:27 UTC
There are a few fixes needed here.  First, we do not add Java options correctly with puppet-odl.  See:

https://github.com/dfarrell07/puppet-opendaylight/issues/139

After fixing that, we then need to decide on proper min and max heap sizes.  A related issue covers garbage collection java opts:
https://github.com/dfarrell07/puppet-opendaylight/issues/104
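The GC side of issue 104 would typically be handled with extra JVM options; a hedged sketch of what that could look like through Karaf's EXTRA_JAVA_OPTS hook (the flags are standard JDK 8 HotSpot options, not taken from the eventual fix):

```shell
# Illustrative GC tuning via Karaf's EXTRA_JAVA_OPTS hook in bin/setenv.
# G1 plus GC logging makes heap growth visible when re-running scale tests.
export EXTRA_JAVA_OPTS="-XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:/opt/opendaylight/data/log/gc.log"
```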

Comment 4 Tim Rozet 2017-07-25 15:18:14 UTC
Maybe we also need to look at configuring -XX:MaxPermSize and -Xss

We need a java expert to comment, or to examine ODL's memory usage and say what the correct defaults should be.

Comment 5 jamo luhrsen 2017-07-26 06:14:25 UTC
For the most part, all of the upstream system test jobs use 2G as the max. There are some (not really) scale tests that are working with this. Scale, in the sense that 5-6 hundred virtual switches are connected to ODL and are validated to be there. The point is that 2G is never getting in the way of normal system tests.

I'm more inclined to first assume that we have some memory leak to figure out before defaulting to some larger value of Xmx.

However, as long as it seems reasonable to java experts (I am not one) to use ~4G, I wouldn't object.

I know that in the ODL performance white paper done with the Beryllium release, the max heap was 8G.

Comment 6 Michael Vorburger 2017-07-27 19:28:05 UTC
IMHO we should just use the defaults from upstream's odlparent opendaylight-karaf-empty.

> Maybe we also need to look at configuring -XX:MaxPermSize and -Xss

FYI MaxPermSize is pre-Java 8 and is not required (in fact not allowed) in Java 8+.

I dunno what "-Xss" is, but if you mean "-Xms" we already have that.  -Xss appears to be an exotic option related to stack size (some posts mention it re. StackOverflowError), and I'd be reluctant to fiddle with it unless we have a proven need.

Comment 7 Tim Rozet 2017-07-27 19:35:28 UTC
Yeah, -Xss is thread stack size.  It would be interesting to know how many threads, and of what size, are in use during these times of high memory consumption.  It sounds to me like we need to test and measure the memory footprint again with Carbon SR1 and determine whether our default max heap is too small, or whether it is correct and there are just memory leaks.
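As a back-of-the-envelope for the thread question: total stack memory is roughly live threads times -Xss (around 1M per thread by default on 64-bit Linux JVMs). A quick sketch, where the helper name and the numbers are mine, not from the bug:

```shell
# Hypothetical helper: rough off-heap stack footprint in MB,
# given a live thread count and an -Xss value in KB.
estimate_stack_mb() {
  threads=$1
  xss_kb=$2
  echo $(( threads * xss_kb / 1024 ))
}

# 500 threads at the 1024k default would pin roughly 500MB outside the heap:
estimate_stack_mb 500 1024   # prints 500
```

A live thread count to feed into this can be read off a dump with `jstack <pid> | grep -c 'java.lang.Thread.State'`.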

Comment 8 lpeer 2017-08-06 07:28:16 UTC
Sridhar - can you please update this bug after the scale lab testing in August 2017.

Comment 10 Tim Rozet 2017-09-29 20:14:55 UTC
Sridhar,
After completing the scale/perf tests with ODL, identifying several OOM bugs, and getting fixes in, how do you feel about the current heap size?  Do you think it is sufficient?  Should we hold off on this bug until we have more data from the next round of scale/perf testing, when all of the memory leak bugs have been fixed?

Comment 11 Sai Sindhur Malleni 2017-09-29 20:18:04 UTC
Sridhar definitely has a better handle on this, but I'm going to chime in with a data point here. After the fixes for the memory leaks in the OpenFlow plugin, we haven't seen the heap go over 1G, even under what we would consider stress tests.

Comment 12 Sridhar Gaddam 2017-10-01 16:37:20 UTC
(In reply to Sai Sindhur Malleni from comment #11)
> Sridhar definitely has a better handle on this, but I'm going to chime in
> with a data point here. After the fixes for the memory leaks in the OpenFlow
> plugin, we haven't seen the heap go over 1G, even under what we would
> consider stress tests.

We noticed that most of the time the memory usage was well within 1GB. Once I remember seeing it go slightly above 1.5G, but still below the 2GB default. We haven't seen any OOM after the OOM fixes went in, even after running multiple iterations of the tests. So I feel we are good and can close this bug.

Comment 13 Sridhar Gaddam 2017-10-01 16:46:12 UTC
An additional note: this bug is about increasing the default JAVA heap value to 4G instead of 2G. Based on the fixes that went in and our observations, we are good with 2G; however, IMO we should still have OOO support to override the default value if necessary. Do we have a separate RHBZ for OOO support?

Comment 14 Sai Sindhur Malleni 2017-10-01 22:34:04 UTC
Sridhar,
Yes, we have a BZ for enabling heap size configuration via TripleO. Here is the link: https://bugzilla.redhat.com/show_bug.cgi?id=1488968

