Right now, the default value of ENGINE_HEAP_MAX is 1G.
This makes no sense given the requirement that a RHEV-M machine have at least 16G of RAM.
More and more customers are hitting out-of-memory errors and have to increase this value manually, as per this solution:
Why not avoid this problem in the first place and change the default value to something more reasonable?
I have 2 suggestions:
1) Change it to be 4G by default.
2) Make it one of the installer questions. Suggest a value based on available RAM/2 ; but let user decide.
P.S. ENGINE_HEAP_MIN would need to be modified as well, probably.
Looking at the article, I think we should align to the article values by default.
Juan, any reason for not doing that?
I have no objection to making that change.
Shirly, what should the min/max be for dwhd and for the jboss running Reports?
(In reply to Yedidyah Bar David from comment #4)
> Shirly, what should the min/max be for dwhd and for the jboss running
> Reports?
4GB for JBoss running Reports.
The default 1G for DWH should be enough; we could increase it to 2G to be safe.
1. Currently (that is, before  was merged), the engine defaults to 1G min/max heap size, reports does the same, and dwh does not have defaults - thus, relies on the JRE defaults, which seem to me to be more-or-less: max heap size = 0.25 of RAM.  changes the engine defaults to be min 2G, max 4G.
2. Currently it's possible, although perhaps slow, to run all of them together on a 1G machine, e.g. for testing. OTOH, only dwhd automatically enjoys a larger heap by default when running on a larger machine. So a machine with e.g. 16G (our recommended size) will have: engine 1G, reports 1G, dwhd 4G. The rest will be used for caching, and if postgresql is local, for its cache too (which of course helps as well).
3. I intend to have engine-setup configure the default for each of them to be the maximum of 1G and 0.25 of RAM. This way, a machine with 1-4G will behave more-or-less as now, and one with 16G will allow each to have 4G.
4. I suggest not asking the user about these, as imo we already have enough questions and the default should be ok, but keeping these values in the answer file so that they can easily be changed.
5. We should probably not override existing configs on upgrades, so as to not change configs done by following , but do change if we do not find these configured.
6. This means that we'll also ignore  in practice. We can add another answer file key to not configure this, thus not override , not sure we want.
7. I am only somewhat concerned about the case of separate machines for engine/dwh/reports - I thought about doing something complex for this, not sure it's worth it.
Comments are welcome.
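The rule from point 3, combined with the upgrade behaviour from point 5, could be sketched roughly as below. This is only an illustration, not the actual engine-setup code: the function names and the `existing_heap_mb` parameter are hypothetical, sizes are assumed to be in MB, and the rule is read as "the larger of 1G and a quarter of RAM", which is what the 16G example implies.

```python
MIN_HEAP_MB = 1024  # the 1G floor from point 3


def default_heap_mb(total_memory_mb):
    """Larger of 1G and a quarter of RAM, in whole MB (hypothetical helper)."""
    return max(MIN_HEAP_MB, total_memory_mb // 4)


def heap_to_configure(total_memory_mb, existing_heap_mb=None):
    """Keep a value that is already configured (point 5); otherwise use the default."""
    if existing_heap_mb is not None:
        return existing_heap_mb
    return default_heap_mb(total_memory_mb)


# Heap that would be configured on a fresh setup for a few machine sizes.
for total_mb in (1024, 4096, 8192, 16384):
    print(total_mb, default_heap_mb(total_mb))
```

With this reading, machines of up to 4G keep the current 1G heap, and only larger machines get more.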
Marina - does this sound reasonable? Do you anticipate cases that will want to allocate less than e.g. 8GB for the engine machine but still want a 2GB heap or more?
Gil, same question to you. Please try to stress-test an engine/dwh/reports setup with default values and see that you get errors, and then another test with the patches applied (I'll soon push patches for dwh and reports). I suggest two instances, one with 8GB and one with 16.
Didi, thanks for the irc chat today.
I didn't take dwh and reports into consideration when I reported the bug.
The suggestion is to allocate max(1, 0.25 * ram) per process: engine, dwh, reports. This will leave the OS with 0.25 of RAM available for postgres and other processes. Will this be enough? I do not know. Scale tests by QE would be a great idea.
I will also check with the last customer (internal RH) that had to increase that value and update with their setup. And see if they would be willing to run some tests for us.
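To make the max(1, 0.25 * ram) arithmetic concrete, here is a small sketch of the resulting budget. It is a rough illustration only: `memory_budget_gb` is a hypothetical helper, it counts only the configured heap of each process (a JVM uses memory beyond its heap), and it assumes all three processes run on the same machine.

```python
def memory_budget_gb(ram_gb, processes=("engine", "dwh", "reports")):
    """Return (heap per process, RAM left over) in GB for max(1, 0.25 * ram)."""
    heap_gb = max(1.0, 0.25 * ram_gb)
    left_gb = ram_gb - heap_gb * len(processes)
    return heap_gb, left_gb


for ram_gb in (2, 4, 8, 16):
    heap_gb, left_gb = memory_budget_gb(ram_gb)
    print(f"{ram_gb}G RAM: {heap_gb}G heap each, {left_gb}G left")
```

From 4G upwards the remainder is exactly 0.25 of RAM. Below that, the 1G floor means the three heaps together can exceed physical memory, so small test machines would rely on the heaps not filling up.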
Yuri, please start working on a test to tackle this request.
Please add your results to the BZ.
Following a private talk with Yaniv, I am not pushing any change for dwh currently. He said that in 3.5 they made some improvements that should make it not need much ram, and anyway it currently uses the jvm's default which is more-or-less what we intended to try anyway.
(In reply to Yuri Obshansky from bug 1190466 comment #1)
> RHEV-M's JVM Heap Size value is very depended on setup (how many
> Hosts/Storages and VMs). I didn't change default value of 1G during my
> performance tests and I didn't encounter with Out Of Memory exception in
> spite of using non-powerful server for engine - 8 CPUs and 16 G RAM.
I wouldn't call this "non-powerful". It's probably considerably less powerful than a modern physical server an organization might purchase today for such a use, but I am not at all certain about a virtual one allocated for the same use.
We say that 4GB is the minimum and 16GB is recommended. I'd expect most customers to go somewhere in between. It would also be nice if, as a result of this discussion/bug/fix, we were able to provide more detailed recommendations, including expected memory use per system size (measured in number of hosts/VMs/other objects/etc) and how the number of CPU cores affects responsiveness (in whatever metric).
> setup was 1 DC/Cluster/1-2 Hosts/1-2 Storages/ 100-200 VMs.
I'd expect large customers to be much larger than that. I am pretty certain that we have customers with a single engine managing 10 times that.
> My opinion is:
> - we don't need change default value of 1G. It will be required only on
> specific configuration which could be done by customer as well.
The whole point of this bug is to prevent the customer from needing to suffer downtime, then have to wait for support etc., until their system is properly configured.
> - we need perform tuning tests and publish tuning tips.
> There is impossible
> to test all configurations thus we require an input from our support with
> customers common used RHEV-M configuration and setup.
I agree it's impossible, but perhaps we can do some more.
Can we, for a start, simulate, say, 20 hosts with 1000 VMs?
(In reply to Yedidyah Bar David from comment #13)
> (In reply to Yuri Obshansky from bug 1190466 comment #1)
> > - we need perform tuning tests and publish tuning tips.
Just to clarify:
The patches are ready. What we need now is numbers.
Once we have the numbers, which we need anyway if we want to publish such tips, making the code roughly follow these tips is relatively little work.
Automated message: can you please update doctext or set it as not required?
doc text copied from 3.5 bug 1188971
25% of 4839 is 1209, thus ok.
[root@jb-ovirt36 ~]# grep totalMemory /var/log/ovirt-engine/setup/ovirt-engine-setup-20150417115209-i59eqx.log
2015-04-17 11:52:09 DEBUG otopi.context context.dumpEnvironment:509 ENV OVESETUP_CONFIG/totalMemoryMB=NoneType:'None'
2015-04-17 11:52:12 DEBUG otopi.context context.dumpEnvironment:509 ENV OVESETUP_CONFIG/totalMemoryMB=NoneType:'None'
2015-04-17 11:52:13 DEBUG otopi.context context.dumpEnvironment:509 ENV OVESETUP_CONFIG/totalMemoryMB=int:'4839'
2015-04-17 11:58:23 DEBUG otopi.context context.dumpEnvironment:509 ENV OVESETUP_CONFIG/totalMemoryMB=int:'4839'
2015-04-17 12:00:16 DEBUG otopi.context context.dumpEnvironment:509 ENV OVESETUP_CONFIG/totalMemoryMB=int:'4839'
[root@jb-ovirt36 ~]# grep HEAP /etc/ovirt-engine/engine.conf.d/10-setup-java.conf
[root@jb-ovirt36 ~]# ps aux | grep java
ovirt 22710 0.9 15.9 3009980 788728 ? Sl 12:00 3:08 ovirt-engine -server -XX:+TieredCompilation -Xms1209M -Xmx1209M -XX:PermSize=256m -XX:MaxPermSize=256m -Djava.awt.headless=true -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djsse.enableSNIExtension=false -Djava.security.krb5.conf=/etc/ovirt-engine/krb5.conf -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/ovirt-engine/dump -Djava.util.logging.manager=org.jboss.logmanager -Dlogging.configuration=file:///var/lib/ovirt-engine/jboss_runtime/config/ovirt-engine-logging.properties -Dorg.jboss.resolver.warning=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djboss.modules.write-indexes=false -Djboss.server.default.config=ovirt-engine -Djboss.home.dir=/usr/share/ovirt-engine-jboss-as -Djboss.server.base.dir=/usr/share/ovirt-engine -Djboss.server.data.dir=/var/lib/ovirt-engine -Djboss.server.log.dir=/var/log/ovirt-engine -Djboss.server.config.dir=/var/lib/ovirt-engine/jboss_runtime/config -Djboss.server.temp.dir=/var/lib/ovirt-engine/jboss_runtime/tmp -Djboss.controller.temp.dir=/var/lib/ovirt-engine/jboss_runtime/tmp -jar /usr/share/ovirt-engine-jboss-as/jboss-modules.jar -mp /var/lib/ovirt-engine/jboss_runtime/modules/00-modules-common:/var/lib/ovirt-engine/jboss_runtime/modules/01-ovirt-engine-jboss-as-modules -jaxpmodule javax.xml.jaxp-provider org.jboss.as.standalone -c ovirt-engine.xml
ovirt 22896 0.5 2.4 2682008 122296 ? Sl 12:00 1:50 ovirt-engine-dwhd -Dorg.ovirt.engine.dwh.settings=/tmp/tmp84E7Du/settings.properties -classpath /usr/share/ovirt-engine-dwh/lib/*::/usr/share/java/dom4j.jar:/usr/share/java/commons-collections.jar:/usr/share/java/postgresql-jdbc.jar ovirt_engine_dwh.historyetl_3_6.HistoryETL --context=Default
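The same check can be scripted. The sketch below pulls totalMemoryMB out of a setup log line and the -Xms/-Xmx values out of the java command line, then compares them with a quarter of RAM (with a 1G floor). The sample strings are shortened copies of the output above; the helper names are hypothetical, and the regular expressions assume the exact formats shown here (an `int:'N'` entry and an `M` suffix on the heap flags).

```python
import re

# Sample lines copied (and shortened) from the output above.
LOG_LINE = ("2015-04-17 11:52:13 DEBUG otopi.context context.dumpEnvironment:509 "
            "ENV OVESETUP_CONFIG/totalMemoryMB=int:'4839'")
PS_ARGS = "ovirt-engine -server -XX:+TieredCompilation -Xms1209M -Xmx1209M -XX:PermSize=256m"


def total_memory_mb(log_line):
    """Extract totalMemoryMB from a setup log line; None if it is not an int entry."""
    match = re.search(r"totalMemoryMB=int:'(\d+)'", log_line)
    return int(match.group(1)) if match else None


def heap_flags_mb(ps_args):
    """Return (Xms, Xmx) in MB from a java command line, assuming an 'M' suffix."""
    xms = re.search(r"-Xms(\d+)M", ps_args)
    xmx = re.search(r"-Xmx(\d+)M", ps_args)
    return int(xms.group(1)), int(xmx.group(1))


total = total_memory_mb(LOG_LINE)
expected = max(1024, total // 4)  # quarter of RAM, with a 1G floor
print(total, expected, heap_flags_mb(PS_ARGS))
```

The early log entries where totalMemoryMB is still `NoneType:'None'` simply yield None here, so only the later entries with a detected value are compared.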
fyi, the doctext is wrong:
"engine-setup now automatically configures the heap size to be the maximum of 1GB
and 1/4 of available memory."
(In reply to Jiri Belka from comment #19)
> fyi it's wrong doctext:
> engine-setup now automatically configures the heap size to be the maximum of
> 1GB and 1/4 of available memory."
If you have an 8GB machine, you want 2GB, not 1GB: the maximum of 1 and 2.
BTW, the doc people dropped the word 'maximum' altogether from the 3.5 bug 1188971 doctext, perhaps that's best...
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.