Red Hat Bugzilla – Bug 806060
OutOfMemory Error in tomcat6 when comparing config file revisions
Last modified: 2012-11-01 12:18:04 EDT
+++ This bug was initially created as a clone of Bug #729210 +++
Description of problem:
When trying to compare the revisions of a config file, satellite encounters an Internal Server Error, and an OutOfMemory error is recorded in the tomcat catalina.out log.
Version-Release number of selected component (if applicable):
Satellite 5.4.1 on RHEL6.1 64bit
How reproducible:
Always, when comparing the config files in the link below. (Not all config file comparisons cause the problem, but comparing these 2 files always does.)
Steps to Reproduce:
1. Create a config file
2. Upload the files listed in the attachment as newer versions (sequentially)
3. Click on the "Compare File" button
4. Click on the 'View Comparison' button
Actual results:
After about 30 seconds an Internal Server Error appears in the WebUI. In the catalina.out, a 'java.lang.OutOfMemoryError' error appears.
Expected results:
No out of memory error.
Additional info:
Increasing the tomcat6 heap size doesn't resolve the issue. I modified the following in /etc/tomcat6/tomcat6.conf:
JAVA_OPTS="$JAVA_OPTS -ea -Xms2048m -Xmx2048m -Djava.awt.headless=true -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -XX:MaxNewSize=2048 -XX:-UseConcMarkSweepGC"
With this setting tomcat no longer got an OutOfMemory error, but the output of top showed that it quickly consumed its 2G of heap space (see the RES column) and the java process stayed pegged at 100% CPU utilization indefinitely, until I restarted satellite.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12513 tomcat 20 0 3328m 2.1g 13m S 99.2 27.7 2:07.50 java
--- Additional comment from firstname.lastname@example.org on 2011-08-09 04:01:43 EDT ---
I forgot to mention in the 'Additional info' section of my previous update that, with the increased java heap size, an ISE appears in the satellite UI about a minute after clicking on the 'View Comparison' button. No OutOfMemory errors appear in the catalina.out this time, but the httpd error logs show ajp timeout errors while waiting for a response from tomcat. This is all in conjunction with what I mentioned previously about the 100% CPU usage of the java/tomcat process and full use of the increased heap memory space.
--- Additional comment from email@example.com on 2011-08-26 03:21:45 EDT ---
Produced the OutOfMemoryError again with a different set of configuration files. They are my own this time, so I've attached them to the ticket for reproducing.
In this new situation, and as with the original reproducer, there are quite a number of differences between the files that are being compared and the differences occur all throughout the file.
Seems the problem is in the recursive step method of the Trace class in com/redhat/rhn/common/filediff/Trace.java.
--- Additional comment from firstname.lastname@example.org on 2011-08-26 03:27:24 EDT ---
Created attachment 520021 [details]
Reproducer file v1
Add this file to the configuration channel.
--- Additional comment from email@example.com on 2011-08-26 03:29:26 EDT ---
Created attachment 520022 [details]
Reproducer file v2
Then upload this file as a revision of the first file. Then try comparing the 2 files. This will reproduce the tomcat out of memory error.
The crux of the issue is that the diff algorithm we use branches and explores every possibility of resolution when it encounters a difference in the two files. These branches are explored concurrently so that we can choose not to bother exploring branches if we know that there is a simpler explanation for the difference. However, we only *know* that a branch is sub-optimal if a better branch has made it all the way to the end of the file.
This means that for large files with many changes we can end up branching many, many times before any of the branches actually reach the end of the file. The number of branches grows exponentially with the number of changes we encounter, resulting in the out of memory error.
The fix that I have implemented is to simply limit the number of branches to our best guess at the 1000 most efficient branches. This should still give us the optimal diff between the two files the vast majority of the time, while greatly cutting back on the number of branches we have to explore. The resulting diff is still guaranteed to be *correct*, but not necessarily *optimal*.
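To illustrate the idea only (this is not the code in Trace.java), here is a minimal Java sketch of a branching line diff with a cap on live branches. The names BoundedDiffSketch, Branch, diff and maxBranches are hypothetical, and ranking branches by how many lines they have consumed is an assumed stand-in for the "best guess" at the most efficient branches. With no cap the number of live branches can double on every round, which is the growth described above; with a cap, every surviving branch is still a valid edit script, so the result stays correct even when it is not the shortest.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class BoundedDiffSketch {

    /** One candidate explanation of the difference: a position in each file plus the path so far. */
    static final class Branch {
        final int i;          // next unread line of the old file
        final int j;          // next unread line of the new file
        final Branch parent;  // previous step, null for the root
        final String op;      // "  line", "- line" or "+ line"; null for the root

        Branch(int i, int j, Branch parent, String op) {
            this.i = i; this.j = j; this.parent = parent; this.op = op;
        }
    }

    /** Returns an edit script turning oldLines into newLines, keeping at most maxBranches alive. */
    static List<String> diff(List<String> oldLines, List<String> newLines, int maxBranches) {
        if (maxBranches < 1) {
            throw new IllegalArgumentException("maxBranches must be at least 1");
        }
        int n = oldLines.size();
        int m = newLines.size();
        List<Branch> frontier = new ArrayList<>();
        frontier.add(advance(new Branch(0, 0, null, null), oldLines, newLines));

        while (true) {
            List<Branch> next = new ArrayList<>();
            for (Branch b : frontier) {
                if (b.i == n && b.j == m) {
                    return script(b); // first branch to reach the end of both files wins
                }
                // A difference: explore "line was deleted" and "line was inserted".
                if (b.i < n) {
                    next.add(advance(new Branch(b.i + 1, b.j, b, "- " + oldLines.get(b.i)),
                            oldLines, newLines));
                }
                if (b.j < m) {
                    next.add(advance(new Branch(b.i, b.j + 1, b, "+ " + newLines.get(b.j)),
                            oldLines, newLines));
                }
            }
            // The cap: all branches in a round have the same number of edits, so keep
            // the ones that have consumed the most lines (an assumed heuristic).
            if (next.size() > maxBranches) {
                next.sort(Comparator.comparingInt((Branch b) -> b.i + b.j).reversed());
                next = new ArrayList<>(next.subList(0, maxBranches));
            }
            frontier = next;
        }
    }

    /** Consumes lines that are identical in both files; these cost nothing. */
    private static Branch advance(Branch b, List<String> oldLines, List<String> newLines) {
        while (b.i < oldLines.size() && b.j < newLines.size()
                && oldLines.get(b.i).equals(newLines.get(b.j))) {
            b = new Branch(b.i + 1, b.j + 1, b, "  " + oldLines.get(b.i));
        }
        return b;
    }

    /** Walks the parent chain back to the root and returns the operations in file order. */
    private static List<String> script(Branch end) {
        List<String> ops = new ArrayList<>();
        for (Branch b = end; b != null && b.op != null; b = b.parent) {
            ops.add(b.op);
        }
        Collections.reverse(ops);
        return ops;
    }

    public static void main(String[] args) {
        List<String> v1 = List.of("a", "b", "c");
        List<String> v2 = List.of("a", "x", "c");
        diff(v1, v2, 1000).forEach(System.out::println);
    }
}
```

Calling diff with Integer.MAX_VALUE as maxBranches reproduces the uncapped behaviour, which is only safe on small inputs. A cap of 1000 mirrors the limit mentioned above, but the sketch does not claim to choose the same 1000 branches as the real fix.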
Committed to spacewalk master: 623df35a7ee271e713f908dced5f04d0ce06d434
Moving ON_QA. Packages that address this bugzilla should now be available in yum repos at http://yum.spacewalkproject.org/nightly/
Spacewalk 1.8 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes18