Description of problem: Revision Comparison fails after revision is a month old on config files in spacewalk web. The same comparison works fine via rhncfg-manager cli. Issue seems to be with tomcat. Version-Release number of selected component (if applicable): v1.3 How reproducible: I am experiencing this issue on v0.8 but tested on v1.2 and it still was present. Steps to Reproduce: - Create a config file in a config channel and save - Add 100 lines to file - Add 100 more lines - Compare revision 3 with revision 1 (this works great) - wait about a month and compare again (action will time out) - compare via rhncfg-manager cli and action will work So it seems to be a tomcat/apache issue and not database Actual results: actions times out and does not return diff Expected results: return diff as it did when it was first created and as it does with rhncfg-manager Additional info: Posted to the list a year ago: http://www.mail-archive.com/spacewalk-list@redhat.com/msg04577.html All my config files (roughly 200) in my v0.8 installation that have revisions over a month old have this problem. I am not able to view revision comparisons via web UI. They have to be done from command-line via rhncfg-manager.
Also have a similar problem (500 error) removing the file with old revisions and see this in the /var/log/catalina.out: [TP-Processor6] ERROR com.redhat.rhn.common.localization.LocalizationService - ** ERROR: Message with id: [filedetails.details.jspf.revnum] not found.*** Again, I am able to remove the file immediately via command line. Also I forgot to mention that I increased the apache timeout to 6 minutes to allow the commands to finish. The default I think was 90 seconds. This worked for removing a file, but when doing the comparison 6 minutes still was not long enough.
More tomcat errors during the diff... It looks like it just keeps going until it runs out of memory. I already have it set to use up to 5GB and it still runs out of memory!! 2011-05-31 15:44:38,358 [TP-Processor3] ERROR com.redhat.rhn.frontend.servlets.SessionFilter - Error during transaction. Rolling back javax.servlet.ServletException: Servlet execution threw an exception at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.AuthFilter.doFilter(AuthFilter.java:101) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:142) at com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:58) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:108) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:55) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:97) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151) at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:200) at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291) at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:775) at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:704) at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:897) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685) at java.lang.Thread.run(Thread.java:619) 2011-05-31 15:44:38,370 [TP-Processor3] ERROR org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/rhn].[action] - Servlet.service() for servlet action threw exception java.lang.OutOfMemoryError: Java heap space at com.redhat.rhn.common.filediff.Trace.makeEdit(Trace.java:198) at com.redhat.rhn.common.filediff.Trace.makeMatch(Trace.java:187) at com.redhat.rhn.common.filediff.Trace.step(Trace.java:154) at com.redhat.rhn.common.filediff.Differ.step(Differ.java:76) at com.redhat.rhn.common.filediff.Differ.diff(Differ.java:49) at com.redhat.rhn.common.filediff.Diff.diffFiles(Diff.java:92) at com.redhat.rhn.common.filediff.Diff.htmlDiff(Diff.java:51) at com.redhat.rhn.frontend.action.configuration.files.DiffAction.performFileDiff(DiffAction.java:90) at com.redhat.rhn.frontend.action.configuration.files.DiffAction.execute(DiffAction.java:66) at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:431) at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:237) at com.redhat.rhn.frontend.struts.RhnRequestProcessor.process(RhnRequestProcessor.java:82) at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1196) at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414) at javax.servlet.http.HttpServlet.service(HttpServlet.java:690) at javax.servlet.http.HttpServlet.service(HttpServlet.java:803) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.AuthFilter.doFilter(AuthFilter.java:101) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:142) at com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:58) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.LocalizedEnvironmentFilter.doFilter(LocalizedEnvironmentFilter.java:67) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.EnvironmentFilter.doFilter(EnvironmentFilter.java:108) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at com.redhat.rhn.frontend.servlets.SessionFilter.doFilter(SessionFilter.java:55) May 31, 2011 3:44:38 PM org.apache.catalina.core.StandardHostValve status WARNING: Exception Processing ErrorPage[errorCode=500, location=/WEB-INF/pages/common/errors/500.jsp] ClientAbortException: java.net.SocketException: Broken pipe at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:366) at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:432) at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:311) at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:293) at org.apache.catalina.connector.Response.flushBuffer(Response.java:539) at org.apache.catalina.core.StandardHostValve.status(StandardHostValve.java:287) at org.apache.catalina.core.StandardHostValve.throwable(StandardHostValve.java:229) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151) at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:200) at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291) at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:775) at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:704) at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:897) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685) at java.lang.Thread.run(Thread.java:619) Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92) at java.net.SocketOutputStream.write(SocketOutputStream.java:136) at org.apache.jk.common.ChannelSocket.send(ChannelSocket.java:538) at org.apache.jk.common.JkInputStream.doWrite(JkInputStream.java:162) at org.apache.coyote.Response.doWrite(Response.java:560) at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:361) ... 17 more May 31, 2011 3:44:38 PM org.apache.jk.core.MsgContext action WARNING: Error sending end packet java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92) at java.net.SocketOutputStream.write(SocketOutputStream.java:136) at org.apache.jk.common.ChannelSocket.send(ChannelSocket.java:538) at org.apache.jk.common.JkInputStream.endMessage(JkInputStream.java:127) at org.apache.jk.core.MsgContext.action(MsgContext.java:302) at org.apache.coyote.Response.action(Response.java:183) at org.apache.coyote.Response.finish(Response.java:305) at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:205) at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291) at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:775) at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:704) at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:897) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685) at java.lang.Thread.run(Thread.java:619) May 31, 2011 3:44:38 PM org.apache.jk.common.ChannelSocket processConnection WARNING: processCallbacks status 2
Aligning under space16.
Hey Josh, first of all - LocalizationService errors just notify translations not being found. They're not critical at all and they definitely don't make configuration files diff to fail. ERROR com.redhat.rhn.common.localization.LocalizationService - ** ERROR: Message with id: [filedetails.details.jspf.revnum] not found.*** Comparisons of config revisions aren't dependent on their age. I tried to reproduce it, but didn't succeed. I've compared config revisions one year old. java.lang.OutOfMemoryError: Java heap space error in your catalina.out file really tells about diffed config revisions being too large - larger than available java heap. It doesn't tell anything about the revisions being too old. (in these cases we recommend to enlarge the heap size similar to what you've described in the mailing list). The other error "java.net.SocketException: Broken pipe" is most probably just a consequence of the OutOfMemoryError - generally a timeout. If you still can reproduce the problem, do you see another traceback in the catalina.out?
Yes, however the heap size has already been increased to 5GB. I also verified this is still an issue on a v1.5 upgrade. Here is the reason I think it has something to do with time and not size... __Total of 3 revisions__ - Current revision #3 (120KB) - Revision #2 11 weeks old (22KB) - Comparison works and returns immediately - Revision #1 82 weeks old (8KB) - Comparison fails and heap runs out Thanks for your help. -Josh
Hmm, I can compare 55 weeks old revision without troubles. (I have no older, but I could try to modify the revision time.) If it's now not a memory issue, do you still see the the same traceback as reported in Comment#2?
Tomas, Yes, still same tomcat traceback. According to the DBA's the query returns immediately, so it is successfully retrieving the diff from the database, however tomcat is holding onto it for some reason. Also note that while the revision comparison is running the tomcat java process spikes up to 400% cpu. So it's doing a lot of something. Not sure if this will help.. When I do an strace on tomcat pid while the diff is running, after I see the the diff data come through, it then segfaults: [pid 8817] --- SIGSEGV (Segmentation fault) @ 0 (0) --- [pid 8817] rt_sigreturn(0x2aaaab3b3c2b <unfinished ...> [pid 8817] <... rt_sigreturn resumed> ) = 28874502360 [pid 8817] --- SIGSEGV (Segmentation fault) @ 0 (0) --- [pid 8817] rt_sigreturn(0x2aaaab2764b9) = 28874510976 [pid 8781] --- SIGSEGV (Segmentation fault) @ 0 (0) --- [pid 8781] rt_sigreturn(0x2b7bab00b010) = 1077075968 These may be normal operation, but I see a TON of these when doing an "strace -f" on the tomcat pid: clock_gettime(CLOCK_REALTIME, {1319155003, 532295000}) = 0 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) sched_yield( <unfinished ...> <... sched_yield resumed> Thanks again. -Josh
(In reply to comment #7) > Tomas, > > Yes, still same tomcat traceback. > > > According to the DBA's the query returns immediately, so it is successfully > retrieving the diff from the database, however tomcat is holding onto it for > some reason. > > Also note that while the revision comparison is running the tomcat java process > spikes up to 400% cpu. > So it's doing a lot of something. So the issue is not database operation but the Java operation. On the Java side, I don't see anything specific to the _age_ of the revision. Can you investigate what is specific about the _content_ of those revisions, or about the diff? More specifically, can you reproduce the problem with some pair of revisions, no matter how old they are? Can you attach those config files to this bugzilla?
The content does not seem to affect the outcome since this issue happens with different content. I can even put random text in the body and recreate the issue after letting it set for a month. I have a question: When comparing say.. revision 50 to revision 1, does it just retrieve and compare those 2, or does it retrieve and compare all the revisions?
Only the current and the chosen revisions are compared, revisions in between stay untouched.
(In reply to comment #9) > The content does not seem to affect the outcome since this issue happens with > different content. > > I can even put random text in the body and recreate the issue after letting it > set for a month. Any news?
Also, what is the OS and tomcat version?
Red Hat Enterprise Linux Server release 5.7 (Tikanga) 2.6.18-274.7.1.el5 tomcat5-5.5.23-0jpp.19.el5_6
Josh, we were not able to reproduce the issue. We are pretty certain that the age of the revisions does not play any role in the cause of the problem. It really has to be the content which forces the algorithm to run out of memory. In the long run, we will investigate other approaches of generating the diff that should cap the memory consumption.
Fixed in Spacewalk master: 623df35a7ee271e713f908dced5f04d0ce06d434.
Confirmed working! Thanks all! Feel free to close this bug.
Moving ON_QA. Packages that address this bugzilla should now be available in yum repos at http://yum.spacewalkproject.org/nightly/
Spacewalk 1.8 has been released: https://fedorahosted.org/spacewalk/wiki/ReleaseNotes18