Description of problem:
If you have a fresh, clean Satellite and load a really big channel (in my situation 8000 packages, 1800 errata), then taskomatic crashes on repomd generation.

Version-Release number of selected component (if applicable):
Sat541

How reproducible:
deterministic

Steps to Reproduce:
1. Have a Satellite with a rhel5 channel containing all the packages and errata that have ever appeared in that channel (currently 8000 packages, 1800 errata).
2. Have another Satellite and do an ISS sync.
3. Watch /var/log/rhn/rhn_taskomatic_daemon.log.

Actual results:
Taskomatic crashes. The crash is either a timeout by tanukiwrapper due to high load:

ERROR | wrapper | 2011/04/28 11:21:35 | JVM appears hung: Timed out waiting for signal from JVM.
ERROR | wrapper | 2011/04/28 11:21:35 | JVM did not exit on request, terminated

Or, if you set a higher timeout via

wrapper.ping.timeout=300

in /etc/rhn/default/rhn_taskomatic_daemon.conf, you will get an OutOfMemoryError:

INFO | jvm 1 | 2011/07/14 01:23:05 | Exception in thread "Thread-59" java.lang.OutOfMemoryError
INFO | jvm 1 | 2011/07/14 01:23:05 |   at java.net.URLClassLoader.findClass(URLClassLoader.java:434)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at sun.misc.Launcher$ExtClassLoader.findClass(Launcher.java:281)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at java.lang.ClassLoader.loadClass(ClassLoader.java:653)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at java.lang.ClassLoader.loadClass(ClassLoader.java:645)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at java.lang.ClassLoader.loadClass(ClassLoader.java:619)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1291)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:188)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:354)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:258)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:50)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at com.redhat.rhn.taskomatic.task.repomd.PackageCapabilityIterator.<init>(PackageCapabilityIterator.java:78)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at com.redhat.rhn.taskomatic.task.repomd.FilelistsXmlWriter.begin(FilelistsXmlWriter.java:87)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at com.redhat.rhn.taskomatic.task.repomd.RpmRepositoryWriter.writeRepomdFiles(RpmRepositoryWriter.java:167)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at com.redhat.rhn.taskomatic.task.repomd.ChannelRepodataWorker.run(ChannelRepodataWorker.java:104)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:761)
INFO | jvm 1 | 2011/07/14 01:23:05 |   at java.lang.Thread.run(Thread.java:736)
ERROR | wrapper | 2011/07/14 01:23:07 | JVM appears hung: Timed out waiting for signal from JVM.
ERROR | wrapper | 2011/07/14 01:23:07 | JVM did not exit on request, terminated

Even if you give taskomatic 1500 MB, it still crashes. Taskomatic is then restarted, pushing the load on the server above 5.

Expected results:
No errors; repomd regenerated.
This is caused by PackageCapabilityIterator.java, which loads all records from rhnPackageCapability for the given channel into memory. That is more than 4 million records in my case. Fixed in spacewalk.git in commit b803f6ef78362eb34d2cfc07ca0c400e40b89dd9
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Previously, during repomd creation taskomatic loaded all capabilities of all packages of a channel into memory. This sometimes caused OutOfMemory errors. Now only 200 records are loaded into memory at a time, and OutOfMemory errors no longer occur even for large channels.
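The idea behind the fix is to iterate over the result set in fixed-size batches instead of materializing all rows up front. Below is a minimal, hypothetical sketch of such a batched iterator; the class, names, and offset-based fetching are illustrative only and are not the actual code from commit b803f6ef78362eb34d2cfc07ca0c400e40b89dd9:

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical sketch: walk a large result set while holding at most
// `batchSize` records in memory at once, rather than loading everything
// as the original PackageCapabilityIterator did.
class BatchedIterator<T> implements Iterator<T> {

    // fetchBatch maps an offset to the next slice of up to batchSize rows,
    // e.g. backed by a SQL query with "LIMIT batchSize OFFSET offset".
    private final Function<Integer, List<T>> fetchBatch;
    private final int batchSize;
    private int offset = 0;
    private List<T> current = Collections.emptyList();
    private int pos = 0;
    private boolean exhausted = false;

    BatchedIterator(Function<Integer, List<T>> fetchBatch, int batchSize) {
        this.fetchBatch = fetchBatch;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (pos < current.size()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // Current batch consumed: fetch the next one.
        current = fetchBatch.apply(offset);
        offset += current.size();
        pos = 0;
        // A short batch means the source has no more rows after this one.
        if (current.size() < batchSize) {
            exhausted = true;
        }
        return !current.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.get(pos++);
    }
}
```

With the PostgreSQL JDBC driver a similar effect can also be had without manual offsets, by disabling autocommit and calling Statement.setFetchSize(200) so the driver streams rows via a server-side cursor; whether the actual fix used that mechanism or explicit batching is not stated in this report.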
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2011-1162.html