Bug 975220
| Summary: | createrepo 0.9.9 fails with MemoryError | | |
|---|---|---|---|
| Product: | [Retired] Beaker | Reporter: | Stefanie Forrester <dakini> |
| Component: | scheduler | Assignee: | Dan Callaghan <dcallagh> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | tools-bugs <tools-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 0.12 | CC: | aigao, asaha, azelinka, bturner, dcallagh, dkutalek, jburke, lmohanty, mmalik, qwan, rmancy, tools-bugs |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-08-22 00:42:57 UTC | Type: | Bug |
Description
Stefanie Forrester
2013-06-17 20:27:24 UTC
If createrepo is leaking files to /var/tmp for no reason then it's really a createrepo bug. But I will look into this further.

It seems like the actual problem is that createrepo was using too much memory and bailing out without cleaning up its temp files in /var/tmp. So the disk filling up was just a side-effect of that.

    2013-06-17 11:50:37,757 bkr.server.xmlrpccontroller ERROR Error handling XML-RPC method
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/bkr/server/xmlrpccontroller.py", line 54, in RPC2
        response = self.process_rpc(method,params)
      File "/usr/lib/python2.6/site-packages/bkr/server/xmlrpccontroller.py", line 43, in process_rpc
        response = obj(*params)
      File "<string>", line 3, in upload
      File "/usr/lib/python2.6/site-packages/turbogears/identity/conditions.py", line 249, in require
        return fn(self, *args, **kwargs)
      File "/usr/lib/python2.6/site-packages/bkr/server/tasks.py", line 205, in upload
        Task.update_repo()
      File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6351, in update_repo
        % (retcode, output))
    ValueError: createrepo failed with exit status 1:
    Traceback (most recent call last):
      File "/usr/share/createrepo/genpkgmetadata.py", line 291, in <module>
        main(sys.argv[1:])
      File "/usr/share/createrepo/genpkgmetadata.py", line 265, in main
        mdgen.doPkgMetadata()
      File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 412, in doPkgMetadata
        self.writeMetadataDocs(packages)
      File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 567, in writeMetadataDocs
        self.flfile.write(old_po.xml_dump_filelists_metadata())
      File "/usr/lib/python2.6/site-packages/yum/packages.py", line 1215, in xml_dump_filelists_metadata
        msg += misc.to_unicode(self._dump_files())
    MemoryError

Our production Beaker instance has 20382 task RPMs totalling 2.3GB.
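
The failure mode described above (a subprocess dies partway and leaves temp files behind) can be contained by the caller. The following is only a sketch, not Beaker's actual `update_repo` code: the function name `run_with_scratch_dir` is hypothetical, and it assumes the child honours the `TMPDIR` environment variable, which has not been verified for createrepo. It mirrors the "failed with exit status N" message shape seen in the traceback and always removes the scratch directory.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_with_scratch_dir(argv, cwd=None):
    """Run argv with a private TMPDIR that is removed even if the command fails.

    Hypothetical helper: assumes the child writes its temp files under TMPDIR.
    """
    scratch = tempfile.mkdtemp(prefix="repo-scratch-")
    env = dict(os.environ, TMPDIR=scratch)
    try:
        proc = subprocess.Popen(argv, cwd=cwd, env=env,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output = proc.communicate()[0].decode("utf-8", "replace")
        if proc.returncode != 0:
            # Same message shape as the ValueError in the traceback above.
            raise ValueError("%s failed with exit status %d:\n%s"
                             % (os.path.basename(argv[0]),
                                proc.returncode, output))
        return output
    finally:
        # Runs on success and on failure, so nothing is left behind.
        shutil.rmtree(scratch, ignore_errors=True)
```

A caller would pass the real command line, for example `run_with_scratch_dir(["createrepo", "-q", "--update", "--checksum", "sha", "."], cwd=repo_dir)`, where `repo_dir` is whatever directory holds the task RPMs.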
I tried experimenting with running createrepo on RHEL6 against a copy of the task RPMs from production, using RLIMIT_AS 1500000000:

    ulimit -v 1464843
    /usr/bin/time createrepo -q --update --checksum sha .

With createrepo-0.9.8-4.el6.noarch:

    13.28user 0.57system 0:14.57elapsed 95%CPU (0avgtext+0avgdata 2757840maxresident)k
    4560inputs+17360outputs (0major+173449minor)pagefaults 0swaps

With createrepo-0.9.9-17.el6.noarch:

    88.45user 2.73system 1:38.56elapsed 92%CPU (0avgtext+0avgdata 865264maxresident)k
    334392inputs+326152outputs (0major+87653minor)pagefaults 0swaps

So with createrepo 0.9.9 the memory usage appears to actually be *lower*, but that's just because the real work is done in a worker process now. I'm not sure of a good way to find the max resident size of the worker process. But neither version is hitting MemoryError in my environment.

(Note that GNU time has a bug [1] where the max resident size is misreported by a factor of 4 so the max resident size for createrepo 0.9.8 is 673MB.)

So I'm not sure why it would be using more memory when it runs in the production environment.

[1] https://groups.google.com/forum/?fromgroups#!topic/gnu.utils.help/u1MOsHL4bhg

I have had absolutely no success in reproducing this MemoryError from createrepo, even when run inside the Beaker application in mod_wsgi with a copy of the task library from our production Beaker. I can run createrepo 0.9.9 successfully with ulimit -v as low as 520000.
At 500000 I get the following stack trace, which looks completely different from the one above:

    Traceback (most recent call last):
      File "/usr/share/createrepo/genpkgmetadata.py", line 291, in <module>
        main(sys.argv[1:])
      File "/usr/share/createrepo/genpkgmetadata.py", line 269, in main
        mdgen.doRepoMetadata()
      File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 1003, in doRepoMetadata
        compressFile(resultpath, result_compressed, compress_type)
      File "/usr/lib/python2.6/site-packages/createrepo/utils.py", line 108, in compressFile
        bzipFile(source, dest)
      File "/usr/lib/python2.6/site-packages/createrepo/utils.py", line 58, in bzipFile
        destination = bz2.BZ2File(dest, 'w', compresslevel=9)
    MemoryError

Also, Beaker was running happily for several days with createrepo 0.9.9 (including accepting new task uploads into the task library) before things first went bad at 2013-06-17 08:30:36 UTC. So there must have been some other complicating factor which was causing createrepo to get MemoryError. I'm out of ideas though.
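
The numbers in the `ulimit -v` experiment above can be rechecked, and the experiment repeated from Python, with a short sketch. The helper names here are hypothetical. `ulimit -v` takes KiB, so 1500000000 bytes becomes 1464843; dividing the reported 2757840 by the factor of 4 and by 1024 gives about 673 MiB. For the worker process, getrusage(2) on Linux documents that `RUSAGE_CHILDREN` covers waited-for descendants and that `ru_maxrss` is then the resident size of the largest one, so this may capture the worker, but only if createrepo waits for it; that is an assumption, not something confirmed in this bug.

```python
import resource
import subprocess
import sys


def ulimit_v_kib(limit_bytes):
    """Convert an RLIMIT_AS value in bytes to the KiB unit used by `ulimit -v`."""
    return limit_bytes // 1024


def corrected_maxrss_mib(reported_kb, overreport_factor=4):
    """Undo the factor-of-4 misreport described above and convert KiB to MiB."""
    return reported_kb / float(overreport_factor) / 1024


def run_with_address_space_limit(argv, limit_bytes):
    """Run argv under RLIMIT_AS; return (exit status, ru_maxrss of children).

    ru_maxrss is in kilobytes on Linux and is cumulative over every child
    this process has waited for so far, not just the last one.
    """
    def set_limit():
        # Executed in the child just before exec; never raise the hard limit.
        soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard != resource.RLIM_INFINITY:
            new_soft = min(limit_bytes, hard)
        else:
            new_soft = limit_bytes
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))

    proc = subprocess.Popen(argv, preexec_fn=set_limit)
    proc.wait()
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss
```

For example, `run_with_address_space_limit(["createrepo", "-q", "--update", "--checksum", "sha", "."], 1500000000)` would repeat the run above without relying on GNU time's reporting.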
Dan, I hit this problem a few days earlier, on Saturday June 15 at about 6 UTC:

    php-CoreOS-php-security-CVE-2007-1285-1.5-3.noarch.rpm
    Exception: <Fault 1: '<type \'exceptions.ValueError\'>:createrepo failed with exit status 1:\nTraceback (most recent call last):\n File "/usr/share/createrepo/genpkgmetadata.py", line 291, in <module>\n main(sys.argv[1:])\n File "/usr/share/createrepo/genpkgmetadata.py", line 265, in main\n mdgen.doPkgMetadata()\n File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 412, in doPkgMetadata\n self.writeMetadataDocs(packages)\n File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 567, in writeMetadataDocs\n self.flfile.write(old_po.xml_dump_filelists_metadata())\n File "/usr/lib/python2.6/site-packages/yum/packages.py", line 1215, in xml_dump_filelists_metadata\n msg += misc.to_unicode(self._dump_files())\nMemoryError\n'>
    make: *** [bkradd] Error 1

I was running make bkradd (i.e. adding a task package) for 38 tests; the last 4 failed with this error. About an hour or two later, I tried two of them once more manually and they failed again. The same day, about 10 hours later (about 16 UTC), I tried loading the Beaker web GUI but repeatedly got 500 Internal Server Error. I am not sure whether this is a connected issue though. I decided not to file a ticket/bug because it was the weekend :-o.
I am also getting this createrepo issue, along with "500 Internal server error":

    + '[' -d /home/msvbhat/rpmbuild/BUILDROOT/rhs-tests-rhs-tests-beaker-rhs-gluster-qe-libs-dev-vbhat-1.32.9-0.x86_64 ']'
    + rm -rf /home/msvbhat/rpmbuild/BUILDROOT/rhs-tests-rhs-tests-beaker-rhs-gluster-qe-libs-dev-vbhat-1.32.9-0.x86_64
    + exit 0
    /mnt/testarea/rhts-build-CflehXkC/install /work/rhs-tests/beaker/rhs/gluster-qe-libs
    /home/msvbhat/.beaker_client/config not found, using /etc/beaker/client.conf
    rhs-tests-rhs-tests-beaker-rhs-gluster-qe-libs-dev-vbhat-1.32.9-0.noarch.rpm
    Exception: <Fault 1: '<type \'exceptions.ValueError\'>:createrepo failed with exit status 1:\nTraceback (most recent call last):\n File "/usr/share/createrepo/genpkgmetadata.py", line 291, in <module>\n main(sys.argv[1:])\n File "/usr/share/createrepo/genpkgmetadata.py", line 265, in main\n mdgen.doPkgMetadata()\n File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 412, in doPkgMetadata\n self.writeMetadataDocs(packages)\n File "/usr/lib/python2.6/site-packages/createrepo/__init__.py", line 567, in writeMetadataDocs\n self.flfile.write(old_po.xml_dump_filelists_metadata())\n File "/usr/lib/python2.6/site-packages/yum/packages.py", line 1215, in xml_dump_filelists_metadata\n msg += misc.to_unicode(self._dump_files())\nMemoryError\n'>
    make: *** [bkradd] Error 1

Please note that we are in heavy development of Beaker tasks and this greatly affects our ability to develop automation. If the logs are filling up the system, can we be sure to implement a workaround that doesn't affect our ability to add/update packages?

Just now I tried to upload a new package again and it worked fine. Maybe somebody cleaned the temp files in /var/tmp in production, so it worked.

Thanks Dan, so you were unable to duplicate this issue on another beaker instance? Do you think this is an environment-specific error, rather than a createrepo bug?
I'll do some testing on stage and possibly even set up another beaker instance just to compare. It'll take time though... we have a lot of other priorities, and a temporary fix is in place.

Lalatendu, I cleaned up the files yesterday and downgraded the createrepo package. It's stable now and shouldn't cause any more problems. That'll hold while we figure out the root cause of the issue.

(In reply to David Kutálek from comment #5)

This is a useful data point, thanks David. The server logs were unfortunately truncated at 2013-06-16 23:36:49 (I guess in an attempt to recover disk space) so I didn't see your failure from 2013-06-15. But nevertheless there were some successful task uploads between 2013-06-16 23:36:49 and 2013-06-17 08:30:36 so there must have been some other factor which came into play causing the MemoryErrors.

(In reply to Stefanie Forrester from comment #8)
> Dan, so you were unable to duplicate this issue on another beaker instance?
> Do you think this is an environment-specific error, rather than a createrepo
> bug?

The only bug I can see with createrepo is that it uses 600MB of memory to generate 18MB of repodata, which is not very efficient. It doesn't leak anything in /var/tmp *unless* it dies in the middle with MemoryError.

So there is definitely something environment-specific on beaker-02 which is causing either:
(a) createrepo to use far more memory when run on beaker-02 than on my systems; or
(b) RLIMIT_AS to be set lower than I think it is; or
(c) malloc to return NULL in createrepo for some reason other than exceeding RLIMIT_AS.

I'm pretty sure it has to be (a) but I can't come up with any reasons why it would be happening.

Closing this since we could never reproduce the MemoryError in createrepo, and we have been successfully using createrepo_c (which uses much less memory and is faster) in production for several months now instead.
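
Hypothesis (b) above can be checked directly on Linux, where `/proc/<pid>/limits` lists the limits a running process actually has. This is only a sketch with hypothetical helper names; it was not part of the investigation recorded here, and the pid to inspect (for example a mod_wsgi daemon process) has to be supplied by whoever runs it.

```python
def parse_address_space_limit(limits_text):
    """Return (soft, hard) from the 'Max address space' row of /proc/<pid>/limits.

    Each value is an int in bytes, or None when the kernel reports 'unlimited'.
    """
    label = "Max address space"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]

            def to_int(value):
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max address space' line found")


def address_space_limit_of(pid):
    """Read the effective RLIMIT_AS of a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as f:
        return parse_address_space_limit(f.read())
```

Comparing the soft value returned by `address_space_limit_of(pid)` for the server process with the 1500000000 used in the experiments would confirm or rule out a lower-than-expected limit.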