Bug 873313
Summary: | Very high memory usage during repo sync | | |
---|---|---|---|
Product: | [Retired] Pulp | Reporter: | Preethi Thomas <pthomas> |
Component: | rpm-support | Assignee: | Todd Sanders <tsanders> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Preethi Thomas <pthomas> |
Severity: | unspecified | Priority: | unspecified |
Version: | 2.0.6 | CC: | jason.dobies, jmatthew, jsherril, mhrivnak, rbarlow, strobert, tsanders |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | 2.1.1 | Doc Type: | Bug Fix |
Hardware: | Unspecified | OS: | Unspecified |
Clone Of: | 872190 | Last Closed: | 2013-05-08 14:08:33 UTC |
Type: | Bug | Story Points: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | Category: | --- |
oVirt Team: | --- | Cloudforms Team: | --- |
Bug Depends On: | 872190 | Bug Blocks: | 854726 |
Description
Preethi Thomas
2012-11-05 14:35:00 UTC
*** Bug 883612 has been marked as a duplicate of this bug. ***

It does look like the patch Michael Hrivnak pointed to is the one that led to the memory usage. From looking at what goes into units_rpm, it is a large chunk of information. I did the mongodb dumps while looking into bug 885264 and saw the fairly large objects. I'm wondering if it wouldn't be better to trim down what pulp stores in the db and invoke 'createrepo --update' during publish. The '--update' flag does an update, so just adding a few RPMs runs fairly quickly. We have used it for a bit on our current yum repo management system.

Originally, the publish was pulling in all of the information for all of the RPMs in one giant bulk DB hit. That's one reason memory usage was climbing so rapidly. There have been changes both to pull only the data that's needed when it's needed and to chunk the processing, so instead of pulling the entire repository contents into memory, we pull smaller batches of RPMs at a time.

I've seen the 500-RPM publish batches; I don't think that is the largest memory hit, though. Is there a fix in port 2.0.6-0.14.beta? I am still seeing the large memory hit in that version. I did a fresh rhn-rhels6 pull on a fresh box yesterday running that version and still saw the memory hit. I needed to restart httpd after the sync operation to get some RAM back (this was only a 3GB machine). And due to bug 885264 the publish aborted after 500 RPMs, so since it didn't complete the publish, I don't think the memory issue is in the publish.

FYI, on 2.0.6-0.17.beta the memory footprint is still fairly large: VSZ > 3GB, RSS ~1.2 GB when doing the big repo (rhel/centos) pulls. I didn't see anything in the install guide prerequisites about needed memory. At home it is a little tougher to do the 3GB, but doable. At work it is really okay to do the bigger box. Just wondering if that is a normal size or if I am seeing something odd...

FYI, on 2.0.7-0.1.beta, after doing a few syncs (of just updates) from Red Hat's CDN, pulp-server RAM is at about 6GB.
I'm running on a RHEL6 64-bit machine, currently with 6GB RAM and 4 CPUs; probably going to up it to 8GB shortly. Also thinking of tossing a '/sbin/service httpd restart' into cron.

If it comes to that, I recommend considering a graceful restart with "apachectl graceful". I doubt the service restart is so kind.

(In reply to comment #7)
> Also thinking of tossing a '/sbin/service httpd restart' into cron.

True, doing a "/sbin/service httpd graceful" would be a bit less rude :) If there is anything I can provide to help test/debug this, let me know. It happens quite reliably when I sync the bigger repos.

This still seems to be a pretty major issue; should this still be ON_QA? FYI, I'm seeing OOM errors while syncing ~6 large repos sequentially on a machine with 12 GB of RAM. The httpd processes gradually rise in memory consumption until they get killed. I have not yet been able to sync all 6 large repos.

With all the comments above I am moving this back to assigned.

We've made changes to the YumImporter and Grinder; the wiki page below has stats on the improvement.
https://fedorahosted.org/pulp/wiki/PerformanceTesting/SyncPerformance_v2_Improvements

With these changes we are seeing a sync of RHEL 6.2 (7k packages) complete in ~35 minutes with memory numbers around RSS: ~550MB-800MB.

Pull requests with the changes:
https://github.com/pulp/grinder/pull/2
https://github.com/pulp/pulp_rpm/pull/159

Definitely a lot better. Some rough numbers from doing repo (re-)syncs:
beta 26: 5 hours, 4G RAM in use
beta 26 + patches from comment #98: 3 hours, 1.5 GB in use
patched as above and num-threads set to 4 per repo: 2 hr 40 min, 1.5 GB in use

So memory usage seems to be really under control with the new version. And sync duration is a lot better too.

Did a fresh repo sync of rhels6 (10238 RPMs) on 0.26-beta with num-threads set to 4 on the repo. It took 39 minutes; the overall download itself was about 10 minutes. Memory for wsgi:pulp seems to match John's number: VSS 2340m, RSZ 845m. Re-sync took 9 min.
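The batched processing described earlier in this thread (pulling smaller batches of RPMs instead of the whole repository in one bulk DB hit) can be illustrated with a minimal Python sketch. This is not the Pulp or Grinder code: `batched`, `fake_unit_cursor`, and `publish_in_batches` are hypothetical names, the cursor is a stand-in generator rather than a real MongoDB query, and the batch size of 500 is borrowed only from the publish chunks mentioned above.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_unit_cursor(total):
    """Hypothetical stand-in for a lazy database cursor over RPM units."""
    for index in range(total):
        # Only the fields a publish step would need, not the full document.
        yield {"name": "pkg-%d" % index, "filename": "pkg-%d.rpm" % index}


def publish_in_batches(units, batch_size=500):
    """Process units batch by batch; return (total processed, largest batch)."""
    processed = 0
    largest_batch = 0
    for batch in batched(units, batch_size):
        # At most batch_size units are held in memory at this point.
        largest_batch = max(largest_batch, len(batch))
        processed += len(batch)
    return processed, largest_batch
```

Calling `publish_in_batches(fake_unit_cursor(7281))` walks all 7281 units while never holding more than 500 of them at once, which is the property the comments above attribute to the chunked publish.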
Steven,
Thank you for the update, appreciate the time you spent to give the patches a run through :)

build: 2.1.1-0.1.beta

Moving to verified

[root@preethi ~]# rpm -q pulp-server
pulp-server-2.1.1-0.4.beta.fc17.noarch
[root@preethi ~]#
[root@preethi ~]# pulp-admin rpm repo create --repo-id rhel6_2 --feed https://cdn.redhat.com/content/dist/rhel/rhui/server/6/6.2/x86_64/os/ --feed-ca-cert CDN/cdn.redhat.com-chain.crt --feed-cert CDN/1359391926_4512.crt --feed-key CDN/1359391926_4512.key
Successfully created repository [rhel6_2]

[root@preethi ~]# pulp-admin rpm repo sync run --repo-id rhel6_2
+----------------------------------------------------------------------+
Synchronizing Repository [rhel6_2]
+----------------------------------------------------------------------+

This command may be exited by pressing ctrl+c without affecting the actual operation on the server.

Downloading metadata...
[-]
^C[root@preethi ~]# time pulp-admin rpm repo sync run --repo-id rhel6_2
+----------------------------------------------------------------------+
Synchronizing Repository [rhel6_2]
+----------------------------------------------------------------------+

A sync task is already in progress for this repository. Its progress will be tracked below.

This command may be exited by pressing ctrl+c without affecting the actual operation on the server.

Downloading metadata...
[|]
... completed

Downloading repository content...
[====================== ] 44%
[==================================================] 100%
RPMs: 7281/7281 items
Delta RPMs: 0/0 items
Tree Files: 6/6 items
Files: 0/0 items
... completed

Importing errata...
[|]
... completed

Importing package groups/categories...
[-]
... completed

Publishing packages...
[==================================================] 100%
Packages: 7281/7281 items
... completed

Publishing distributions...
[==================================================] 100%
Distributions: 6/6 items
... completed

Generating metadata
[-]
... completed

Publishing repository over HTTPS
[-]
... completed

real 156m1.673s
user 3m9.345s
sys 0m7.178s
[root@preethi ~]#
[root@preethi ~]#
[root@preethi ~]# time pulp-admin rpm repo sync run --repo-id rhel6_2
+----------------------------------------------------------------------+
Synchronizing Repository [rhel6_2]
+----------------------------------------------------------------------+

This command may be exited by pressing ctrl+c without affecting the actual operation on the server.

Downloading metadata...
[/]
... completed

Downloading repository content...
[==================================================] 100%
RPMs: 0/0 items
Delta RPMs: 0/0 items
Tree Files: 6/6 items
Files: 0/0 items
... completed

Importing errata...
[\]
... completed

Importing package groups/categories...
[-]
... completed

Publishing packages...
[==================================================] 100%
Packages: 7281/7281 items
... completed

Publishing distributions...
[==================================================] 100%
Distributions: 6/6 items
... completed

Generating metadata
[|]
... completed

Publishing repository over HTTPS
[-]
... completed

real 10m49.861s
user 0m13.677s
sys 0m0.625s
[root@preethi ~]#

2.1.1 released
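For anyone wanting to collect memory figures like the VSZ/RSS numbers quoted in this bug while a sync runs, the following is a rough Python sketch that reads them from /proc. It assumes a Linux /proc filesystem; `parse_status_memory`, `read_process_memory`, and `format_memory` are names made up for illustration and are not part of Pulp. The caller has to supply the PID of the httpd or wsgi process being watched.

```python
import os


def parse_status_memory(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    wanted = ("VmSize", "VmRSS")
    result = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            # The value looks like "  865280 kB"; keep the integer part.
            result[key] = int(value.split()[0])
    return result


def read_process_memory(pid):
    """Read VmSize/VmRSS for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_status_memory(handle.read())


def format_memory(memory_kb):
    """Render the kB figures as megabytes in the style used in this thread."""
    vsz_mb = memory_kb.get("VmSize", 0) / 1024.0
    rss_mb = memory_kb.get("VmRSS", 0) / 1024.0
    return "VSZ %.0f MB, RSS %.0f MB" % (vsz_mb, rss_mb)


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    # Demonstration only: report the memory of this Python process itself.
    print(format_memory(read_process_memory(os.getpid())))
```

Sampling `read_process_memory(pid)` at intervals during a sync would show whether the process keeps growing, as reported for the earlier betas, or levels off.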