Bug 971468
Summary: Request to search units times out while memory consumption is high

Product: [Retired] Pulp
Component: user-experience
Status: CLOSED WORKSFORME
Severity: high
Priority: high
Version: 2.1.1
Target Release: 2.1.2
Hardware: Unspecified
OS: Unspecified
Reporter: Mike McCune <mmccune>
Assignee: Michael Hrivnak <mhrivnak>
QA Contact: Preethi Thomas <pthomas>
CC: jsherril, mmccune, skarmark
Keywords: Triaged
Type: Bug
Doc Type: Bug Fix
Clone Of: 960278
Bug Depends On: 960278
Bug Blocks: 950743
Last Closed: 2013-06-11 14:31:16 UTC
Description (Mike McCune, 2013-06-06 15:17:32 UTC)
We need some more context. What is the query you're running? The answer may be to limit the fields that are returned. For instance, I doubt you care to download the repomd snippets.

We actually do trim it to certain fields. For packages, for example:

    ['name', 'version', 'release', 'arch', 'suffix', 'epoch', 'download_url',
     'checksum', 'checksumtype', 'license', 'group', 'children', 'vendor',
     'filename', 'relativepath', 'requires', 'provides', 'description', 'size',
     'buildhost', '_id', '_content_type_id', '_href', '_storage_path', '_type']

I'm fairly certain the issue is with the large memory footprint, as the query works fine after a bounce of httpd. I can give you an exact query in a moment.

The memory footprint you mention, is that what's left hanging around after the syncs? If you do just steps b-d, is it still an issue?

We first do a POST to /pulp/api/v2/repositories/ACME_Corporation-DEV-View_62-TestProduct1-TestRepo/search/units/:

    {"criteria": {"type_ids": ["rpm"],
                  "fields": {"unit": [], "association": ["unit_id"]}}}

We then take the unit_ids from that list and fetch 200 at a time:

POST pulp/api/v2/content/units/rpm/search/

    {
      "criteria": {
        "fields": [
          "name", "version", "release", "arch", "suffix", "epoch",
          "download_url", "checksum", "checksumtype", "license", "group",
          "children", "vendor", "filename", "relativepath", "requires",
          "provides", "description", "size", "buildhost", "_id",
          "_content_type_id", "_href", "_storage_path", "_type"
        ],
        "filters": {
          "_id": {
            "$in": [
              "a9a56b83-e046-4f38-8523-c2e5a0719d8f",
              ".......TRIM.......",
              "13913de5-c74b-4228-822a-5df06501df0a"
            ]
          }
        }
      },
      "include_repos": true
    }

I am fairly certain the majority of the memory 'gain' is from steps B and C. Step A by itself does not seem to cause all that much.

Ok, let's take a look at the copy calls then. You're applying the field limiting that Mike Hrivnak told you about to the copy calls as well, correct?

These are the 4 unit copy calls we make for the 'content view publish':

    POST https://abed.usersys.redhat.com/pulp/api/v2/repositories/ACME_Corporation-Library-AnotherTestView-TestProduct1-TestRepo/actions/associate/
    {"source_repo_id": "ACME_Corporation-TestProduct1-TestRepo",
     "criteria": {"type_ids": ["rpm"], "filters": {},
                  "fields": {"unit": ["name", "epoch", "version", "release",
                                      "arch", "checksumtype", "checksum"]}}}

    POST https://abed.usersys.redhat.com/pulp/api/v2/repositories/ACME_Corporation-Library-AnotherTestView-TestProduct1-TestRepo/actions/associate/
    {"source_repo_id": "ACME_Corporation-TestProduct1-TestRepo",
     "criteria": {"type_ids": ["distribution"], "filters": {}}}

    POST https://abed.usersys.redhat.com/pulp/api/v2/repositories/ACME_Corporation-Library-AnotherTestView-TestProduct1-TestRepo/actions/associate/
    {"source_repo_id": "ACME_Corporation-TestProduct1-TestRepo",
     "criteria": {"type_ids": ["erratum"], "filters": {}},
     "override_config": {"copy_children": false}}

    POST https://abed.usersys.redhat.com/pulp/api/v2/repositories/ACME_Corporation-Library-AnotherTestView-TestProduct1-TestRepo/actions/associate/
    {"source_repo_id": "ACME_Corporation-TestProduct1-TestRepo",
     "criteria": {"type_ids": ["package_group"], "filters": {}},
     "override_config": {"copy_children": false}}

As you can see, we only specify the fields for RPMs.

Created attachment 757932 [details]
Publish log
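For readers following along, the two-step flow described above (an association search for unit IDs, then batched fetches of full units) can be sketched roughly as follows. This is a minimal illustration, not Katello's actual client code; the host, credentials, and trimmed field list are placeholders, and it assumes the Python requests library:

    # Rough sketch of the two-step flow described above (not Katello's
    # actual client). Host, repo id, and credentials are placeholders.
    import requests

    BASE = "https://pulp.example.com/pulp/api/v2"
    REPO = "ACME_Corporation-DEV-View_62-TestProduct1-TestRepo"
    AUTH = ("admin", "admin")   # placeholder credentials
    PAGE_SIZE = 200             # units fetched per request, per the comment above

    # Step 1: ask the association search for unit_ids only.
    resp = requests.post(
        BASE + "/repositories/" + REPO + "/search/units/",
        json={"criteria": {"type_ids": ["rpm"],
                           "fields": {"unit": [], "association": ["unit_id"]}}},
        auth=AUTH, verify=False)
    unit_ids = [assoc["unit_id"] for assoc in resp.json()]

    # Step 2: fetch the full unit documents in fixed-size batches,
    # limiting the returned fields (a trimmed placeholder list here).
    fields = ["name", "version", "release", "arch", "epoch",
              "checksum", "checksumtype", "filename", "_id"]
    units = []
    for start in range(0, len(unit_ids), PAGE_SIZE):
        batch = unit_ids[start:start + PAGE_SIZE]
        resp = requests.post(
            BASE + "/content/units/rpm/search/",
            json={"criteria": {"fields": fields,
                               "filters": {"_id": {"$in": batch}}},
              "include_repos": True},
            auth=AUTH, verify=False)
        units.extend(resp.json())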
Created attachment 757933 [details]
pulp refresh log with single-type unassociate.
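The fix discussed in the next comment amounts to passing the same field-limited criteria to the unassociate call that the first associate call above already uses. A minimal sketch, assuming the unassociate endpoint mirrors the associate path shown earlier; host and repo id are placeholders:

    # Sketch of a field-limited RPM unassociate request (endpoint assumed
    # to mirror the associate one shown earlier; host/repo are placeholders).
    import requests

    repo = "ACME_Corporation-Library-AnotherTestView-TestProduct1-TestRepo"
    requests.post(
        "https://pulp.example.com/pulp/api/v2/repositories/" + repo
        + "/actions/unassociate/",
        json={"criteria": {
            "type_ids": ["rpm"],
            "filters": {},
            # Limiting unit fields keeps the server from materializing full
            # RPM documents (with large requires/provides lists) in memory.
            "fields": {"unit": ["name", "epoch", "version", "release",
                                "arch", "checksumtype", "checksum"]}}},
        auth=("admin", "admin"), verify=False)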
Confirmed that unassociating RPMs takes a tremendous amount of RAM. You can pass the same criteria (with fields specified) as with the associate operation to keep the memory footprint down. Can you try adding the "fields" to RPM unassociate and see if this problem still exists?

So the memory issue seems to have gone away with the

a) sync RHEL 6.1, 6.2, 6.3, 6.4, 6server
b) copy them all to new repos
c) copy them all to new repos again (potentially repeat this a few times)
d) within each repo, request all the unit_ids of all rpms (see comment #4)
e) from that list of rpms, request the full units from unit_search (see comment #4)

These steps result in: at e), one of the requests for units will time out (take more than 30 seconds to complete). This isn't 100% reproducible, but seems to happen fairly often.

errr, missed the end of the last sentence: So the memory issue seems to have gone away with the unit unassociate change. Thanks!

Timeouts seem to still occur, at least for me.

Created attachment 759227 [details]
apache log for unit fetch requests
This is from /var/log/httpd/ssl_access_log, showing the activity for a series of requests identical to what katello does. The first request gets the unit IDs for all RPMs in a rhel6 repo, and then details about those are fetched 200 at a time. I am requesting the same fields as the katello requests.
Notice that one of the requests has a much larger response size than the others. On my system, both apache and my python client work very hard to generate, serialize, send, and deserialize that response, taking just over 30 seconds consistently.
You can see from the time stamps that this particular request took about 31 seconds, whereas most of the requests took less than 1 second.
I was able to eliminate the time delay of this particular request by not asking for the "provides" or "requires" fields. In that case, I was able to execute all of the requests in a total of 19s, compared to 64s previously. The particularly large response, previously around 46MB, is now 166KB.
I'm not sure if this is related to your timeouts, but it seems like a reasonable suspect. If you can manage without the requires and provides fields, that would help a lot. Otherwise, requesting smaller chunks may help.
Please advise if I should look into this further, or if you are satisfied to proceed with the advice given so far (specifying fields for unassociate, and the above notes about "requires" and "provides"). I am not able to reproduce the timeouts, so I would need more detail or access to a machine where it is reproducible in order to dig in further.

Michael,

So we currently index provides and requires information to make searching easier. We could reduce the page size that we request and see if that helps. I will try cutting it in half, down to ~100 units at a time. Thanks.
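Taken together, the two mitigations discussed in this thread would amount to small adjustments to the batched fetch sketched earlier; a hypothetical illustration:

    # Hypothetical adjustments to the earlier fetch sketch, combining the
    # two mitigations discussed above.
    PAGE_SIZE = 100   # halved from 200, as proposed in the last comment

    # Omitting "requires" and "provides" shrank the one oversized response
    # from ~46 MB to ~166 KB in the measurements above.
    fields = ["name", "version", "release", "arch", "suffix", "epoch",
              "download_url", "checksum", "checksumtype", "license", "group",
              "children", "vendor", "filename", "relativepath", "description",
              "size", "buildhost", "_id", "_content_type_id", "_href",
              "_storage_path", "_type"]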