
Bug 1185868

Summary: Errata unit copy with recursive copy and dep solve causes worker to balloon memory usage and never free
Product: [Retired] Pulp
Reporter: Justin Sherrill <jsherril>
Component: rpm-support
Assignee: Chris Duryee <cduryee>
Status: CLOSED UPSTREAM
QA Contact: pulp-qe-list
Severity: high
Priority: high
Version: 2.5
CC: cduryee, mhrivnak, skarmark
Keywords: Reopened, Triaged
Target Release: 2.6.1
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Clone Of:
: 1185869 (view as bug list) Environment:
Last Closed: 2015-02-28 23:21:22 UTC
Type: Bug
Bug Blocks: 1185869
Attachments: reproducer script (flags: none)

Description Justin Sherrill 2015-01-26 13:54:50 UTC
Description of problem:

When doing a unit copy of a single erratum from a large source repo, if recursive and resolve_dependencies are both set to true, the pulp worker doing the copy balloons to a very large memory size and never frees that memory, even after the unit copy has completed.

post "https://localhost/pulp/api/v2/repositories/new_repo/actions/associate/"

{"source_repo_id":"bigrepo","criteria":{"type_ids":["erratum"],"filters":{"association":{"unit_id":{"$in":["26b996e9-ab64-4f09-ba94-7b846c71573b"]}}}},"override_config":{"recursive":true,"resolve_dependencies":true}}


The source repo in this case was RHEL 6 x86_64, containing ~14400 packages and ~2800 errata.
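
For anyone scripting the request above, here is a minimal sketch that builds the same URL and JSON body. The helper names (build_associate_request, associate_url) are hypothetical and not part of Pulp. Only the URL path and the body fields are taken from this report. Sending the request (authentication, TLS settings) is left out on purpose.

```python
import json


def build_associate_request(source_repo_id, unit_ids,
                            recursive=True, resolve_dependencies=True):
    """Build the JSON body for the associate (unit copy) call.

    Hypothetical helper; the field names mirror the request in this report.
    """
    return {
        "source_repo_id": source_repo_id,
        "criteria": {
            "type_ids": ["erratum"],
            "filters": {
                "association": {"unit_id": {"$in": list(unit_ids)}},
            },
        },
        "override_config": {
            "recursive": recursive,
            "resolve_dependencies": resolve_dependencies,
        },
    }


def associate_url(dest_repo_id, host="localhost"):
    """Return the associate URL for the destination repo (path as in this report)."""
    return "https://%s/pulp/api/v2/repositories/%s/actions/associate/" % (
        host, dest_repo_id)


if __name__ == "__main__":
    body = build_associate_request(
        "bigrepo", ["26b996e9-ab64-4f09-ba94-7b846c71573b"])
    print(associate_url("new_repo"))
    print(json.dumps(body, indent=2))
```

Setting recursive and resolve_dependencies to False in the helper gives a quick way to compare the worker's memory usage with and without dependency solving.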


USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   28498  0.0 25.3 1843224 1483900 ?     S    Jan23   0:28 /usr/bin/python -m celery.__main__ worker -c 1 -n reserved_resource_worker-2.redhat.com --events --app=


Version-Release number of selected component (if applicable):
2.5.1

How reproducible:
always

Steps to Reproduce:
1. Perform a recursive erratum unit copy with dependency resolution turned on.
2. Watch per-process memory usage: "watch 'ps aux | sort -nk 4 | tail'" shows the top processes by memory usage.
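
As an alternative to eyeballing the watch output in step 2, the sketch below parses "ps aux" text and returns the processes using the most memory. The function name top_by_mem is hypothetical. It assumes the standard 11-column "ps aux" layout shown in this report and skips any line it cannot parse, including the header.

```python
import subprocess


def top_by_mem(ps_output, count=10):
    """Return the `count` processes with the highest %MEM from `ps aux` text."""
    rows = []
    for line in ps_output.strip().splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        try:
            row = {
                "user": fields[0],
                "pid": int(fields[1]),
                "mem_percent": float(fields[3]),
                "rss_kib": int(fields[5]),
                "command": fields[10],
            }
        except ValueError:
            # Header line or anything else that is not a process row.
            continue
        rows.append(row)
    rows.sort(key=lambda r: r["mem_percent"], reverse=True)
    return rows[:count]


if __name__ == "__main__":
    output = subprocess.check_output(["ps", "aux"], universal_newlines=True)
    for proc in top_by_mem(output, count=5):
        print("%(pid)7d %(mem_percent)5.1f%% %(rss_kib)9d KiB  %(command)s" % proc)
```

Running this in a loop before, during, and after the copy gives a simple record of how the worker's RSS changes over time.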


Actual results:
The pulp worker's memory usage climbs quickly and never goes back down.

Expected results:
Memory usage goes back down after the unit copy completes.

Additional info:

Comment 1 Justin Sherrill 2015-02-04 20:47:25 UTC
The erratum in this example was RHBA-2014:1965

Comment 2 Chris Duryee 2015-02-04 21:01:11 UTC
repro steps:

create and sync a rhel6 repo

create a second empty rpm repo (copyrepo in this case)

run this:

pulp-admin rpm repo copy errata --from-repo-id el6 --to-repo-id copyrepo --str-eq="id=RHBA-2014:1965" --recursive

This will either OOM or hold onto a lot of memory.

Comment 3 Chris Duryee 2015-02-05 00:05:11 UTC
I was not able to get this to repro with 4GB mem on 2.6.0. The repo copy (sans "fields" attribute) ate a lot of mem but gave it up at the end of the operation.

If you still have the 2.5.x system that exhibits this behavior, we can take a look at it tomorrow.

Comment 4 Chris Duryee 2015-02-06 18:32:48 UTC
Justin,

I did further research on this yesterday and today. I set up the following scenarios, all with python 2.6:

* copy an erratum from RHEL 6.6 to 6.1
* copy an erratum from RHEL 5.11 to 5.6

Both of these make the memory usage grow, but it appears to be stable after growing. I assume this is from CPython not calling free() after performing these operations.
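
One way to check the assumption above outside of Pulp is a small stand-alone sketch like the following. It is not Pulp code. The helper names are hypothetical, and it assumes a Linux system where /proc/self/status is readable (otherwise it reports None). It allocates many small objects, drops them, and reports resident memory before, at the peak, and afterwards. Whether the final number falls back to the starting value depends on the Python version and the C allocator, so treat the result as an observation rather than proof.

```python
import gc


def rss_kib():
    """Return this process's resident set size in KiB, or None if unavailable."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except (IOError, OSError):
        return None
    return None


def measure(n_objects=200000):
    """Allocate and release many small objects, sampling RSS at each stage."""
    before = rss_kib()
    # Many small dicts, loosely imitating a large set of unit records.
    data = [{"id": i, "name": "pkg-%d" % i} for i in range(n_objects)]
    peak = rss_kib()
    del data
    gc.collect()
    after = rss_kib()
    return {"before": before, "peak": peak, "after": after}


if __name__ == "__main__":
    print(measure())
```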

I'm going to mark this as CLOSED/NOTABUG since we already have a BZ to fix the underlying issue (https://bugzilla.redhat.com/show_bug.cgi?id=1158545). If the amount of memory in this particular scenario becomes problematic (OOMs, etc), let us know and we can try to come up with a workaround. FWIW I was not able to get it to OOM on a 4GB machine.

Comment 5 Chris Duryee 2015-02-09 15:17:44 UTC
reopening bz

Comment 7 Chris Duryee 2015-02-12 23:00:41 UTC
merged to 2.6-dev and master

QE note to repro: comment #2 is incorrect. The first repo needs to be RHEL 6.6 and the second RHEL 6.1. After that, copying with --recursive should do the trick.

Comment 8 Michael Hrivnak 2015-02-13 15:38:57 UTC
Created attachment 991413
reproducer script

This is the script I was using to reproduce the problem. Note the comments at the top that tell you what repos you need to sync before running it.

If you use this on rhel6, you'll have to modify the "systemctl" statement accordingly.

Comment 9 Brian Bouterse 2015-02-28 23:21:22 UTC
Moved to https://pulp.plan.io/issues/676