Bug 1876782

Summary: Pulp worker can consume high memory when publishing a repository with large metadata files
Product: Red Hat Satellite
Reporter: Hao Chang Yu <hyu>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED CURRENTRELEASE
QA Contact: Lai <ltran>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.7.0
CC: ahumbe, arahaman, bbuckingham, bmbouter, dgross, dkliban, ggainey, gpadholi, ipanova, jbhatia, jjeffers, jkrajice, jsherril, ktordeur, kupadhya, ldelouw, mmccune, musman, pmoravec, rchan, ttereshc, vsedmik, wclark
Target Milestone: Unspecified
Keywords: PrioBumpGSS, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1890916 1890917 1890919 1899312
Environment:
Last Closed: 2020-11-25 16:49:09 UTC
Type: Bug
Attachments:
Description Flags
HOTFIX RPM for Satellite 6.7.4 none

Description Hao Chang Yu 2020-09-08 07:22:41 UTC
Description of problem:
When publishing a repository with a large metadata file (such as the others.xml.gz file in rhel-7-server-rpms), the Pulp worker can consume more than 3GB of RAM for a few minutes. After that, the memory is freed and usage returns to normal, which is OK.

When calculating the open-size of a metadata file, Pulp opens the gzip file and then reads its entire decompressed content into memory.

plugins/distributors/yum/metadata/repomd.py
---------------------------------------------------------------------
        if file_path.endswith('.gz'):

            open_size_element = ElementTree.SubElement(data_element, 'open-size')

            open_checksum_attributes = {'type': self.checksum_type}
            open_checksum_element = ElementTree.SubElement(data_element, 'open-checksum',
                                                           open_checksum_attributes)

            try:
                file_handle = gzip.open(file_path, 'r')   <============= Here

            except:
                # cannot have an else clause to the try without an except clause
                raise

            else:
                try:
                    content = file_handle.read()
                    open_size_element.text = str(len(content))
                    open_checksum_element.text = self.checksum_constructor(content).hexdigest()

                finally:
                    file_handle.close()
---------------------------------------------------------------------
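
One way to avoid holding the whole decompressed file in memory is to read it in fixed-size chunks while keeping a running byte count and checksum. The following is only a minimal sketch under stated assumptions, not the actual upstream Pulp change: the helper name `open_size_and_checksum`, the `chunk_size` parameter, and the use of `hashlib.new` in place of `self.checksum_constructor` are all hypothetical.

```python
import gzip
import hashlib


def open_size_and_checksum(file_path, checksum_type="sha256", chunk_size=1024 * 1024):
    """Return (uncompressed size, hex digest) of a gzip file, reading it in chunks.

    Hypothetical helper for illustration only; it is not part of Pulp.
    """
    hasher = hashlib.new(checksum_type)
    size = 0
    with gzip.open(file_path, "rb") as file_handle:
        # Request at most chunk_size decompressed bytes per call, so only one
        # chunk is alive at a time; read() returns b"" at end of file.
        for chunk in iter(lambda: file_handle.read(chunk_size), b""):
            size += len(chunk)
            hasher.update(chunk)
    return size, hasher.hexdigest()
```

With a helper like this, the two elements could be filled in as `open_size_element.text = str(size)` and `open_checksum_element.text = digest`, so that peak memory depends on the chunk size rather than on the size of the metadata file.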

This is not much of an issue if the user is syncing only a few repositories. In the case of Satellite, however, users may sync several large repositories at the same time, as happens during an Optimized Capsule sync. If a Capsule has 8 workers and each worker consumes 4GB+ of memory, the Capsule will run out of memory.


Steps to Reproduce:
1. Set Pulp to use only 1 worker so that we can monitor the progress easily.
2. Force a full publish of a rhel-7-server-rpms repository.
3. Use the following command to monitor the memory usage.

watch 'ps -aux | grep reserved_resource_worker-0'

4. The high memory consumption happens when Pulp is finalizing the others.xml.gz file. You can use the following command to monitor the Pulp working directory.

cd /var/cache/pulp/reserved_resource_worker-0@<satellite fqdn>/<pulp task id>/
watch 'ls -alrth'
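
The same effect can be observed outside Pulp with a small standalone experiment. The sketch below is an assumption-laden illustration rather than part of the reproducer: it uses Python 3's `tracemalloc` on a throwaway gzip file of zeros, and the function names (`peak_bytes`, `read_all`, `read_chunked`, `compare`) are hypothetical. It compares the peak Python allocation of reading the whole file at once, as repomd.py does, against reading it in chunks.

```python
import gzip
import os
import tempfile
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args) and return the peak traced Python allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def read_all(path):
    # Mirrors the pattern in repomd.py: the whole decompressed body is held at once.
    with gzip.open(path, "rb") as handle:
        return len(handle.read())


def read_chunked(path, chunk_size=64 * 1024):
    # Only one chunk of decompressed data is alive at a time.
    total = 0
    with gzip.open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            total += len(chunk)
    return total


def compare(uncompressed_size=16 * 1024 * 1024):
    """Build a temporary gzip file and return (peak_read_all, peak_read_chunked)."""
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.open(path, "wb") as handle:
            handle.write(b"\0" * uncompressed_size)
        return peak_bytes(read_all, path), peak_bytes(read_chunked, path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    whole, chunked = compare()
    print("read all: %d bytes peak, chunked: %d bytes peak" % (whole, chunked))
```

The read-all peak should be at least the uncompressed size of the file, while the chunked peak should stay close to the chunk size. Note that `tracemalloc` only tracks allocations made through Python's allocators, so the figures are indicative and will not match the process RSS reported by `ps`.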

Comment 3 pulp-infra@redhat.com 2020-09-10 16:08:23 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 4 pulp-infra@redhat.com 2020-09-10 16:08:24 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 7 pulp-infra@redhat.com 2020-09-14 13:05:45 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 8 pulp-infra@redhat.com 2020-09-14 14:06:05 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 9 wclark 2020-10-07 15:58:03 UTC
Created attachment 1719800
HOTFIX RPM for Satellite 6.7.4

Comment 10 wclark 2020-10-07 16:00:46 UTC
HOTFIX RPM is available for Satellite 6.7.4

INSTALLATION INSTRUCTIONS:

1. Download the attached hotfix RPM to each affected Satellite and Capsule server

2. # yum install ./pulp-rpm-plugins-2.21.0.6-2.HOTFIXRHBZ1876782.el7sat.noarch.rpm --disableplugin=foreman-protector

3. # satellite-maintain service restart

Comment 12 pulp-infra@redhat.com 2020-11-02 16:08:27 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 13 James Jeffers 2020-11-18 21:01:30 UTC
*** Bug 1890919 has been marked as a duplicate of this bug. ***

Comment 14 Mike McCune 2020-11-25 16:49:09 UTC
This is resolved in 6.8.1:

https://bugzilla.redhat.com/show_bug.cgi?id=1890916#c13

Please upgrade if you need this bug resolved.

Comment 18 Mike McCune 2021-02-09 05:15:35 UTC
The hotfix mentioned above is also applicable to and usable on 6.7.5; feel free to apply it.