Bug 2061224 - MemoryError when importing large repo to disconnected Satellite [NEEDINFO]
Summary: MemoryError when importing large repo to disconnected Satellite
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.10.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: 6.11.2
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-03-07 01:55 UTC by Hao Chang Yu
Modified: 2022-08-30 19:43 UTC (History)
CC List: 18 users

Fixed In Version: tfm-pulpcore-python-pulpcore-3.16.8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2103102
Environment:
Last Closed: 2022-08-30 19:43:07 UTC
Target Upstream Version:
pulp-infra: needinfo? (dkliban)
bbuckingham: needinfo? (dkliban)



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | pulp pulpcore issues 2307 | 0 | None | closed | Memory error when importing large repository | 2022-05-11 16:45:20 UTC
Red Hat Product Errata | RHBA-2022:6233 | 0 | None | None | None | 2022-08-30 19:43:20 UTC

Description Hao Chang Yu 2022-03-07 01:55:11 UTC
Description of problem:
Importing a large repository, such as the rhel-7-server-rpms repo, fails with a MemoryError.

The PackageResource.json file of the rhel-7-server-rpms repo is about 5.5 GB. Pulp decodes the whole file in one call with the "json.load" method. Decoding uses 18 GB or more of memory and returns a Python dictionary of about 5.5 GB. If the system does not have enough memory, decoding fails with a MemoryError.
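
The memory growth described above comes from reading and decoding the whole file at once. As an illustration only, the sketch below decodes a file one record at a time with the standard library's json.JSONDecoder.raw_decode. It assumes the file is a single top-level JSON array of records, which this report does not confirm, and the helper name iter_json_array is hypothetical. It is not the change shipped in pulpcore.

```python
import io
import json


def iter_json_array(fp, chunk_size=64 * 1024):
    """Yield the elements of a top-level JSON array one at a time.

    Hypothetical helper: only chunk_size characters plus the element being
    decoded are held in memory. A malformed element makes the helper read
    ahead to the end of the input before it raises.
    """
    decoder = json.JSONDecoder()
    buf, pos, eof = "", 0, False

    def refill():
        # Drop text that was already consumed and append the next chunk.
        nonlocal buf, pos, eof
        chunk = fp.read(chunk_size)
        if not chunk:
            eof = True
        buf = buf[pos:] + chunk
        pos = 0

    def skip_ws():
        # Advance past whitespace, reading more input when the buffer runs out.
        nonlocal pos
        while True:
            while pos < len(buf) and buf[pos] in " \t\r\n":
                pos += 1
            if pos < len(buf) or eof:
                return
            refill()

    skip_ws()
    if pos >= len(buf) or buf[pos] != "[":
        raise ValueError("expected a top-level JSON array")
    pos += 1
    first = True
    while True:
        skip_ws()
        if pos >= len(buf):
            raise ValueError("unexpected end of input")
        if buf[pos] == "]":
            return
        if not first:
            if buf[pos] != ",":
                raise ValueError("expected ',' between elements")
            pos += 1
            skip_ws()
        first = False
        while True:
            try:
                value, end = decoder.raw_decode(buf, pos)
            except json.JSONDecodeError:
                if eof:
                    raise
                refill()  # the element is probably cut off at the chunk edge
                continue
            # Accept the value only when a delimiter follows it; otherwise a
            # number such as 12 could really be the start of 123.
            rest = end
            while rest < len(buf) and buf[rest] in " \t\r\n":
                rest += 1
            if rest < len(buf) and buf[rest] in ",]":
                break
            if eof:
                raise ValueError("unexpected end of input")
            refill()
        pos = end
        yield value


# Usage with a small in-memory sample instead of a multi-gigabyte export file.
sample = io.StringIO('[{"name": "a"}, {"name": "b"}]')
names = [record["name"] for record in iter_json_array(sample, chunk_size=8)]
print(names)
```

The trade-off in this sketch is extra decoding work whenever a record straddles a chunk boundary, in exchange for memory use that no longer depends on the file size.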


hammer content-import version --path /var/lib/pulp/imports/Default_Organization/rhel-7-imported/1.0/2022-03-04T17-17-09-11-00/ --organization-id 1
[................................................                                                                                                                                                           ] [24%]
Error: 1 subtask(s) failed for task group /pulp/api/v3/task-groups/ebc7514a-f606-4965-a599-eab74101e9b0/.


# /var/log/messages
pulpcore-worker-2: pulp [6eda5a34-529f-42b9-909b-fb526c9f35e0]: pulpcore.tasking.pulpcore_worker:INFO: Task d5f26164-087a-4acb-aca1-7aed30851040 failed ()
pulpcore-worker-2: pulp [6eda5a34-529f-42b9-909b-fb526c9f35e0]: pulpcore.tasking.pulpcore_worker:INFO:   File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 317, in _perform_task
pulpcore-worker-2: result = func(*args, **kwargs)
pulpcore-worker-2: File "/usr/lib/python3.6/site-packages/pulpcore/app/tasks/importer.py", line 161, in import_repository_version
pulpcore-worker-2: a_result = _import_file(os.path.join(rv_path, filename), res_class, do_raise=False)
pulpcore-worker-2: File "/usr/lib/python3.6/site-packages/pulpcore/app/tasks/importer.py", line 62, in _import_file
pulpcore-worker-2: data = Dataset().load(json_file.read(), format="json")
pulpcore-worker-2: File "/usr/lib64/python3.6/codecs.py", line 321, in decode
pulpcore-worker-2: (result, consumed) = self._buffer_decode(data, self.errors, final)


# Output of top command
16011 pulp      20   0 5924132   2.5g   2496 D  25.0 16.7  21:43.78 pulpcore-worker
16011 pulp      20   0 5924132   2.8g   2496 D   6.2 18.7  21:44.76 pulpcore-worker
16011 pulp      20   0 5924132   3.1g   2496 D  12.5 20.7  21:45.55 pulpcore-worker
16011 pulp      20   0 5924132   3.5g   2496 D  12.5 23.2  21:46.38 pulpcore-worker
16011 pulp      20   0 5924132   3.9g   2496 R  25.0 25.7  21:47.39 pulpcore-worker
16011 pulp      20   0 5924132   4.2g   2496 D  12.5 27.7  21:48.38 pulpcore-worker
16011 pulp      20   0 5924132   4.5g   2496 R  66.7 29.5  21:49.35 pulpcore-worker
16011 pulp      20   0 5924132   4.9g   2496 D  18.8 32.2  21:50.58 pulpcore-worker
16011 pulp      20   0 5924132   5.3g   2496 R  26.7 34.7  21:51.53 pulpcore-worker


# Output of free command
              total        used        free      shared  buff/cache   available
Mem:       15879256    10213472      167648      193168     5498136     5135336
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    10516964      162816      193168     5199476     4832256
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    10774956      179184      193168     4925116     4573800
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11009476      168300      193168     4701480     4339684
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11354424      179644      193168     4345188     3994460
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11653416      154880      193168     4070960     3695840
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11960864      150900      193168     3767492     3388684
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    12341972      150652      193168     3386632     3007040
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    12741088      157716      193168     2980452     2608400
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    13012212      159016      193168     2708028     2337108
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    13309296      171804      193168     2398156     2039872
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    13726688      180136      193168     1972432     1622400
              total        used        free      shared  buff/cache   available
Mem:       15879256    14151480      169480      193168     1558296     1197956
Swap:       8060924         352     8060572
 

# Test to decode the PackageResource.json file in the python console.
# python3
>>> fh = open("/var/lib/pulp/imports/Default_Organization/rhel-7-imported/1.0/2022-03-04T17-17-09-11-00/repository-Default_Organization-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server_13/pulp_rpm.app.modelresource.PackageResource.json")
>>> import json
>>> json.load(fh)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/json/__init__.py", line 296, in load
    return loads(fp.read(),
  File "/usr/lib64/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
MemoryError  <===========



# top  -o %MEM
top - 11:50:42 up 1 day, 14:42,  3 users,  load average: 0.49, 0.15, 0.09
Tasks: 134 total,   3 running, 131 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.0 us,  3.3 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  3.7 sy,  0.0 ni, 96.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  7.0 sy,  0.0 ni, 93.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 31.8 us, 68.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 99.2/19906644 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
KiB Swap:  1.0/8060924  [|                                                                                                   ]

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                        
10034 root      20   0   20.8g  18.1g   3268 R 100.0 95.4   1:23.00 python3 
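
The gap between file size and decoder memory can be observed on a small scale without a 5.5 GB file. The sketch below is an assumption-laden illustration: it builds synthetic records (the field names are made up, not taken from PackageResource.json) and uses the standard tracemalloc module to report the peak memory allocated by Python while json.loads runs. The helper name peak_decode_memory is hypothetical, and the numbers it prints will differ from the figures in this report.

```python
import json
import tracemalloc


def peak_decode_memory(text):
    """Return (decoded object, peak bytes allocated by Python while decoding)."""
    tracemalloc.start()
    try:
        obj = json.loads(text)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return obj, peak


# Synthetic stand-in for an export file; the field names are illustrative only.
rows = [
    {"name": "pkg-%d" % i, "version": "1.%d" % i, "arch": "x86_64"}
    for i in range(10000)
]
text = json.dumps(rows)

obj, peak = peak_decode_memory(text)
print("input characters:", len(text))
print("peak bytes allocated while decoding:", peak)
```

tracemalloc only counts allocations made through Python's allocator after tracing starts, so the input string itself is not included in the peak.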



Steps to Reproduce:
1. Prepare a disconnected Satellite with only 15 GB of RAM so that the issue is easy to reproduce
2. Import a content view containing only the rhel-7-server-rpms repo into the disconnected Satellite


Actual results:
The import fails with a MemoryError.

Expected results:
The import completes without error and consumes only a reasonable amount of memory.
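
One way to keep memory bounded, sketched here under the assumption that records can be imported in independent groups, is to hand the importer fixed-size batches drawn from an iterator instead of one complete list. The names batched and import_in_batches, and the import_batch callback, are hypothetical and do not come from pulpcore.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield lists of at most batch_size items taken from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def import_in_batches(records, import_batch, batch_size=1000):
    """Call import_batch once per batch and return the number of records seen.

    import_batch is a hypothetical callback standing in for whatever stores
    the records; only one batch is held in memory at a time.
    """
    total = 0
    for batch in batched(records, batch_size):
        import_batch(batch)
        total += len(batch)
    return total


# Usage: a generator stands in for records streamed from a large file.
sizes = []
count = import_in_batches(
    ({"id": i} for i in range(2500)),
    lambda batch: sizes.append(len(batch)),
    batch_size=1000,
)
print(count, sizes)
```

Peak memory in this pattern depends on batch_size and the size of a single record, not on the total number of records.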

Comment 7 pulp-infra@redhat.com 2022-06-17 12:48:28 UTC
Requesting needsinfo from upstream developer dkliban, ggainey because the 'FailedQA' flag is set.

Comment 15 pulp-infra@redhat.com 2022-06-21 18:27:46 UTC
Requesting needsinfo from upstream developer ggainey because the 'FailedQA' flag is set.

Comment 36 Daniel Alley 2022-08-29 19:02:09 UTC
This issue should technically be closed-currentrelease, should it not? I understand it was put in a different state so it could go through QE verification again, but 6.11.1.1 already has the fixed package.

Comment 37 errata-xmlrpc 2022-08-30 19:43:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.11.2 Async Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6233

