Description of problem:
Getting MemoryError when importing a large repository, such as the rhel-7-server-rpms repo. The PackageResource.json file of the rhel-7-server-rpms repo is about 5.5GB. Pulp uses the "json.load" method to decode the json file. When decoding the json, it uses up to 18GB+ of memory and returns a 5.5GB python dictionary. If the system doesn't have enough memory, the json decode fails with MemoryError.

hammer content-import version --path /var/lib/pulp/imports/Default_Organization/rhel-7-imported/1.0/2022-03-04T17-17-09-11-00/ --organization-id 1
[................................................ ] [24%]
Error: 1 subtask(s) failed for task group /pulp/api/v3/task-groups/ebc7514a-f606-4965-a599-eab74101e9b0/.

# /var/log/messages
pulpcore-worker-2: pulp [6eda5a34-529f-42b9-909b-fb526c9f35e0]: pulpcore.tasking.pulpcore_worker:INFO: Task d5f26164-087a-4acb-aca1-7aed30851040 failed ()
pulpcore-worker-2: pulp [6eda5a34-529f-42b9-909b-fb526c9f35e0]: pulpcore.tasking.pulpcore_worker:INFO: File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 317, in _perform_task
pulpcore-worker-2:   result = func(*args, **kwargs)
pulpcore-worker-2: File "/usr/lib/python3.6/site-packages/pulpcore/app/tasks/importer.py", line 161, in import_repository_version
pulpcore-worker-2:   a_result = _import_file(os.path.join(rv_path, filename), res_class, do_raise=False)
pulpcore-worker-2: File "/usr/lib/python3.6/site-packages/pulpcore/app/tasks/importer.py", line 62, in _import_file
pulpcore-worker-2:   data = Dataset().load(json_file.read(), format="json")
pulpcore-worker-2: File "/usr/lib64/python3.6/codecs.py", line 321, in decode
pulpcore-worker-2:   (result, consumed) = self._buffer_decode(data, self.errors, final)

# Output of top command
16011 pulp 20 0 5924132 2.5g 2496 D 25.0 16.7 21:43.78 pulpcore-worker
16011 pulp 20 0 5924132 2.8g 2496 D  6.2 18.7 21:44.76 pulpcore-worker
16011 pulp 20 0 5924132 3.1g 2496 D 12.5 20.7 21:45.55 pulpcore-worker
16011 pulp 20 0 5924132 3.5g 2496 D 12.5 23.2 21:46.38 pulpcore-worker
16011 pulp 20 0 5924132 3.9g 2496 R 25.0 25.7 21:47.39 pulpcore-worker
16011 pulp 20 0 5924132 4.2g 2496 D 12.5 27.7 21:48.38 pulpcore-worker
16011 pulp 20 0 5924132 4.5g 2496 R 66.7 29.5 21:49.35 pulpcore-worker
16011 pulp 20 0 5924132 4.9g 2496 D 18.8 32.2 21:50.58 pulpcore-worker
16011 pulp 20 0 5924132 5.3g 2496 R 26.7 34.7 21:51.53 pulpcore-worker

# Output of free command
              total        used        free      shared  buff/cache   available
Mem:       15879256    10213472      167648      193168     5498136     5135336
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    10516964      162816      193168     5199476     4832256
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    10774956      179184      193168     4925116     4573800
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11009476      168300      193168     4701480     4339684
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11354424      179644      193168     4345188     3994460
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11653416      154880      193168     4070960     3695840
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    11960864      150900      193168     3767492     3388684
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    12341972      150652      193168     3386632     3007040
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    12741088      157716      193168     2980452     2608400
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    13012212      159016      193168     2708028     2337108
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    13309296      171804      193168     2398156     2039872
Swap:       8060924         352     8060572
              total        used        free      shared  buff/cache   available
Mem:       15879256    13726688      180136      193168     1972432     1622400
              total        used        free      shared  buff/cache   available
Mem:       15879256    14151480      169480      193168     1558296     1197956
Swap:       8060924         352     8060572

# Test to decode the PackageResource.json file in the python console.
# python3
>>> fh = open("/var/lib/pulp/imports/Default_Organization/rhel-7-imported/1.0/2022-03-04T17-17-09-11-00/repository-Default_Organization-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server_13/pulp_rpm.app.modelresource.PackageResource.json")
>>> import json
>>> json.load(fh)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.6/json/__init__.py", line 296, in load
    return loads(fp.read(),
  File "/usr/lib64/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
MemoryError   <===========

# top -o %MEM
top - 11:50:42 up 1 day, 14:42, 3 users, load average: 0.49, 0.15, 0.09
Tasks: 134 total, 3 running, 131 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0.0 us, 3.3 sy, 0.0 ni, 96.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 3.7 sy, 0.0 ni, 96.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 7.0 sy, 0.0 ni, 93.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 31.8 us, 68.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 99.2/19906644 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
KiB Swap: 1.0/8060924 [| ]

  PID USER PR NI  VIRT   RES  SHR S  %CPU %MEM   TIME+ COMMAND
10034 root 20  0 20.8g 18.1g 3268 R 100.0 95.4 1:23.00 python3

Steps to Reproduce:
1. Prepare a disconnected Satellite with only 15GB RAM to easily reproduce the issue
2. Import a content view with only the rhel-7-server-rpms repo to the disconnected Satellite

Actual results:
Failed with memory error

Expected results:
No error, and the import should consume only a reasonable amount of memory.
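
For illustration only: the tracebacks above show the whole file being read with a single read() call and decoded in one pass, so peak memory scales with the file size. A minimal sketch of the alternative, decoding one record at a time with the standard library's json.JSONDecoder.raw_decode, might look like the following. It assumes the export file is a top-level JSON array whose elements are all JSON objects; that layout, the helper name iter_json_array, and the sample data are assumptions made for this example, and this is not necessarily the change shipped in the fixed package.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_array(fh: TextIO, chunk_size: int = 65536) -> Iterator[dict]:
    """Yield the objects of a top-level JSON array one at a time.

    Hypothetical helper: only arrays of JSON objects are supported, and
    separators between elements are checked leniently.
    """
    decoder = json.JSONDecoder()
    buf = ""          # text read from the file but not yet decoded
    started = False   # True once the opening "[" has been consumed
    eof = False       # True once the file has returned no more data
    while True:
        buf = buf.lstrip()
        if buf and not started:
            if buf[0] != "[":
                raise ValueError("expected a top-level JSON array")
            buf = buf[1:]
            started = True
            continue
        if buf and started:
            if buf[0] == ",":
                buf = buf[1:]
                continue
            if buf[0] == "]":
                return
            if buf[0] != "{":
                raise ValueError("this sketch only handles arrays of objects")
            try:
                item, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                # Most likely the object is cut off at the chunk boundary;
                # fall through and read more, unless the input is exhausted.
                if eof:
                    raise
            else:
                yield item
                buf = buf[end:]
                continue
        if eof:
            raise ValueError("unexpected end of JSON input")
        chunk = fh.read(chunk_size)
        if chunk:
            buf += chunk
        else:
            eof = True


if __name__ == "__main__":
    # Made-up sample data standing in for a large export file.
    sample = io.StringIO('[{"name": "bash"}, {"name": "zsh"}]')
    for row in iter_json_array(sample, chunk_size=8):
        print(row["name"])
```

With this approach only the current chunk and the record being decoded are held in memory, provided the caller processes each record and does not collect them all into a list. Third-party streaming parsers exist for the same purpose and would likely be a better fit in production code.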
Requesting needsinfo from upstream developer dkliban, ggainey because the 'FailedQA' flag is set.
Requesting needsinfo from upstream developer ggainey because the 'FailedQA' flag is set.
This issue should technically be closed-currentrelease, should it not? I understand it was moved to a different state to go through QE verification again, but 6.11.1.1 already has the fixed package.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Satellite 6.11.2 Async Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:6233
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.