Description of problem:
Katello::Errors::Pulp3Error: invalid memory alloc request size while syncing third-party repos

Version-Release number of selected component (if applicable):
satellite-6.10.0-0.9.beta

How reproducible:
New installation of satellite 6.10 Beta

Steps to Reproduce:
1. Install new satellite 6.10 Beta
2. Configure custom product with https://packages.microsoft.com/rhel/8/prod/
3. Start syncing

Actual results:
Katello::Errors::Pulp3Error: invalid memory alloc request size 1073741824

Expected results:

Additional info:
{"pulp_tasks"=>
  [{"pulp_href"=>"/pulp/api/v3/tasks/6142627e-7c88-4617-a3be-0819984ff30d/",
    "pulp_created"=>"2021-09-28T06:07:28.972+00:00",
    "state"=>"failed",
    "name"=>"pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid"=>"55bc4cb4-3824-4de3-9e93-25abcbe9f6a5",
    "started_at"=>"2021-09-28T06:07:29.119+00:00",
    "finished_at"=>"2021-09-28T06:21:39.754+00:00",
    "error"=>
     {"traceback"=>
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 272, in _perform_task\n" +
       "    result = func(*args, **kwargs)\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 475, in synchronize\n" +
       "    version = dv.create()\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 151, in create\n" +
       "    loop.run_until_complete(pipeline)\n" +
       "  File \"/usr/lib64/python3.6/asyncio/base_events.py\", line 484, in run_until_complete\n" +
       "    return future.result()\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py\", line 225, in create_pipeline\n" +
       "    await asyncio.gather(*futures)\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py\", line 43, in __call__\n" +
       "    await self.run()\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/content_stages.py\", line 113, in run\n" +
       "    d_content.content.save()\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/app/models/base.py\", line 149, in save\n" +
       "    return super().save(*args, **kwargs)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django_lifecycle/mixins.py\", line 134, in save\n" +
       "    save(*args, **kwargs)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/base.py\", line 744, in save\n" +
       "    force_update=force_update, update_fields=update_fields)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/base.py\", line 782, in save_base\n" +
       "    force_update, using, update_fields,\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/base.py\", line 873, in _save_table\n" +
       "    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/base.py\", line 911, in _do_insert\n" +
       "    using=using, raw=raw)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/manager.py\", line 82, in manager_method\n" +
       "    return getattr(self.get_queryset(), name)(*args, **kwargs)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/query.py\", line 1186, in _insert\n" +
       "    return query.get_compiler(using=using).execute_sql(return_id)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py\", line 1377, in execute_sql\n" +
       "    cursor.execute(sql, params)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/backends/utils.py\", line 67, in execute\n" +
       "    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/backends/utils.py\", line 76, in _execute_with_wrappers\n" +
       "    return executor(sql, params, many, context)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/backends/utils.py\", line 84, in _execute\n" +
       "    return self.cursor.execute(sql, params)\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/utils.py\", line 89, in __exit__\n" +
       "    raise dj_exc_value.with_traceback(traceback) from exc_value\n" +
       "  File \"/usr/lib/python3.6/site-packages/django/db/backends/utils.py\", line 84, in _execute\n" +
       "    return self.cursor.execute(sql, params)\n",
      "description"=>"invalid memory alloc request size 1073741824\n"},
    "worker"=>"/pulp/api/v3/workers/a938363f-03b0-4940-baa9-4a8be9979658/",
    "child_tasks"=>[],
    "progress_reports"=>
     [{"message"=>"Downloading Metadata Files",
       "code"=>"sync.downloading.metadata",
       "state"=>"completed",
       "done"=>8},
      {"message"=>"Downloading Artifacts",
       "code"=>"sync.downloading.artifacts",
       "state"=>"completed",
       "done"=>154},
      {"message"=>"Associating Content",
       "code"=>"associating.content",
       "state"=>"canceled",
       "done"=>0},
      {"message"=>"Parsed Packages",
       "code"=>"sync.parsing.packages",
       "state"=>"completed",
       "total"=>154,
       "done"=>154}],
    "created_resources"=>[],
    "reserved_resources_record"=>
     ["/pulp/api/v3/repositories/rpm/rpm/53932542-5096-4944-a6cc-5a0f3b8f7a5a/",
      "/pulp/api/v3/remotes/rpm/rpm/beb2e274-f21f-4beb-a1f8-9ba517c056bd/"]}],
 "create_version"=>true,
 "task_groups"=>[],
 "poll_attempts"=>{"total"=>52, "failed"=>1}}
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
This is not a problem with Pulp / Satellite per se. This repo contains packages whose metadata has *millions* of duplicated file entries: the same set of files is listed nearly 13 million times for one particular package and nearly 3 million times for another. The duplicate file listings inflate the metadata so much that it exceeds the amount of data that can be inserted into PostgreSQL in a single insert (the 1073741824 bytes in the error message is exactly 1 GiB). I've now reported it upstream to Microsoft and closed the associated Pulp issues. https://github.com/dotnet/core/issues/6706#issuecomment-986330681
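For anyone who wants to check a repository for this kind of duplication before syncing, here is a minimal sketch in Python. It assumes the standard rpm-md filelists.xml layout (`<package>` elements containing `<file>` children in the `http://linux.duke.edu/metadata/filelists` namespace) and that the file has already been downloaded and decompressed. The function name and the sample package names are hypothetical, and this is not the tooling used for the analysis above.

```python
import io
import xml.etree.ElementTree as ET

# Namespace used by rpm-md filelists.xml documents.
FILELISTS_NS = "{http://linux.duke.edu/metadata/filelists}"


def duplicate_file_report(stream):
    """Return {package name: (total file entries, unique paths)}
    for every package that lists at least one path more than once."""
    report = {}
    # iterparse streams the document so the whole tree is never held at once.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != FILELISTS_NS + "package":
            continue
        paths = [f.text for f in elem.findall(FILELISTS_NS + "file")]
        unique = len(set(paths))
        if unique < len(paths):
            report[elem.get("name")] = (len(paths), unique)
        elem.clear()  # drop this package's children to keep memory bounded
    return report


# Tiny made-up sample; the package names are purely illustrative.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="2">
  <package pkgid="aaa" name="hypothetical-dup" arch="x86_64">
    <version epoch="0" ver="1.0" rel="1"/>
    <file>/usr/bin/tool</file>
    <file>/usr/bin/tool</file>
    <file>/usr/bin/tool</file>
    <file>/usr/share/doc/tool/README</file>
  </package>
  <package pkgid="bbb" name="hypothetical-clean" arch="x86_64">
    <version epoch="0" ver="1.0" rel="1"/>
    <file>/usr/bin/other</file>
  </package>
</filelists>
"""

if __name__ == "__main__":
    print(duplicate_file_report(io.BytesIO(SAMPLE)))
```

On a real repository, pass an opened (decompressed) filelists file instead of the sample. A package whose total entry count is far above its unique path count is a likely cause of this failure.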
The Pulp upstream bug status is at CLOSED - NOTABUG. Updating the external tracker on this bug.
As of today it appears that Microsoft has fixed their repository, so Pulp / Satellite should now be able to sync it.