Bug 2051970 - pulp2to3 migration fails to migrate docker_blob content due to aggregate mongo 100M limit
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.9.8
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: 6.9.9
Assignee: satellite6-bugs
QA Contact: Akhil Jha
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-08 12:57 UTC by Pavel Moravec
Modified: 2022-11-30 20:03 UTC (History)

Fixed In Version: pulp-2to3-migration-0.11.10
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-20 20:34:53 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | pulp pulp-2to3-migration issues 511 | 0 | None | closed | pulp2to3 migration fails to migrate docker_blob content due to aggregate mongo 100M limit | 2023-03-01 19:16:25 UTC
Red Hat Knowledge Base (Solution) | 6716081 | 0 | None | None | None | 2022-02-09 10:21:52 UTC
Red Hat Product Errata | RHSA-2022:1478 | 0 | None | None | None | 2022-04-20 20:35:01 UTC

Description Pavel Moravec 2022-02-08 12:57:55 UTC
Description of problem:
With many docker_blobs present, the pulp2to3 migration fails with:

    Processing Pulp2 repositories, importers, distributors 3253/3283
    Pre-migrating Pulp 2 docker_blob content 10000/25670
    Initial Migration steps complete. Migration failed, You will want to investigate: https://satellite.example.com/foreman_tasks/tasks/fd24355e-9fea-4aab-8a49-4751e9dfb9ab
    rake aborted!
    ForemanTasks::TaskError: Task fd24355e-9fea-4aab-8a49-4751e9dfb9ab: Katello::Errors::Pulp3Error: Sort exceeded memory limit of 104857600 bytes

The reason is that an aggregate call asks mongo to sort more data in memory than mongo's limit allows (104857600 bytes, i.e. 100MB), and the call does not opt in to external (on-disk) sorting.



Version-Release number of selected component (if applicable):
Sat6.9


How reproducible:
100% in a scaled environment


Steps to Reproduce:
1. Sync 25k+ docker blobs to Sat6.9
2. Run pulp2to3 pre-migration

Actual results:
Step 2 fails with the above error, and /var/log/messages contains this backtrace:

Feb  8 10:44:06 satellite pulpcore-worker-1: pulp: rq.worker:ERROR: Traceback (most recent call last):
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/rq/worker.py", line 936, in perform_job
Feb  8 10:44:06 satellite pulpcore-worker-1: rv = job.perform()
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/rq/job.py", line 684, in perform
Feb  8 10:44:06 satellite pulpcore-worker-1: self._result = self._execute()
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/rq/job.py", line 690, in _execute
Feb  8 10:44:06 satellite pulpcore-worker-1: return self.func(*self.args, **self.kwargs)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/tasks/migrate.py", line 77, in migrate_from_pulp2
Feb  8 10:44:06 satellite pulpcore-worker-1: pre_migrate_all_content(plan)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/pre_migration.py", line 70, in pre_migrate_all_content
Feb  8 10:44:06 satellite pulpcore-worker-1: pre_migrate_content_type(content_model, mutable_type, lazy_type, premigrate_hook)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/pre_migration.py", line 124, in pre_migrate_content_type
Feb  8 10:44:06 satellite pulpcore-worker-1: pulp2_content_ids = premigrate_hook()
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/docker/utils.py", line 17, in find_tags
Feb  8 10:44:06 satellite pulpcore-worker-1: result = pulp2_models.Tag.objects.aggregate([sort_stage, group_stage1, group_stage2])
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib/python3.6/site-packages/mongoengine/queryset/base.py", line 1318, in aggregate
Feb  8 10:44:06 satellite pulpcore-worker-1: return collection.aggregate(final_pipeline, cursor={}, **kwargs)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/collection.py", line 2458, in aggregate
Feb  8 10:44:06 satellite pulpcore-worker-1: **kwargs)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/collection.py", line 2377, in _aggregate
Feb  8 10:44:06 satellite pulpcore-worker-1: retryable=not cmd._performs_write)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/mongo_client.py", line 1471, in _retryable_read
Feb  8 10:44:06 satellite pulpcore-worker-1: return func(session, server, sock_info, slave_ok)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/aggregation.py", line 148, in get_cursor
Feb  8 10:44:06 satellite pulpcore-worker-1: user_fields=self._user_fields)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/pool.py", line 694, in command
Feb  8 10:44:06 satellite pulpcore-worker-1: exhaust_allowed=exhaust_allowed)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/network.py", line 162, in command
Feb  8 10:44:06 satellite pulpcore-worker-1: parse_write_concern_error=parse_write_concern_error)
Feb  8 10:44:06 satellite pulpcore-worker-1: File "/usr/lib64/python3.6/site-packages/pymongo/helpers.py", line 168, in _check_command_response
Feb  8 10:44:06 satellite pulpcore-worker-1: max_wire_version)
Feb  8 10:44:06 satellite pulpcore-worker-1: pymongo.errors.OperationFailure: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in., full error: {'ok': 0.0, 'errmsg': 'Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.', 'code': 16819, 'codeName': 'Location16819'}
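To spot this failure when scanning logs, something like the following could be used. It is only a sketch: the helper name `sort_limit_bytes` and the regular expression are made up for illustration, and only the message text is taken from the backtrace above.

```python
import re

# Hypothetical helper, not part of pulp-2to3-migration: recognise the
# MongoDB sort-limit failure in a log line and pull out the limit.
SORT_LIMIT_RE = re.compile(r"Sort exceeded memory limit of (\d+) bytes")


def sort_limit_bytes(log_line):
    """Return the reported limit in bytes, or None if the line is unrelated."""
    match = SORT_LIMIT_RE.search(log_line)
    return int(match.group(1)) if match else None


line = (
    "pymongo.errors.OperationFailure: Sort exceeded memory limit of "
    "104857600 bytes, but did not opt in to external sorting."
)
limit = sort_limit_bytes(line)
limit_mib = limit // (1024 * 1024)  # convert bytes to MiB
print(limit, limit_mib)
```

The extracted value matches the 100MB limit mentioned in the description (104857600 = 100 * 1024 * 1024).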


Expected results:
Clean pre-migration.

Additional info:
I guess line 17 of /usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/docker/utils.py should be:

result = pulp2_models.Tag.objects.aggregate([sort_stage, group_stage1, group_stage2], allowDiskUse=True)
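A minimal sketch of the idea, runnable without a mongod: the wrapper name, the `RecordingCollection` stand-in and the pipeline stages below are all hypothetical (the real `sort_stage` / `group_stage*` definitions live in utils.py and are not shown in this report). The only thing taken from the error message is the `allowDiskUse` option, which pymongo's `Collection.aggregate` accepts as a keyword argument.

```python
def aggregate_allowing_disk_use(collection, pipeline):
    """Run an aggregation, opting in to external (on-disk) sorting."""
    # allowDiskUse=True is the opt-in the server error message asks for.
    return collection.aggregate(pipeline, allowDiskUse=True)


# Hypothetical stages for illustration only; field names are assumptions.
sort_stage = {"$sort": {"_last_updated": -1}}
group_stage = {"$group": {"_id": "$name", "newest": {"$first": "$_id"}}}
pipeline = [sort_stage, group_stage]


class RecordingCollection:
    """Stand-in for a pymongo Collection that records how it was called."""

    def __init__(self):
        self.calls = []

    def aggregate(self, pipeline, **kwargs):
        self.calls.append((pipeline, kwargs))
        return iter([])  # a real collection would return a cursor


fake = RecordingCollection()
results = list(aggregate_allowing_disk_use(fake, pipeline))
print(fake.calls[0][1])
```

Per the backtrace, mongoengine's `aggregate` forwards `**kwargs` to `collection.aggregate(...)`, so the keyword passed on the queryset should reach the server the same way.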

Comment 5 Grant Gainey 2022-02-22 16:56:50 UTC
Upstream issue merged and backported to 0.11:

main: https://github.com/pulp/pulp-2to3-migration/pull/512
0.11: https://github.com/pulp/pulp-2to3-migration/pull/514

Comment 13 Pavel Moravec 2022-04-06 14:15:56 UTC
On the system prepared by jhutar++ :

# rpm -q python3-pulp-2to3-migration
python3-pulp-2to3-migration-0.11.10-1.el7pc.noarch
#

# mongo pulp_database --eval "db.units_docker_blob.count()"
MongoDB shell version v3.4.9
connecting to: mongodb://127.0.0.1:27017/pulp_database
MongoDB server version: 3.4.9
65139
#

All content was migrated successfully:

# satellite-maintain content prepare
..
2022-04-06 07:39:11 -0400: Pre-migrating Pulp 2 docker_blob content (general info) 64950/65139
..
2022-04-06 07:41:21 -0400: Pre-migrating Pulp 2 docker_manifest content (detail info) 33400/37867
..
2022-04-06 07:42:21 -0400: Pre-migrating Pulp 2 docker_tag content (general info) 17250/21793
..
2022-04-06 08:47:55 -0400: Migrating docker_blob content to Pulp 3 64962/65139
..
2022-04-06 08:54:16 -0400: Migrating docker_manifest content to Pulp 3 37001/37867
..
2022-04-06 08:55:36 -0400: Migrating docker_tag content to Pulp 3 17017/21793
..
2022-04-06 09:04:56 -0400: Importing migrated content type docker_manifest: 23625/23966
..
2022-04-06 09:09:57 -0400: Importing migrated content type docker_tag: 23345/23635
Content Migration completed successfully


So all 65k docker blobs migrated successfully => BZ verified in my eyes.

Comment 14 Akhil Jha 2022-04-07 08:38:51 UTC
Verified.
Satellite 6.9.9-1.0.

Steps:
Synced a number of docker blobs (~27k) on the machine mentioned above.
Ran `satellite-maintain content prepare`.

Migration was successful.

Comment 18 errata-xmlrpc 2022-04-20 20:34:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.9.9 Async Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1478

