Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2149138

Summary: Getting "Artifact matching query does not exist." error when importing contents to the disconnected Satellite.
Product: Red Hat Satellite
Reporter: Hao Chang Yu <hyu>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED CURRENTRELEASE
QA Contact: Satellite QE Team <sat-qe-bz-list>
Severity: medium
Priority: unspecified
Version: 6.11.3
CC: avnkumar, dalley, dkliban, ggainey, jalviso, rchan, saydas
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2023-09-09 14:37:08 UTC
Type: Bug

Description Hao Chang Yu 2022-11-29 01:15:29 UTC
Description of problem:
This issue appears to affect only module stream artifacts. If an existing module stream has been changed or updated in the upstream repo, syncing the repository does not increase the repository version. Pulp silently creates a new artifact and associates it with the existing module stream content ID.

Since the repository version didn't change, the incremental export can't detect any changes in the repository, so it doesn't include the updated module stream artifact in the exported tarball.

When importing the exported incremental tarball, the user gets the following error in /var/log/messages:
-------------------------
pulpcore-worker-2: pulp [db0b189e-54a4-4cdb-ac88-00ad05bc80d5]: pulpcore.tasking.pulpcore_worker:INFO: Task 4e32a09e-25f2-4771-9b0b-d13b69b934ce failed (Artifact matching query does not exist.)
pulpcore-worker-2: pulp [db0b189e-54a4-4cdb-ac88-00ad05bc80d5]: pulpcore.tasking.pulpcore_worker:INFO:   File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/pulpcore/tasking/pulpcore_worker.py", line 380, in _perform_task
pulpcore-worker-2: result = func(*args, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/pulpcore/app/tasks/importer.py", line 250, in import_repository_version
pulpcore-worker-2: for a_batch in _import_file(ca_path, ContentArtifactResource, retry=True):
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/pulpcore/app/tasks/importer.py", line 130, in _import_file
pulpcore-worker-2: a_result = resource.import_data(data, raise_errors=True)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/resources.py", line 757, in import_data
pulpcore-worker-2: return self.import_data_inner(dataset, dry_run, raise_errors, using_transactions, collect_failed_rows, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/resources.py", line 805, in import_data_inner
pulpcore-worker-2: raise row_result.errors[-1].error
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/resources.py", line 673, in import_row
pulpcore-worker-2: self.import_obj(instance, row, dry_run, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/resources.py", line 524, in import_obj
pulpcore-worker-2: self.import_field(field, obj, data, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/resources.py", line 507, in import_field
pulpcore-worker-2: field.save(obj, data, is_m2m, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/fields.py", line 110, in save
pulpcore-worker-2: cleaned = self.clean(data, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/fields.py", line 66, in clean
pulpcore-worker-2: value = self.widget.clean(value, row=data, **kwargs)
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/import_export/widgets.py", line 406, in clean
pulpcore-worker-2: return self.get_queryset(value, row, *args, **kwargs).get(**{self.field: val})
pulpcore-worker-2: File "/opt/theforeman/tfm-pulpcore/root/usr/lib/python3.8/site-packages/django/db/models/query.py", line 435, in get
-------------------------


Steps to Reproduce:
On the connected Satellite:
1. Reposync "Red Hat Directory Server 11 for RHEL 8 x86_64" to the Satellite (RHEL 8)'s "/var/www/html/pub/dirsrv-11-for-rhel-8-x86_64-rpms" directory.
~~~
reposync -p /root/rhel_directory_repo --download-metadata --repo=dirsrv-11-for-rhel-8-x86_64-rpms
~~~

2. Create a custom repository whose upstream URL points to the pub directory, e.g. http://satellite.fqdn/pub/dirsrv-11-for-rhel-8-x86_64-rpms
3. Sync the repository.
4. Perform complete export of the repository.
~~~
hammer content-export complete repository --id <repo id> --organization-id 1
~~~

5. cd to the "/var/www/html/pub/dirsrv-11-for-rhel-8-x86_64-rpms/repodata" directory. Unzip the xxxxxxxxxxxxxxxxxxxxxx-modules.yaml.gz.
6. Edit the xxxxxxxxxxxxxxxxxxxxxx-modules.yaml file, modify the description of any module stream, and save it.
7. Run the following commands to record the checksum and file size of the modified "xxxxxxxxxxxxxxxxxxxxxx-modules.yaml" file.
~~~
sha256sum xxxxxxxxxxxxxxxxxxxxxx-modules.yaml
ls -l xxxxxxxxxxxxxxxxxxxxxx-modules.yaml
~~~

8. Gzip the "xxxxxxxxxxxxxxxxxxxxxx-modules.yaml" file and record the checksum and file size of the resulting .gz file.
~~~
gzip xxxxxxxxxxxxxxxxxxxxxx-modules.yaml
sha256sum xxxxxxxxxxxxxxxxxxxxxx-modules.yaml.gz
ls -l xxxxxxxxxxxxxxxxxxxxxx-modules.yaml.gz
~~~

9. Edit "repomd.xml" file. Increase the "<revision>" by 1 and modify the '<data type="modules">' section with the recorded info above. Save the file.
10. Sync custom repository again to the Satellite.
11. Perform incremental export of the repository.
~~~
hammer content-export incremental repository --id <repo id> --organization-id 1
~~~

12. Copy all the exported files to the disconnected Satellite.


On the disconnected Satellite:
13. Import the complete export first, e.g.
~~~
hammer content-import repository --organization-id 1 --path /var/lib/pulp/imports/redhat/Export-rhel_directory_11-464/1.0/2022-11-28T17-08-25-10-00
~~~
14. After that, import the incremental export, e.g.
~~~
hammer content-import repository --organization-id 1 --path /var/lib/pulp/imports/redhat/Export-rhel_directory_11-464/2.0/2022-11-28T17-21-55-10-00
~~~
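
Steps 7 to 9 on the connected Satellite can also be scripted. The following is only a minimal sketch: the helper names `sha256_of` and `repomd_fields` are made up for illustration, and it assumes the '<data type="modules">' entry in repomd.xml carries the usual `checksum`, `open-checksum`, `size` and `open-size` values. Unlike the `gzip` command, it keeps the uncompressed file in place.

```python
import gzip
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repomd_fields(yaml_path):
    """Gzip `yaml_path` next to itself and collect the values to paste into repomd.xml.

    Hypothetical helper: the dictionary keys mirror the element names expected
    in the '<data type="modules">' section.
    """
    gz_path = yaml_path + ".gz"
    with open(yaml_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        dst.write(src.read())
    return {
        "open-checksum": sha256_of(yaml_path),     # digest of the edited YAML
        "open-size": os.path.getsize(yaml_path),   # size of the edited YAML
        "checksum": sha256_of(gz_path),            # digest of the .gz file
        "size": os.path.getsize(gz_path),          # size of the .gz file
    }
```

Calling `repomd_fields("xxxxxxxxxxxxxxxxxxxxxx-modules.yaml")` in the repodata directory returns the four values; the "<revision>" bump in step 9 still has to be done by hand.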

Actual results:
# Hammer output:
Error: 1 subtask(s) failed for task group /pulp/api/v3/task-groups/19832237-33a9-4562-a9ed-dc0263e1c64f/.

# In /var/log/messages:
pulpcore-worker-2: pulp [db0b189e-54a4-4cdb-ac88-00ad05bc80d5]: pulpcore.tasking.pulpcore_worker:INFO: Task 4e32a09e-25f2-4771-9b0b-d13b69b934ce failed (Artifact matching query does not exist.)


Expected results:
Repository can be imported successfully.


Additional info:
I am not sure how often existing module streams are modified or updated in the real world.

Comment 1 Hao Chang Yu 2022-11-29 01:26:57 UTC
Comparing the "pulpcore.app.modelresource.ContentArtifactResource.json" files of the two exports, we can see that the artifact has changed but the content ID remains the same.

Export v1.0
~~~
# pulpcore.app.modelresource.ContentArtifactResource.json
    {
        "artifact": "7cdd95237c2737c55bb169ef7bda9b0e9ad55e1d63f46a46ddba647b2a8a647c",
        "content": "ca71a0f2-1c5f-4662-88b7-9df52838c191",
        "relative_path": "redhat-ds11802002020042814185451c5a973x86_64snippet"
    },
~~~

Export v2.0
~~~
# pulpcore.app.modelresource.ContentArtifactResource.json
    {
        "artifact": "faa29c4c34db629d4d5abc802fe39d9170d88b2baf0c076e7611934c6667a0b0",
        "content": "ca71a0f2-1c5f-4662-88b7-9df52838c191",
        "relative_path": "redhat-ds11802002020042814185451c5a973x86_64snippet"
    },

# cat pulpcore.app.modelresource.ArtifactResource.json
[]
~~~
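
The same comparison can be automated over the two export files. This is a rough sketch only: the function names are hypothetical, the row layout is the one shown above, and the `sha256` field name for rows in ArtifactResource.json is an assumption (the file is empty in this export, so it does not matter here).

```python
def changed_artifacts(old_rows, new_rows):
    """List content units that kept their ID and path but point to a new artifact."""
    old = {(row["content"], row["relative_path"]): row["artifact"] for row in old_rows}
    changed = []
    for row in new_rows:
        previous = old.get((row["content"], row["relative_path"]))
        if previous is not None and previous != row["artifact"]:
            changed.append({
                "content": row["content"],
                "relative_path": row["relative_path"],
                "old_artifact": previous,
                "new_artifact": row["artifact"],
            })
    return changed


def unshipped(changes, artifact_rows, digest_field="sha256"):
    """Return the changes whose new artifact is absent from the exported artifact list.

    `digest_field` is an assumed field name for ArtifactResource.json rows.
    """
    shipped = {row.get(digest_field) for row in artifact_rows}
    return [change for change in changes if change["new_artifact"] not in shipped]


# Rows copied from the v1.0 and v2.0 exports above.
V1 = [{
    "artifact": "7cdd95237c2737c55bb169ef7bda9b0e9ad55e1d63f46a46ddba647b2a8a647c",
    "content": "ca71a0f2-1c5f-4662-88b7-9df52838c191",
    "relative_path": "redhat-ds11802002020042814185451c5a973x86_64snippet",
}]
V2 = [{
    "artifact": "faa29c4c34db629d4d5abc802fe39d9170d88b2baf0c076e7611934c6667a0b0",
    "content": "ca71a0f2-1c5f-4662-88b7-9df52838c191",
    "relative_path": "redhat-ds11802002020042814185451c5a973x86_64snippet",
}]

changes = changed_artifacts(V1, V2)
missing = unshipped(changes, [])  # ArtifactResource.json is [] in export v2.0
for change in missing:
    print(change["relative_path"], "references an artifact that was not exported")
```

In practice the three lists would be loaded with `json.load()` from the extracted export directories.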

Comment 2 Hao Chang Yu 2022-12-01 00:35:39 UTC
After comparing a few affected module streams, the only change seems to be that Pulp previously treated "stream" as an integer and now treats it as a string.


# diff -u redhat-ds118010020191031235358e747bfc9x86_64snippet.old  redhat-ds118010020191031235358e747bfc9x86_64snippet.new
--- redhat-ds118010020191031235358e747bfc9x86_64snippet.old	2022-12-01 10:23:52.395807029 +1000
+++ redhat-ds118010020191031235358e747bfc9x86_64snippet.new	2022-12-01 10:24:08.176704421 +1000
@@ -3,7 +3,7 @@
 version: 2
 data:
   name: redhat-ds
-  stream: 11
+  stream: "11"
   version: 8010020191031235358
   context: e747bfc9
   arch: x86_64


# diff -u redhat-ds11802002021021017510051c5a973x86_64snippet.old  redhat-ds11802002021021017510051c5a973x86_64snippet.new
--- redhat-ds11802002021021017510051c5a973x86_64snippet.old	2022-12-01 10:27:46.953281930 +1000
+++ redhat-ds11802002021021017510051c5a973x86_64snippet.new	2022-12-01 10:28:07.280149764 +1000
@@ -3,7 +3,7 @@
 version: 2
 data:
   name: redhat-ds
-  stream: 11
+  stream: "11"
   version: 8020020210210175100
   context: 51c5a973
   arch: x86_64


# diff -u redhat-ds11804002021032614371545c09202x86_64snippet.old  redhat-ds11804002021032614371545c09202x86_64snippet.new
--- redhat-ds11804002021032614371545c09202x86_64snippet.old	2022-12-01 10:30:54.279063666 +1000
+++ redhat-ds11804002021032614371545c09202x86_64snippet.new	2022-12-01 10:31:09.919961895 +1000
@@ -3,7 +3,7 @@
 version: 2
 data:
   name: redhat-ds
-  stream: 11
+  stream: "11"
   version: 8040020210326143715
   context: 45c09202
   arch: x86_64
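
To illustrate why the quoting alone matters: assuming the artifact is identified by the SHA-256 of the snippet file (the 64-character hex values in ContentArtifactResource.json suggest this), the two renderings hash differently even though name, stream, version, context and arch are identical. The snippets below are truncated to the header fields from the first diff, so the digests are not the real artifact checksums, and `nsvca` is a hypothetical helper.

```python
import hashlib
import re

OLD_SNIPPET = (
    "---\ndocument: modulemd\nversion: 2\ndata:\n"
    "  name: redhat-ds\n  stream: 11\n  version: 8010020191031235358\n"
    "  context: e747bfc9\n  arch: x86_64\n"
)
NEW_SNIPPET = OLD_SNIPPET.replace("stream: 11", 'stream: "11"')


def digest(text):
    """SHA-256 of the snippet text, standing in for the artifact checksum."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def nsvca(text):
    """Extract (name, stream, version, context, arch), ignoring YAML quoting."""
    values = []
    for key in ("name", "stream", "version", "context", "arch"):
        # Two-space indent selects the fields under "data:", not the top-level "version: 2".
        match = re.search(rf"^  {key}: (.+)$", text, flags=re.MULTILINE)
        values.append(match.group(1).strip().strip('"'))
    return tuple(values)


same_identity = nsvca(OLD_SNIPPET) == nsvca(NEW_SNIPPET)
same_bytes = digest(OLD_SNIPPET) == digest(NEW_SNIPPET)
print("same module stream:", same_identity)
print("same artifact digest:", same_bytes)
```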

Comment 3 Hao Chang Yu 2022-12-01 01:20:45 UTC
I can reproduce the integer "stream" by syncing the "Red Hat Directory Server 11 for RHEL 8 x86_64" repository in Satellite 6.10.

I think the repositories with module streams were originally synced in Satellite 6.10 and then synced again after the upgrade to Satellite 6.11, which converted the stream to a string.

Comment 4 Hao Chang Yu 2022-12-01 02:44:54 UTC
I think this has something to do with the different libmodulemd versions used in Satellite 6.10 and 6.11.

# Satellite 6.10
libmodulemd2-2.9.3-1.el7pc.x86_64

# Satellite 6.11
libmodulemd2-2.12.1-1.el7pc.x86_64

# Satellite 6.13
libmodulemd-2.13.0-1.el8.x86_64


Tests:
# Satellite 6.10 dumps stream to YAML as integer.
-------------------------------------
# PULP_SETTINGS=/etc/pulp/settings.py pulpcore-manager shell
Python 3.6.8 (default, Aug 13 2020, 07:46:32) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import gi
>>> gi.require_version("Modulemd", "2.0")
>>> from gi.repository import Modulemd as mmdlib  # noqa: E402
>>> modulemd_index = mmdlib.ModuleIndex.new()
>>> modulemd_index.update_from_file("/tmp/aaff7205bcb4f728fcae3876704f2896c0b9c1c5ef202928247d7f44289bfa37-modules.yaml", True)
(True, failures=[])
>>> modulemd_index.get_module_names()
['redhat-ds']
>>> streams = modulemd_index.get_module('redhat-ds').get_all_streams()
>>> temp_index = mmdlib.ModuleIndex.new()
>>> temp_index.add_module_stream(streams[0])
True
>>> temp_index.dump_to_string()
'---\ndocument: modulemd\nversion: 2\ndata:\n  name: redhat-ds\n  stream: 11\n  version: 8020020200428141854\n  context: 51c5a973\n  arch: x86_64\n  summary: Red Hat Directory Server 11.1\n  description: >-\n    The Red Hat Directory Server is an LDAPv3 compliant server.\n  license:\n    module:\n    - MIT\n    content:\n    - GPLv3+\n  dependencies:\n  - buildrequires:\n      build: [rhel-8.2.0-dirsrv-11.1]\n      platform: [el8.2.0.z]\n    requires:\n      platform: [el8]\n  profiles:\n    default:\n      rpms:\n      - 389-ds-base\n      - cockpit-389-ds\n    legacy:\n      rpms:\n      - 389-ds-base\n      - 389-ds-base-legacy-tools\n      - cockpit-389-ds\n    minimal:\n      rpms:\n      - 389-ds-base\n  components:\n    rpms:\n      389-ds-base:\n        rationale: Package in api\n        ref: stream-redhat-ds-11-LP-rhel-8.2.0-dirsrv-11.1\n        arches: [x86_64]\n  artifacts:\n    rpms:\n    - 389-ds-base-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.src\n    - 389-ds-base-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-debugsource-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-devel-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-legacy-tools-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-legacy-tools-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-libs-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-libs-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-snmp-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-snmp-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - cockpit-389-ds-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.noarch\n    - python3-lib389-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.noarch\n...\n'
-------------------------------------


# Both Satellite 6.11 and 6.12 dump stream to YAML as string.
-------------------------------------
# PULP_SETTINGS=/etc/pulp/settings.py pulpcore-manager shell
Python 3.9.13 (main, Nov  9 2022, 13:16:24) 
[GCC 8.5.0 20210514 (Red Hat 8.5.0-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import gi
>>> gi.require_version("Modulemd", "2.0")
>>> from gi.repository import Modulemd as mmdlib  # noqa: E402
>>> modulemd_index = mmdlib.ModuleIndex.new()
>>> modulemd_index.update_from_file("/tmp/aaff7205bcb4f728fcae3876704f2896c0b9c1c5ef202928247d7f44289bfa37-modules.yaml", True)
(True, failures=[])
>>> modulemd_index.get_module_names()
['redhat-ds']
>>> streams = modulemd_index.get_module('redhat-ds').get_all_streams()
>>> temp_index = mmdlib.ModuleIndex.new()
>>> temp_index.add_module_stream(streams[0])
True
>>> temp_index.dump_to_string()
'---\ndocument: modulemd\nversion: 2\ndata:\n  name: redhat-ds\n  stream: "11"\n  version: 8020020200428141854\n  context: 51c5a973\n  arch: x86_64\n  summary: Red Hat Directory Server 11.1\n  description: >-\n    The Red Hat Directory Server is an LDAPv3 compliant server.\n  license:\n    module:\n    - MIT\n    content:\n    - GPLv3+\n  dependencies:\n  - buildrequires:\n      build: [rhel-8.2.0-dirsrv-11.1]\n      platform: [el8.2.0.z]\n    requires:\n      platform: [el8]\n  profiles:\n    default:\n      rpms:\n      - 389-ds-base\n      - cockpit-389-ds\n    legacy:\n      rpms:\n      - 389-ds-base\n      - 389-ds-base-legacy-tools\n      - cockpit-389-ds\n    minimal:\n      rpms:\n      - 389-ds-base\n  components:\n    rpms:\n      389-ds-base:\n        rationale: Package in api\n        ref: stream-redhat-ds-11-LP-rhel-8.2.0-dirsrv-11.1\n        arches: [x86_64]\n  artifacts:\n    rpms:\n    - 389-ds-base-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.src\n    - 389-ds-base-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-debugsource-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-devel-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-legacy-tools-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-legacy-tools-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-libs-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-libs-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-snmp-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - 389-ds-base-snmp-debuginfo-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.x86_64\n    - cockpit-389-ds-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.noarch\n    - python3-lib389-0:1.4.2.12-2.module+el8dsrv+6428+6e54c518.noarch\n...\n'
-------------------------------------


Based on the tests above, Satellite 6.11 and 6.12 behave consistently, so I believe this could be a one-time issue. Once all affected repositories with module streams have been synced in Satellite 6.11 and a complete export has been re-imported to the disconnected Satellite, everything should be good.
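
The output of `dump_to_string()` from the two sessions can be checked mechanically for the old rendering. This is a sketch under assumptions: `unquoted_numeric_streams` is a hypothetical helper, it relies on the two-space indentation seen in the dumps above, and it assumes that only numeric-looking stream names (such as 11) are affected by the quoting change.

```python
import re

# Fields directly under "data:" use exactly two spaces of indentation in the dumps.
_FIELD = re.compile(r"^  (name|stream): (.+)$")
_NUMERIC = re.compile(r"^[0-9]+(\.[0-9]+)?$")


def unquoted_numeric_streams(yaml_text):
    """Return (name, stream) pairs whose stream is written as a bare number."""
    hits = []
    name = None
    for line in yaml_text.splitlines():
        if line.strip() == "---":
            name = None  # a new modulemd document starts
            continue
        match = _FIELD.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "name":
            name = value
        elif _NUMERIC.match(value):  # a quoted "11" does not match
            hits.append((name, value))
    return hits


# Leading lines of the two dumps shown above.
DUMP_6_10 = "---\ndocument: modulemd\nversion: 2\ndata:\n  name: redhat-ds\n  stream: 11\n  version: 8020020200428141854\n"
DUMP_6_11 = "---\ndocument: modulemd\nversion: 2\ndata:\n  name: redhat-ds\n  stream: \"11\"\n  version: 8020020200428141854\n"

print(unquoted_numeric_streams(DUMP_6_10))
print(unquoted_numeric_streams(DUMP_6_11))
```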

Comment 7 Daniel Alley 2023-09-08 03:06:44 UTC
I'm about 99% certain that the original issue no longer exists in 6.12+ since we use a different method of storing module metadata (it's stored in the database alongside other metadata rather than in files in artifact storage).

I'm less certain that import will change any content units in the general case - that applies to other content types too, like packages - say the changelogs get trimmed for instance, does import update the existing content units?  I don't think it does.  

However if that issue exists it should probably be filed as a new BZ.  I'd like to close this one w/ Closed - Currentrelease.  Any objections to doing this?

Comment 9 Grant Gainey 2023-09-08 14:19:17 UTC
(In reply to Daniel Alley from comment #7)
> I'm less certain that import will change any content units in the general
> case - that applies to other content types too, like packages - say the
> changelogs get trimmed for instance, does import update the existing content
> units?  I don't think it does.  

Even when doing an incremental-export/import, we carry all the metadata for the ending-repo-version's content-units. django-import-export looks up content-units based on their "natural key", and will create new ones *or update existing ones* if the incoming version of a given content-unit has changed.

For future reference, here's the function that creates the content-export-files for a specified repository-version: 

https://github.com/pulp/pulpcore/blob/main/pulpcore/app/importexport.py#L125
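
As a toy illustration of the create-or-update behaviour described above (this is not django-import-export's or pulpcore's actual code; `import_rows` is hypothetical, and using name, stream, version, context and arch as the natural key of a module stream is an assumption):

```python
def import_rows(store, rows, key_fields):
    """Upsert `rows` into `store`, a dict keyed by the natural key of each row."""
    outcome = {"created": 0, "updated": 0, "unchanged": 0}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        existing = store.get(key)
        if existing is None:
            store[key] = dict(row)       # unknown natural key: create
            outcome["created"] += 1
        elif existing != row:
            store[key] = dict(row)       # same key, different data: update in place
            outcome["updated"] += 1
        else:
            outcome["unchanged"] += 1
    return outcome


NATURAL_KEY = ("name", "stream", "version", "context", "arch")  # assumed key
unit = {
    "name": "redhat-ds", "stream": "11", "version": "8020020200428141854",
    "context": "51c5a973", "arch": "x86_64",
    "description": "The Red Hat Directory Server is an LDAPv3 compliant server.",
}

store = {}
first = import_rows(store, [unit], NATURAL_KEY)
second = import_rows(store, [dict(unit, description="Edited description")], NATURAL_KEY)
print(first, second)
```

The second call finds the unit by its natural key and updates it rather than creating a duplicate.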

Comment 10 Daniel Alley 2023-09-08 15:06:42 UTC
Thanks Grant.  So my concern shouldn't be an issue. That just leaves the original bug report, which I'm certain would have been resolved by 6.12

If Sayan also gives +1 then let's close this.

Comment 13 Robin Chan 2023-09-11 04:04:30 UTC
The Pulp upstream bug status is at closed. Updating the external tracker on this bug.