Bug 1920511 - 3rd party repository sync fails with 'InvalidStringData: strings in documents must be valid UTF-8'
Summary: 3rd party repository sync fails with 'InvalidStringData: strings in documents...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: 6.9.7
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Duplicates: 1965595
Depends On:
Blocks:
Reported: 2021-01-26 12:46 UTC by Jan Senkyrik
Modified: 2023-11-18 01:49 UTC (History)

Fixed In Version: pulp-rpm-2.21.5.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-10 16:23:39 UTC
Target Upstream Version:
Embargoed:
jsenkyri: needinfo-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Pulp Redmine | 8893 | 0 | High | MODIFIED | 3rd party repository sync fails with 'InvalidStringData: strings in documents must be valid UTF-8' | 2021-07-19 14:07:39 UTC
Pulp Redmine | 9109 | 0 | Normal | CLOSED - CURRENTRELEASE | Backport #8982 "Fix migrations for clients that have applied the fix for 8893" to 0.11.z | 2021-10-12 14:08:51 UTC
Red Hat Knowledge Base (Solution) | 6288061 | 0 | None | None | None | 2021-08-29 05:11:03 UTC
Red Hat Product Errata | RHBA-2021:4612 | 0 | None | None | None | 2021-11-10 16:23:52 UTC

Description Jan Senkyrik 2021-01-26 12:46:02 UTC
Description of problem:

Satellite is unable to synchronize repo from the following URL:
~~~
https://developer.download.nvidia.com/compute/cuda/repos/rhel8/ppc64le/
~~~

# messages
~~~
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104) strings in documents must be valid UTF-8: '\xe7\x00\x00\x00\x04450\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04460\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04455\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04455-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04460-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04latest-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04450-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04latest\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x00'
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104) Traceback (most recent call last):
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/sync.py", line 312, in run
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     repair=self.validate)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/modularity.py", line 415, in synchronize
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     remainder = add_defaults(repository, defaults, repair=repair)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/modularity.py", line 340, in add_defaults
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     add_default(repository, default, model)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/modularity.py", line 287, in add_default
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     model.save_and_import_content(path)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib/python2.7/site-packages/pulp/server/db/model/__init__.py", line 935, in save_and_import_content
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     self.save()
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib/python2.7/site-packages/mongoengine/document.py", line 324, in save
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     object_id = collection.save(doc, **write_concern)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 2180, in save
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     check_keys, False, manipulate, write_concern)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 709, in _update
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     codec_options=self.codec_options).copy()
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 216, in command
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     self._raise_connection_failure(error)
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)   File "/usr/lib64/python2.7/site-packages/pymongo/pool.py", line 343
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104)     raise error
Jan 26 10:37:09 ktordeur-sat65 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [5a597d1d] (8816-83104) InvalidStringData: strings in documents must be valid UTF-8: '\xe7\x00\x00\x00\x04450\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04460\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04455\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04455-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04460-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04latest-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04450-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x04latest\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00\x00'
~~~
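
The byte string quoted in the error does not look like corrupted text. Read as BSON, its first four bytes (\xe7\x00\x00\x00) are a little-endian length of 231, which matches the size of the quoted payload, and the rest parses as a document mapping stream names to arrays of default profiles. The following is a minimal Python 3 sketch using only the standard library. The helper names (read_cstring, decode_document, is_valid_utf8) are made up for illustration and are not Pulp or PyMongo functions, and the decoder handles only the two BSON types that occur in this payload.

```python
import struct

# Payload copied from the InvalidStringData message in the log above,
# split into one literal per top-level element for readability.
PAYLOAD = (
    b'\xe7\x00\x00\x00'
    b'\x04450\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04460\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04455\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04455-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04460-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04latest-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04450-dkms\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x04latest\x00\x14\x00\x00\x00\x020\x00\x08\x00\x00\x00default\x00\x00'
    b'\x00'
)


def read_cstring(buf, pos):
    """Read a NUL-terminated element name; return (name, next position)."""
    end = buf.index(b'\x00', pos)
    return buf[pos:end].decode('utf-8'), end + 1


def decode_document(buf, pos=0, as_list=False):
    """Decode a BSON document containing only strings (0x02) and arrays (0x04)."""
    (length,) = struct.unpack_from('<i', buf, pos)
    end = pos + length - 1  # index of the document's trailing NUL
    pos += 4
    items = {}
    while pos < end:
        kind = buf[pos]
        name, pos = read_cstring(buf, pos + 1)
        if kind == 0x02:  # string: int32 size (including NUL), bytes, NUL
            (size,) = struct.unpack_from('<i', buf, pos)
            value = buf[pos + 4:pos + 4 + size - 1].decode('utf-8')
            pos += 4 + size
        elif kind == 0x04:  # array: embedded document keyed "0", "1", ...
            value, pos = decode_document(buf, pos, as_list=True)
        else:
            raise ValueError('unsupported BSON type 0x%02x' % kind)
        items[name] = value
    result = [items[k] for k in sorted(items, key=int)] if as_list else items
    return result, end + 1


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


defaults, consumed = decode_document(PAYLOAD)
print(defaults)
print(is_valid_utf8(PAYLOAD))
```

On this payload the decoder yields eight streams (450, 460, 455, 455-dkms, 460-dkms, latest-dkms, 450-dkms, latest), each with the profile list ['default'], while is_valid_utf8(PAYLOAD) is False: the leading 0xe7 byte opens a three-byte UTF-8 sequence that the following NUL bytes cannot complete. That would be consistent with the data being well-formed binary that is being validated as if it were text.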

# production.log
~~~
2021-01-26T10:37:09 [E|bac|] PLP0000: Importer indicated a failed response (Katello::Errors::PulpError)
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.0.16/app/lib/actions/pulp/abstract_async_task.rb:121:in `block in external_task='
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.0.16/app/lib/actions/pulp/abstract_async_task.rb:119:in `each'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.0.16/app/lib/actions/pulp/abstract_async_task.rb:119:in `external_task='
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.0.16/app/lib/actions/pulp/repository/sync.rb:28:in `external_task='
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/polling.rb:100:in `poll_external_task_with_rescue'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/polling.rb:22:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/cancellable.rb:14:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.0.16/app/lib/actions/pulp/abstract_async_task.rb:45:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action.rb:571:in `block (3 levels) in execute_run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware/stack.rb:27:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware.rb:19:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware.rb:32:in `run'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware/stack.rb:23:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware/stack.rb:27:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/middleware.rb:19:in `pass'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.16.0.16/app/lib/actions/middleware/remote_action.rb:16:in `block in run'
...
...
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/executors/sidekiq/serialization.rb:27:in `perform'
[ sidekiq ]
[ concurrent-ruby ]
2021-01-26T10:37:10 [I|bac|] Task {label: Actions::Katello::Repository::Sync, id: 3695e3e6-d6a1-4174-8641-bdc75c00cbc1, execution_plan_id: 27738963-e86c-49ef-a62a-cb5c8032e66c} state changed: stopped  result: warning
2021-01-26T10:37:10 [I|bac|] Task {label: Actions::Katello::Repository::Sync, id: 3695e3e6-d6a1-4174-8641-bdc75c00cbc1, execution_plan_id: 27738963-e86c-49ef-a62a-cb5c8032e66c} state changed: stopped  result: warning
~~~


Version-Release number of selected component (if applicable):
Satellite 6.8


How reproducible:
Always


Steps to Reproduce:
1. Create custom product; create custom repo
2. Sync the repo
3. Sync fails

Comment 7 Tanya Tereshchenko 2021-06-08 11:55:25 UTC
*** Bug 1965595 has been marked as a duplicate of this bug. ***

Comment 10 qfz769 2021-06-11 10:47:14 UTC
I have been dealing with this issue since March, and can maybe add a few references to this bug.
This issue is also open at nvidia: https://github.com/NVIDIA/yum-packaging-precompiled-kmod/issues/19
There is no obviously invalid UTF-8 in the YAML file referenced in the module section of http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/repodata/repomd.xml

This seems to be a pulp issue. Hope this bug gets visible sometime - it is not 'just another 3rd party malformed repo'.

Comment 12 Daniel Alley 2021-06-11 13:06:04 UTC
@qfz769, Have you encountered this recently? Some other users who have reported this issue have reported the following:

"Either Nvidia has updated the repo, or something has changed on my end which I am unaware of, because suddenly I am able to sync the repo the same as you. This was not the case during last week."

We have also been unable to reproduce this ourselves. It's possible that there are environmental factors like particular geographic locations hitting different CDNs (one of which could be corrupted), or perhaps specific libraries on a host system causing problems. The fact that the problem seems to have occurred spontaneously (without changes to Pulp / Satellite) and potentially been resolved spontaneously (at least for some users) does make it difficult to explain in the context of "it's a Satellite bug". Since this now has the attention of someone at Nvidia I will go discuss it in the threads you've provided (thank you for bringing them up).

If you could also try again and let us know how it went, that would be great.

Comment 14 Daniel Alley 2021-06-12 03:32:31 UTC
@qfz769, I believe I may have discovered the issue. Will post an update next week.

Comment 18 qfz769 2021-06-14 07:55:14 UTC
I just did a fresh sync - still the same error.

Comment 20 Grant Gainey 2021-06-14 14:19:52 UTC
(In reply to qfz769 from comment #18)
> I just did a fresh sync - still the same error.

Thanks to @dalley we know what's going on here. See the attached Pulp issue for details!

Comment 21 pulp-infra@redhat.com 2021-06-14 15:08:12 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 22 pulp-infra@redhat.com 2021-06-14 15:08:14 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 23 pulp-infra@redhat.com 2021-06-29 03:09:40 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 24 pulp-infra@redhat.com 2021-06-29 03:09:41 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 25 pulp-infra@redhat.com 2021-06-29 19:06:06 UTC
The Pulp upstream bug status is at CLOSED - DUPLICATE. Updating the external tracker on this bug.

Comment 26 pulp-infra@redhat.com 2021-06-29 19:06:09 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 28 pulp-infra@redhat.com 2021-07-19 14:07:33 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 29 pulp-infra@redhat.com 2021-07-19 14:07:41 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 30 pulp-infra@redhat.com 2021-07-20 19:06:25 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 31 pulp-infra@redhat.com 2021-07-20 19:06:27 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 35 qfz769 2021-08-27 08:51:34 UTC
Allow me to add a workaround that worked for me until this issue is, hopefully, fixed in a 6.9.z minor release.

# cp  /usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/repomd/modules.py /usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/repomd/modules.py-backup

Edit: 
# vim  /usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/repomd/modules.py 

    profile_defaults = {}
    for stream, defaults in module.peek_profile_defaults().items():
        profile_defaults[stream] = defaults.get()
    return bson.BSON.encode(profile_defaults)        <----- 

Change to : 

    return bson.binary.Binary(bson.BSON.encode(profile_defaults))           <---

Restart the services and try to sync the repo:
# satellite-maintain service restart

Comment 36 Daniel Alley 2021-08-27 12:53:33 UTC
@qfz769.ku.dk That's not just a workaround, it's the proper patch :)

Yes, I'm hoping that it can be slipped into the next z-stream, I think it barely missed the window for the last one.
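
One way to read the one-line change in comment 35: under Python 2, a plain str handed to PyMongo's BSON encoder is treated as text and has to be valid UTF-8, whereas a value wrapped in bson.binary.Binary is stored as raw binary data and is not subject to that check. The sketch below is a simplified, stdlib-only Python 3 model of that distinction, not PyMongo's actual code. The Binary class, encode_element, and the field name profile_defaults are stand-ins chosen for illustration.

```python
import struct


class Binary(bytes):
    """Hypothetical stand-in for bson.binary.Binary: marks bytes as raw data."""


def encode_element(name, value):
    """Encode one BSON element, modelling how byte strings are assumed to be handled."""
    key = name.encode('utf-8') + b'\x00'
    if isinstance(value, Binary):
        # Type 0x05 (binary): int32 length, subtype byte (0x00 = generic), raw bytes.
        # No UTF-8 validation is applied to the content.
        return b'\x05' + key + struct.pack('<i', len(value)) + b'\x00' + bytes(value)
    if isinstance(value, bytes):
        # Type 0x02 (string): the bytes are treated as text and must be valid UTF-8.
        try:
            value.decode('utf-8')
        except UnicodeDecodeError:
            raise ValueError('strings in documents must be valid UTF-8: %r' % value)
        return b'\x02' + key + struct.pack('<i', len(value) + 1) + value + b'\x00'
    raise TypeError('unsupported value type: %s' % type(value).__name__)


# First bytes of the payload from the error message in the description.
raw = b'\xe7\x00\x00\x00\x04450\x00'

try:
    encode_element('profile_defaults', raw)
except ValueError as exc:
    print('rejected as text:', exc)

encoded = encode_element('profile_defaults', Binary(raw))
print('accepted as binary, element type 0x%02x' % encoded[0])
```

In this model the unwrapped bytes are rejected with the same message seen in the log, and the wrapped bytes are written out unchanged as a binary element.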

Comment 42 pulp-infra@redhat.com 2021-10-12 14:08:52 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 43 pulp-infra@redhat.com 2021-10-12 14:08:54 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 44 Lai 2021-10-21 14:42:38 UTC
Steps to test

1. Create custom repos using the following URLs:
  [0] https://developer.download.nvidia.com/compute/cuda/repos/rhel8/ppc64le/
  [1] http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64
2. Sync repos


Expected result:
Both repos should sync successfully

Actual result:
Both repos sync successfully

Verified on 6.9.7_01 with pulp-rpm-plugins-2.21.5.2-1.el7sat.noarch

Comment 49 errata-xmlrpc 2021-11-10 16:23:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.9.7 Async Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4612

