Bug 1693751 - rpm.labelCompare expects strings, but RPM headers are bytes
Summary: rpm.labelCompare expects strings, but RPM headers are bytes
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Panu Matilainen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 1693759 1693760 1693762 1693766 1693767 1693768 1693771 1693773 1693774 1693787 1693788 1713107 1722868
Blocks: 1779194
Reported: 2019-03-28 14:57 UTC by Panu Matilainen
Modified: 2023-03-24 14:40 UTC (History)

Fixed In Version: rpm-4.14.2.1-6.fc31
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1779194 (view as bug list)
Environment:
Last Closed: 2020-11-24 20:21:54 UTC
Type: Bug
Embargoed:


Description Panu Matilainen 2019-03-28 14:57:45 UTC
Description of problem:
With Python 3, the contents of RPM headers are returned as bytes and thus cannot be fed back to functions like rpm.labelCompare, which expect strings.

Version-Release number of selected component (if applicable):
All released versions of rpm, including python3-rpm-4.14.2

How reproducible:
always

Steps to Reproduce:
import rpm
required_version = ('0', '4.8.0', '1')
transaction_set = rpm.TransactionSet()
db_result = transaction_set.dbMatch('name', 'rpm')
package = list(db_result)[0]
print(rpm.labelCompare((package['epoch'], package['version'], package['release']), required_version))

Actual results:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument 1, item 1 must be str or None, not bytes

Expected results:
the result of labelCompare() is printed

Additional info:
rpm.labelCompare() is just the tip of the iceberg: this type mismatch is present all over the rpm-python API and makes the bindings unusable and fundamentally incompatible with all the rpm-related scripts ever written.

This has been fixed upstream by making the python3 bindings return ALL string data as surrogate-escaped utf-8 string objects to make things "just work" for the normal case while allowing non-utf8 data to still be handled. For further rationale see the upstream commit message:

https://github.com/rpm-software-management/rpm/commit/84920f898315d09a57a3f1067433eaeb7de5e830 

This is the Fedora side counterpart of bug 1631292, to be used as a tracking bug for known incompatibilities and other possible blockers for making this change.
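
For callers that have to run against both the old bytes-returning bindings and the fixed ones, one option is to normalize header values before handing them to rpm.labelCompare(). The following is a minimal sketch, not part of rpm: to_text and evr_tuple are hypothetical helper names, and plain dicts stand in for rpm headers so the snippet runs without the bindings installed.

```python
def to_text(value):
    """Return header data as str, whether the bindings gave bytes or str."""
    if isinstance(value, bytes):
        # surrogateescape keeps non-UTF-8 bytes recoverable instead of raising
        return value.decode("utf-8", "surrogateescape")
    if value is None:
        return None
    # Covers str as well as integer tags such as epoch
    return str(value)


def evr_tuple(header):
    """Build an (epoch, version, release) tuple of str/None values."""
    return tuple(to_text(header[key]) for key in ("epoch", "version", "release"))


# Dicts stand in for rpm headers here (assumption for illustration).
old_style = {"epoch": None, "version": b"4.14.2.1", "release": b"6.fc31"}
new_style = {"epoch": None, "version": "4.14.2.1", "release": "6.fc31"}
assert evr_tuple(old_style) == evr_tuple(new_style)
```

With the real bindings, the result would be passed on as rpm.labelCompare(evr_tuple(package), required_version).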

Comment 1 Panu Matilainen 2019-04-10 09:29:31 UTC
This change is now active in rawhide as of rpm-4.14.2.1-6.fc31.

As a temporary crutch to allow existing users to migrate to the new behavior, the strings returned by rpm will have a fake .decode() method attached that will issue the following warning pointing to this bug if used:

    UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751

The evil crutch will be removed once critical users have been adapted to the new behavior. Thank you for your attention.
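
While the compatibility decode() exists, one way to locate remaining call sites in your own code is to promote the warning to an error during a test run. The sketch below uses only the standard warnings module. CompatStr is a hypothetical stand-in that mimics the behavior described above; it is not rpm's actual implementation, and uses_legacy_decode is likewise an illustrative name.

```python
import warnings


class CompatStr(str):
    """Stand-in for the strings rpm returns during the transition (assumption)."""

    def decode(self, *args, **kwargs):
        warnings.warn("decode() called on unicode string", UnicodeWarning, stacklevel=2)
        return str(self)


def uses_legacy_decode(func, *args):
    """Run func with UnicodeWarning promoted to an error; report whether it fired."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", UnicodeWarning)
        try:
            func(*args)
        except UnicodeWarning:
            return True
    return False


version = CompatStr("4.14.2.1")
legacy_hit = uses_legacy_decode(lambda v: v.decode("utf-8"), version)
clean_hit = uses_legacy_decode(lambda v: v.split("."), version)
```

Running a test suite with "python3 -W error::UnicodeWarning" should have a similar effect without code changes.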

Comment 2 Jan Pazdziora 2019-04-15 10:09:45 UTC
Thanks for the warning message. It does not however seem to catch

  p_nevra = b"%s-%s-%s.%s" % (p["name"], p["version"], p["release"], p["arch"])

which fails with

  TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'

Should the compatibility layer implement __bytes__ as well?
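
Independently of whether __bytes__ gets supported, a caller can sidestep the problem by formatting as text instead of bytes. A minimal sketch, assuming dicts in place of rpm headers and a hypothetical helper name format_nvra:

```python
def format_nvra(header):
    """Format name-version-release.arch as str for bytes or str header values."""
    fields = []
    for key in ("name", "version", "release", "arch"):
        value = header[key]
        if isinstance(value, bytes):
            # Old bindings: decode, keeping undecodable bytes as surrogates
            value = value.decode("utf-8", "surrogateescape")
        fields.append(value)
    return "{}-{}-{}.{}".format(*fields)


# Dicts stand in for rpm headers (assumption for illustration).
old = {"name": b"rpm", "version": b"4.14.2.1", "release": b"6.fc31", "arch": b"x86_64"}
new = {"name": "rpm", "version": "4.14.2.1", "release": "6.fc31", "arch": "x86_64"}
```

Both inputs produce the same str, so the rest of the program only ever deals with one type.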

Comment 3 Jan Pazdziora 2019-04-15 10:10:59 UTC
I also wonder how decoding to str already in the python3-rpm will affect working with ancient rpms that may have ISO-8859-1 values, or in general random content there.

Comment 4 Panu Matilainen 2019-04-15 11:06:01 UTC
The returned string is surrogate-escaped so it can handle fairly arbitrary data, and in the unlikely event that you actually know the original encoding you *can* get back to that:

>>> import rpm
>>> str = u'älämölö'
>>> b = str.encode('iso-8859-1')
>>> b
b'\xe4l\xe4m\xf6l\xf6'
>>> h = rpm.hdr()
>>> h['group'] = b
>>> d = h['group']
>>> d
'\udce4l\udce4m\udcf6l\udcf6'
>>> t = bytes(d, 'utf-8', 'surrogateescape')
>>> t
b'\xe4l\xe4m\xf6l\xf6'
>>> t.decode('iso-8859-1')
'älämölö'

But in reality, unless the header has an explicit 'encoding' tag there's no way to know what that encoding is, and the only value that the encoding tag can have is 'utf-8', in packages that have been validated to consist only of utf-8 strings. In practice, all API users I've looked into have just had hardcoded .decode('utf-8') calls in place, so they'd fail anyway.

As for __bytes__, we'd prefer keeping the evil compat trickery to an absolute bare minimum, both content- and time-wise. But let's see how it goes; if that's actually a common pattern then maybe it makes sense to support it too.
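
The same round trip can be reproduced with plain Python, without the rpm bindings, which may help when reasoning about what happens to non-UTF-8 data. This sketch assumes, as stated above, that header bytes are decoded as UTF-8 with the surrogateescape error handler.

```python
original = "älämölö".encode("iso-8859-1")  # legacy-encoded header bytes

# Bytes that are not valid UTF-8 become lone surrogates instead of raising
as_text = original.decode("utf-8", "surrogateescape")

# Reversing the escape restores the exact original bytes
restored = as_text.encode("utf-8", "surrogateescape")

# Only now, knowing the real encoding, can the text be recovered
recovered = restored.decode("iso-8859-1")
```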

Comment 5 Pavel Raiskup 2019-04-19 17:32:06 UTC
JFTR, those who so far expected that rpm returns bytes are probably
already doing `header.decode('utf-8')`.  But since rpm changed this to
return 'str', any attempt to `str.decode('utf-8')` raises an error for them.

So it will likely mean python traceback for anyone who used this interface
before (example [1] against python-rpkg, detected in [2]).

[1] https://pagure.io/rpkg/pull-request/439
[2] https://pagure.io/copr/copr/issue/677

Comment 6 Miro Hrončok 2019-04-22 11:10:43 UTC
The warning workaround is somewhat naive. This is the problem we are hitting in FedoraReview:

https://pagure.io/FedoraReview/issue/356

Here comes the problem:

>>> import rpm
>>> required_version = ('0', '4.8.0', '1')
>>> transaction_set = rpm.TransactionSet()
>>> db_result = transaction_set.dbMatch('name', 'rpm')
>>> package = list(db_result)[0]

package['version'] is now a unicode string:

>>> package['version']
'4.14.2.1'

Decoding is still possible, with a warning:


>>> package['version'].decode('utf-8')
__main__:1: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
'4.14.2.1'


However anything else is still a problem:

>>> package['version'].split(b'.')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str or None, not bytes




I like the change to Unicode in general, however the "backwards compatibility hack" is not complete and this will break things.
This is a serious backward incompatible API change done in a minor version bump.

I for example have no idea how to write FedoraReview code that is both Fedora 29 and Fedora 31 compatible.
One idea is to write wrappers everywhere or to always decode (with the warning) first; both are horrible.


Is there a flag I can use to get this behavior on older Fedoras? See for example how python-ldap handles this:

https://www.python-ldap.org/en/latest/bytes_mode.html#the-bytes-mode

CCing Petr who might give more insight about more backwards compatible ways of handling this.

Comment 7 Miro Hrončok 2019-04-22 11:34:02 UTC
Other code that will break, comparisons:

  archs[0].lower() == b"noarch"

Comment 8 Robert-André Mauchin 🐧 2019-04-22 13:49:27 UTC
This breaks fedpkg too, I can't import srpm for example.

Comment 9 Panu Matilainen 2019-04-23 12:17:43 UTC
This is of course a drastic change and not something you'd even dream of doing outside major version bump normally, but in this case the API has been buggy all along and the change is needed to set things straight. There would be other ways to "fix" the API, but this is the only thing that makes sense going forward.

And sure the compat hack is incomplete, there's no way to make such a change work everywhere. The .decode() hack just happens to make a whole bunch of users "just work" with the warning, which is why it's there, and there is a thing or two that could be done to further help immediate compatibility, BUT: since these hacks also *introduce* some problems with compatibility detection, we'd like to get rid of them as soon as at all possible.

We could add some sort of flag for enabling headers to return bytes again. We could (and would like to) also introduce these things in older Fedora releases, just need to figure out how to go about it. Doing this in rawhide now is a bit of a "see what breaks" exercise because there's no way to find out just by looking at sources (which I did).

Comment 10 Miro Hrončok 2019-04-23 12:27:25 UTC
I'd argue that rawhide should not be the place to land changes unannounced and wait to see what happens.
IMHO this deserves some kind of proper communication - at least a HEADS UP e-mail to devel-announce, if not a Fedora 31 change proposal backed by FESCo.
FedoraReview didn't even get a depends on bugzilla report like some others did.


Either way, rather than having a flag that enables bytes behavior on rawhide, I'd appreciate a flag on EL7, Fedora 29 and Fedora 30 that can give me the new behavior, so we can finally stop using the (indeed horrible) bytes API. Is that reasonable to expect? How can I help with the "figure out how to go about it" part?

Comment 11 Panu Matilainen 2019-04-30 05:48:45 UTC
(In reply to Miro Hrončok from comment #10)
> I'd argue that rawhide should not be the place to land changes unannounced
> and wait to see what happens.
> IMHO this deserves some kind of proper communication - at least a HEADS UP
> e-mail to devel-announce, if not a Fedora 31 change proposal backed by FESCo.
> FedoraReview didn't even get a depends on bugzilla report like some others
> did.

I generally agree about communicating such changes but this is ultimately a bugfix in a dark corner that few people use in practice, and those few affected were supposed to get a heads-up via the bug. Apologies for missing fedora-review, not sure what happened there, maybe accidentally skipped during mass-filing.

Anyway, this will soon get a wider audience with rpm 4.15 change proposal, consider this a test-drive for the feature: we're in fairly good shape now in that some of the most critical bits (anaconda and mock) have already been adjusted to the new behavior.

> Either way, rather than having a flag that enables bytes behavior on
> rawhide, I'd appreciate a flag on EL7, Fedora 29 and Fedora 30 that can give
> me the new behavior, so we can finally stop using the (indeed horrible)
> bytes API. Is that reasonable to expect?

Landing changes into Fedora is not a problem (and planned, in a vague manner), EL7 is a different story entirely.

> How can I help with the "figure out how to go about it" part?

The devil is in the details.

If we add a forward-compat flag to older releases now, it seems to me it actually just complicates the situation for callers: we don't really want such a flag upstream, so the presence (or lack thereof) of the flag doesn't mean a thing, and it's also unlikely that we can introduce such a change into EL7, so people would still need to deal with both API styles. Not to mention all the other rpm versions in the wild (Suse, Mageia etc) - those broken Py3 bindings will be haunting people for quite some time still no matter what.

Concrete ideas on how to make this somehow sensible for the API users are most certainly welcome.

Comment 12 Miro Hrončok 2019-04-30 09:06:24 UTC
> Landing changes into Fedora is not a problem (and planned, in a vague manner), EL7 is a different story entirely.

If this is patchable on the EPEL python3-rpm package level (without the need to patch the RHEL7 rpm package itself), it should be good.

Comment 13 Panu Matilainen 2019-04-30 09:22:59 UTC
Ahem. Right. There is no python3-rpm in EL7 itself :D
In that case, no problem.

Comment 14 Miro Hrončok 2019-04-30 09:57:22 UTC
As a selfish maintainer of Fedora/EPEL only software, this is the pseudo API I'd like to have everywhere:

    rpm.TransactionSet(text_type=str/bytes/(not set))


This is what the API would do:


On Fedora 30 and earlier:

 no text_type supplied: falls back to bytes, no change

 text_type=str: gets the new behavior, but no str.decode() method

 text_type=bytes: maintains the previous behavior


On Fedora 31 and newer:

 no text_type supplied: falls back to unicode strings with your custom str.decode() method (possibly falls back to bytes until 4.15 with DeprecationWarning)

 text_type=str: gets the new behavior, but no str.decode() method

 text_type=bytes: maintains the previous behavior, with DeprecationWarning
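
The semantics proposed above can be approximated on the caller side with a thin wrapper. This is a sketch only: HeaderView and its text_type argument are hypothetical names modelled on the proposal, not an rpm API, and a dict stands in for the header.

```python
class HeaderView:
    """Read-only view over a header-like mapping with a chosen string type."""

    def __init__(self, header, text_type=str):
        if text_type not in (str, bytes):
            raise ValueError("text_type must be str or bytes")
        self._header = header
        self._text_type = text_type

    def _convert(self, value):
        if isinstance(value, list):
            # Array-valued tags: convert each element
            return [self._convert(item) for item in value]
        if isinstance(value, bytes) and self._text_type is str:
            return value.decode("utf-8", "surrogateescape")
        if isinstance(value, str) and self._text_type is bytes:
            return value.encode("utf-8", "surrogateescape")
        return value  # ints, None, or already the requested type

    def __getitem__(self, key):
        return self._convert(self._header[key])


# A dict with mixed bytes/str values stands in for a header (assumption).
raw = {"name": b"rpm", "epoch": None, "basenames": [b"rpm", "rpmdb"]}
as_str = HeaderView(raw, text_type=str)
as_bytes = HeaderView(raw, text_type=bytes)
```

The wrapper works per header object, so it does not depend on how the header was obtained.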

Comment 15 Panu Matilainen 2019-04-30 11:17:00 UTC
Thanks for the feedback. However the transaction set is not the right place, it has nothing to do with the data really and more importantly, headers do exist outside transaction sets. 

It'd have to be either per-header (which seems really painful), or a global module-level thing that only affects data returned from headers. So it'd probably have to be a "rpm.header_string_type" kind of thing with defaults similar to what you describe, but it's still ugly and something we wouldn't want upstream at all because it doesn't really make any sense there (having a real bytes-mode for the whole api might make some sense, but that'd be quite different from what the existing broken strings-as-bytes behavior is and wouldn't help compatibility).

Comment 16 Miro Hrončok 2019-04-30 11:24:21 UTC
> headers do exist outside transaction sets

I wasn't sure about this, although it crossed my mind.


> It'd have to be either per-header (which seems really painful)

Very painful.


> So it'd probably have to be a "rpm.header_string_type" kind of thing

Probably yes.


> something we wouldn't want upstream at all because it doesn't really make any sense there

I cannot help with that, ultimately, that's your choice.

Comment 17 Miro Hrončok 2019-05-27 08:42:25 UTC
Panu, what is the supported way to detect this feature early (e.g. without data)?

I want to do:


if <rpm returns unicode strings>:
    spec = rpm.spec
    hdr = rpm.hdr
else:
    class MyUnicodeSpecWrapper(rpm.spec):
        ...

    class MyUnicodeHdrWrapper(rpm.hdr):
        ...

    spec = MyUnicodeSpecWrapper
    hdr = MyUnicodeHdrWrapper

Comment 18 Panu Matilainen 2019-05-28 06:46:44 UTC
This will return True on versions where strings are strings and False on the broken python3-versions returning strings as bytes:

def rpm_is_sane():
    s = "aaa"
    h = rpm.hdr()
    h['name'] = s
    return (h['name'] == s)

This should work on any rpm version >= 4.8.0, if you need to support older, catch the TypeError from header instance creation and return True in that case (those old versions are limited to python2)
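
A variant of the check above that also covers the pre-4.8.0 case and can be exercised without rpm installed. The name bindings_return_text is hypothetical, and the header factory is passed in as a parameter (with the real bindings it would be rpm.hdr); BytesHeader is a test double, not an rpm class.

```python
def bindings_return_text(hdr_factory):
    """True if string tags round-trip as str, False if they come back as bytes."""
    try:
        header = hdr_factory()
    except TypeError:
        # Per the comment above: versions failing here are Python 2 only
        return True
    probe = "aaa"
    header["name"] = probe
    return header["name"] == probe


class BytesHeader(dict):
    """Test double imitating the broken behavior (strings returned as bytes)."""

    def __getitem__(self, key):
        value = super().__getitem__(key)
        return value.encode("utf-8") if isinstance(value, str) else value
```

Calling it once at import time and storing the result would allow choosing between the plain classes and wrapper classes up front.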

Comment 19 Panu Matilainen 2019-05-28 06:48:33 UTC
Oh and FWIW, once we have rpm 4.15 in rawhide I'll try to deal with bringing this to stable versions with some sort of compat flag. Details unknown for the time being, haven't had the bandwidth to think about it so far.

Comment 20 Miro Hrončok 2019-05-28 10:53:08 UTC
Thanks.

Comment 21 Ben Cotton 2019-08-13 16:56:07 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 22 Ben Cotton 2019-08-13 19:08:05 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to 31.

Comment 23 Ben Cotton 2020-11-03 15:12:19 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 24 ir. Jan Gerrit Kootstra 2020-11-06 22:28:36 UTC
Hello Ben Cotton,

Could the message referring to this bug be removed in the next update of rpm-4.14.3-4.el8.x86_64? It is confusing.

Regards,

Jan Gerrit Kootstra

Comment 25 Ben Cotton 2020-11-24 20:21:54 UTC
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 28 Dylan Lipp 2021-03-11 20:03:41 UTC
Not to necro a closed bug, and I recognize this was closed as EOL for Fedora 31, but this issue still persists in RHEL 8, is there a continuation of this bug elsewhere?

Comment 29 HERMES 2021-03-16 17:05:40 UTC
Hi, I agree,  running RHEL 8.3 

# yum update
Updating Subscription Management repositories.
/usr/lib/python3.6/site-packages/dateutil/parser/_parser.py:70: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  instream = instream.decode()

Comment 30 junior pro 2021-04-15 04:36:33 UTC
I solved the problem in rhel 8.3 where the message appears:

/usr/lib/python3.6/site-packages/dateutil/parser/_parser.py:70: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
instream = instream.decode()


I don't speak English, I'm going to use the translator.


Go to the folder and open the file to edit:

su root
password:


cd /usr/lib/python3.6/site-packages/dateutil/parser
gedit _parser.py


On line 63 I replaced six.PY2 with six.PY3.


this is the original code:


        if six.PY2:
            # In Python 2, we can't duck type properly because unicode has
            # a 'decode' function, and we'd be double-decoding
            if isinstance(instream, (bytes, bytearray)):
                instream = instream.decode()


this is the changed code:

        if six.PY3:
            # In Python 2, we can't duck type properly because unicode has
            # a 'decode' function, and we'd be double-decoding
            if isinstance(instream, (bytes, bytearray)):
                instream = instream.decode()



save the file and you're done

Comment 31 HERMES 2021-04-15 06:48:24 UTC
Hi junior pro,

Tested on "Red Hat Enterprise Linux 8.3 (Ootpa)"  on linux on Power Systems, works like a charm.

Huge thanks for your help, I appreciate.

I just added a first step to make a backup copy of the file, just in case:
       cp _parser.py  _parser.py.ori

Thank you again

Comment 32 Ben 2021-05-05 10:45:32 UTC
I'm also seeing references to this BZ on RHEL 8.3:

[yum transaction]
Installed products updated.
Uploading Tracer Profile
/usr/lib/python3.6/site-packages/tracer/packageManagers/rpm.py:201: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  package.description = hdr[rpm.RPMTAG_SUMMARY].decode()

/usr/lib/python3.6/site-packages/tracer/packageManagers/rpm.py:202: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  package.category = hdr[rpm.RPMTAG_GROUP].decode()

/usr/lib/python3.6/site-packages/tracer/packageManagers/rpm.py:208: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  package.version = hdr[rpm.RPMTAG_VERSION].decode()

/usr/lib/python3.6/site-packages/tracer/packageManagers/rpm.py:209: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  package.release = hdr[rpm.RPMTAG_RELEASE].decode()

Installed:
  apr-1.6.3-11.el8.x86_64
  apr-util-1.6.1-6.el8.x86_64
  apr-util-bdb-1.6.1-6.el8.x86_64
  apr-util-openssl-1.6.1-6.el8.x86_64
  httpd-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64
  httpd-filesystem-2.4.37-30.module+el8.3.0+7001+0766b9e7.noarch
  httpd-tools-2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64
  mod_http2-1.15.7-2.module+el8.3.0+7670+8bf57d29.x86_64
  mod_ssl-1:2.4.37-30.module+el8.3.0+7001+0766b9e7.x86_64
  redhat-logos-httpd-81.1-1.el8.noarch

This seems like a regression.

The lines referenced in /usr/lib/python3.6/site-packages/tracer/packageManagers/rpm.py are:

def _load_package_info_from_hdr(self, package, hdr):
-->  package.description = hdr[rpm.RPMTAG_SUMMARY].decode()
-->  package.category = hdr[rpm.RPMTAG_GROUP].decode()

    epoch = hdr[rpm.RPMTAG_EPOCH]
    if epoch:
       package.epoch = epoch.decode()

--> package.version = hdr[rpm.RPMTAG_VERSION].decode()
--> package.release = hdr[rpm.RPMTAG_RELEASE].decode()


The package itself is python3-tracer-0.7.3-1.el8sat.noarch

Comment 33 Panu Matilainen 2021-05-24 05:23:13 UTC
This is a Fedora bug, not RHEL.

Comment 34 Ben 2021-05-24 15:15:17 UTC
Then can you explain why this BZ is being referenced on a RHEL8.3 system, please?  And how we should report it given this BZ number is what is being shown on my screen?

Comment 35 HERMES 2021-05-24 15:44:34 UTC
Hi Ben,

What I can say is that I found the same bug on RHEL 8.3, as I wrote previously. The fix shared by "junior pro" has worked perfectly.

Thanks

Comment 36 Ben 2021-05-24 15:54:43 UTC
Sadly, I don't have a

/usr/lib/python3.6/site-packages/dateutil/parser/

directory, so there's no "_parser.py" file to edit.  And 

/usr/lib/python3.6/site-packages/dateutil/parser.py

doesn't have any reference to "PY2" in it.  Nor do I have any reference to "PY2" in 

/usr/lib/python3.6/site-packages/tracer/packageManagers/rpm.py

So I'm a bit stuck.

Comment 37 HERMES 2021-05-25 05:56:19 UTC
Hi, 
-is there a result to the command:
  ls -l /usr/lib/python*/site-packages/dateutil/parser/

  I replaced the version 3.6 by a star *,

-the package is python3-dateutil-2.8.1-3.ibm.el8.noarch
   # rpm -qf /usr/lib/python3.6/site-packages/dateutil/parser/_parser.py 
   python3-dateutil-2.8.1-3.ibm.el8.noarch

I hope this will help you

Comment 38 Ben 2021-05-25 06:41:06 UTC
Thanks for trying, but

ls -l /usr/lib/python*/site-packages/dateutil/parser/
ls: cannot access '/usr/lib/python*/site-packages/dateutil/parser/': No such file or directory

I have 

python3-dateutil-2.6.1-6.el8.noarch

Not a version from IBM.

Comment 39 Panu Matilainen 2021-05-25 06:55:23 UTC
> Then can you explain why this BZ is being referenced on a RHEL8.3 system, please?

Oh, totally forgot we end up referring to this on RHEL too. Sorry.

The reference is basically a hint for rpm-python API users to fix their usage, but it ends up leaking to innocent parties as well. It was always a temporary transition period hack, and will be gone in the future. Hopefully 8.5 but it's not entirely my decision.

Comment 40 HERMES 2021-05-25 08:46:49 UTC
Hi,
You're right there is the word ibm in the name of this package.
To be transparent, the VM installed with RHEL 8.3 is used to implement a Power Virtualization Center from IBM. (see http://www.redbooks.ibm.com/redpieces/abstracts/sg248477.html?Open)
During the installation of this product many packages are installed as prerequisites. It may be the explanation.
This VM and so RHEL 8.3 is running on an IBM Power Systems

Comment 41 quino32 2021-05-31 16:37:02 UTC
Ben
       in this path:

  /usr/local/lib/python3.6/site-packages

  there is a file:

  python_dateutil-2.8.1-py3.6.egg

I do not know why it was not unzipped at the time of installation, if you unzip it with:

  unzip python_dateutil-2.8.1-py3.6.egg

  the missing directory will be generated

/usr/local/lib/python3.6/site-packages/dateutil/parser

and inside, there is the file to modify _parser.py following this solution:

https://bugzilla.redhat.com/attachment.cgi?id=1755940&action=diff

Hope it helped you fix the problem.

Comment 42 John 2021-07-15 04:36:53 UTC
I also have hit this problem on fully updated EL8.4, installing PowerVC (and other problems, due to DNF's stupid inability to specify no_proxy for local repositories, in dnf.conf)


Instead of suggesting people go and edit python files, HOW ABOUT REDHAT JUST FIX THE BUG PROPERLY, FOR ONCE.

Comment 43 John 2021-07-15 06:20:28 UTC
I AM NOW GETTING THIS ON EVERY DAMNED RPM COMMAND

WHY DOES EVERYTHING REDHAT TOUCHES TURN TO GARBAGE?

Comment 44 John 2021-07-15 06:28:19 UTC
Look at this garbage:

[root@audccfopvc01 07-15 16:25:56 ~]# dnf list | head -n 6
Updating Subscription Management repositories.
/usr/lib/python3.6/site-packages/dateutil/parser/_parser.py:70: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  instream = instream.decode()

Last metadata expiration check: 1:53:02 ago on Thu 15 Jul 2021 02:33:04 PM AEST.
Installed Packages


Completely infuriating.
If it was some new thing, it would be understandable, but, no.

As usual, I go to investigate the issue, and I find out Red Hat knew about this ages ago, and completely failed to address the issue in EL8.

Just pathetic.

Comment 45 Mujahid Ali 2021-07-23 14:30:20 UTC
While moving from python2 to python3, our team (PowerVC) faced a huge challenge with a similar issue.
We had to fix each and every reference where this was reported.

Sharing our hack that we used to resolve the issue.

The variable we normally pass is a string, but python3 SHA512 expects bytes (to allow for unicode). More info at:  Crypto.Hash.SHA512 documentation[1].
The solution is to encode the password to bytes before hashing.

This is almost literally what the error message says (if you know that all strings in python3 are unicode objects):
TypeError: Unicode-objects must be encoded before hashing

The solution is to encode with utf8 :
SHA512.new(password.encode('utf8')).digest()


[1]https://www.dlitz.net/software/pycrypto/api/current/Crypto.Hash.SHA512-module.html
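
The same encode-before-hashing rule applies to the standard library. A minimal sketch using hashlib in place of Crypto.Hash.SHA512 (a substitution made here so the example runs without PyCrypto); sha512_digest is a hypothetical helper name.

```python
import hashlib


def sha512_digest(data):
    """Return the SHA-512 digest of str or bytes input."""
    if isinstance(data, str):
        # Hash functions operate on bytes, so text must be encoded first
        data = data.encode("utf-8")
    return hashlib.sha512(data).digest()


digest = sha512_digest("s3cret")
```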

Comment 46 John 2021-07-26 04:29:14 UTC
Hi Mujahid,

After looking into this a bit more it doesn't look serious, but it is a nuisance - after installing PowerVC, I now see this error message every time I run a DNF command, even just a "dnf repolist".

# dnf repolist
Updating Subscription Management repositories.
/usr/lib/python3.6/site-packages/dateutil/parser/_parser.py:70: UnicodeWarning: decode() called on unicode string, see https://bugzilla.redhat.com/show_bug.cgi?id=1693751
  instream = instream.decode()

repo id                                                                                                                      repo name
(repo details deleted for brevity)

I was able to apply the diff at:
  https://bugzilla.redhat.com/attachment.cgi?id=1755940&action=diff

            # Quiet the decode() warning on RHEL8.  If it's a text_type,
            # (str), don't actually look for the decode() function, because
            # it is already decoded.  This includes strings produced by RPM,
            # on RHEL 8, which are both 'str' and have a decode() function
            # which produces a (quite annoying) warning.
            if not isinstance(instream, text_type) and getattr(instream, 'decode', None) is not None:
                instream = instream.decode()

to:
  # rpm -q --filesbypkg  python3-dateutil-2.8.1-3.ibm.el8 | grep _parser.py
  python3-dateutil          /usr/lib/python3.6/site-packages/dateutil/parser/_parser.py

And that shut it up, but would be great if IBM could fix that package so it is done.
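
Why the patch quiets the warning can be seen in isolation. In the sketch below CompatStr is a hypothetical stand-in for the strings rpm returns on RHEL 8 (a str that also has decode()), and the two functions paraphrase the duck-typed check and the patched check; neither is dateutil's actual code.

```python
import warnings


class CompatStr(str):
    """Stand-in: a str whose decode() warns, as described in this bug."""

    def decode(self, *args, **kwargs):
        warnings.warn("decode() called on unicode string", UnicodeWarning)
        return str(self)


def duck_typed(instream):
    # Anything with a decode attribute gets decoded, including CompatStr
    if getattr(instream, "decode", None) is not None:
        instream = instream.decode()
    return instream


def type_checked(instream):
    # Text is left alone; only non-str objects with decode() are decoded
    if not isinstance(instream, str) and getattr(instream, "decode", None) is not None:
        instream = instream.decode()
    return instream


def count_warnings(func, value):
    """Call func(value) and return how many warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(value)
    return len(caught)


stamp = CompatStr("2021-07-26")
```

The duck-typed version calls decode() on the stand-in and triggers the warning, while the type-checked version never touches it.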

