Bug 614767 - encoding problems
Summary: encoding problems
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: koji
Version: el6
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dennis Gilmore
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-07-15 08:17 UTC by Florian La Roche
Modified: 2017-02-21 16:16 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-21 16:16:38 UTC
Type: ---
Embargoed:



Description Florian La Roche 2010-07-15 08:17:33 UTC
Description of problem:

With koji-1.4 running on RHEL6-beta2, some email notifications
are not sent out correctly:

Traceback (most recent call last):
  File "/usr/sbin/kojid", line 1437, in runTask
    response = (handler.run(),)
  File "/usr/sbin/kojid", line 1513, in run
    return self.handler(*self.params,**self.opts)
  File "/usr/sbin/kojid", line 3649, in handler
    message = message.encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 546: ordinal not in range(128)

This seems to happen with, e.g., dejavu-fonts, ctags, crontabs,
cronie, amanda, and akonadi.
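
For context on why an encode() call raises a *decode* error: in Python 2, calling .encode() on a byte str first decodes it implicitly with the ASCII codec, and that hidden step is what fails on byte 0xe2. The sketch below emulates that behaviour in Python 3 so it can be run today; the helper name py2_style_encode and the sample message are made up for illustration and are not part of koji.

```python
def py2_style_encode(raw, encoding="utf-8"):
    """Emulate Python 2's str.encode(): implicit ASCII decode, then encode."""
    text = raw.decode("ascii")  # Python 2 performed this step implicitly
    return text.encode(encoding)


# Hypothetical message body that is already UTF-8 bytes and contains a
# right single quotation mark (U+2019, encoded as e2 80 99).
message = "Florian\u2019s build".encode("utf-8")

error = None
try:
    py2_style_encode(message)
except UnicodeDecodeError as exc:
    error = exc  # keep a reference; 'exc' is cleared after the except block
    print(exc)
    # prints: 'ascii' codec can't decode byte 0xe2 in position 7: ordinal not in range(128)

# Pure ASCII input passes through unchanged, which is why only some
# notifications are affected.
print(py2_style_encode(b"plain ascii"))
```

This is consistent with the failure showing up only for packages whose notification text contains non-ASCII bytes.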

The following change seems to work (tested with ctags and crontabs):

--- builder/kojid
+++ builder/kojid
@@ -3455,7 +3455,7 @@ Status: %(status)s\r

         message = self.message_templ % locals()
         # ensure message is in UTF-8
-        message = message.encode('utf-8')
+        message = koji.fixEncoding(message)

         server = smtplib.SMTP(options.smtphost)
         #server.set_debuglevel(True)
@@ -3646,7 +3646,7 @@ Build Info: %(weburl)s/buildinfo?buildID
         subject = self.subject_templ % locals()
         message = self.message_templ % locals()
         # ensure message is in UTF-8
-        message = message.encode('utf-8')
+        message = koji.fixEncoding(message)

         server = smtplib.SMTP(options.smtphost)
         # server.set_debuglevel(True)




Comment 1 Florian La Roche 2010-08-31 08:00:32 UTC
Another fix suggestion for this problem from Toshio:
http://lists.fedoraproject.org/pipermail/buildsys/2010-August/003223.html

regards,

Florian La Roche

Comment 2 Toshio Ernie Kuratomi 2010-08-31 16:14:19 UTC
As Florian suggests, using fixEncoding() for that section of code and changing how fixEncoding works is probably the best outcome.  Here's my revised fixEncoding()::

import warnings
def fixEncoding(value, from_encoding=None, fallback=None):
    # fallback is used for backwards compatibility
    if not from_encoding:
        if fallback:
            warnings.warn('fixEncoding() no longer takes a fallback'
                ' keyword arg.  Use from_encoding instead.',
                DeprecationWarning, stacklevel=2)
            from_encoding = fallback
        else:
            from_encoding = 'utf8'

    if isinstance(value, unicode):
        # value is already unicode, so just convert it
        # to a utf8-encoded str
        # Note: with python3, this can fail unless you use an error
        # argument because a unicode string could have been created using
        # the surrogateescape error handler.
        return value.encode('utf8', 'replace')
    else:
        # value is a str but may not be valid utf8 (encoded in latin1, for
        # instance).  Note that the string is almost certain to be mangled
        # in these instances unless you know what encoding the string is in
        # and have set from_encoding to that encoding.
        return value.decode(from_encoding, 'replace').encode('utf8', 'replace')
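
For anyone reading this on Python 3, where the `unicode` type no longer exists, the function above could be adapted roughly as follows. This is only a sketch under that assumption: the name fix_encoding is hypothetical, the deprecated fallback argument is dropped for brevity, and it is not what koji ships.

```python
def fix_encoding(value, from_encoding="utf8"):
    """Return value as UTF-8 encoded bytes, never raising UnicodeError."""
    if isinstance(value, str):
        # Already text: encode it. 'replace' guards against lone surrogates
        # that a strict encode would reject.
        return value.encode("utf8", "replace")
    # Bytes of uncertain encoding: decode leniently, then re-encode.
    # Undecodable bytes become U+FFFD (REPLACEMENT CHARACTER).
    return value.decode(from_encoding, "replace").encode("utf8", "replace")


print(fix_encoding("caf\u00e9"))           # text input
print(fix_encoding(b"caf\xc3\xa9"))        # valid UTF-8 bytes pass through
print(fix_encoding(b"caf\xe9"))            # latin1 byte, wrongly assumed UTF-8
print(fix_encoding(b"caf\xe9", "latin1"))  # same bytes with the right encoding
print(fix_encoding("bad\udce9"))           # str carrying a lone surrogate
print(fix_encoding(""), fix_encoding(b""))  # empty input takes the same path
```

The third and fourth calls show the trade-off described in the comments of the original function: without the correct from_encoding the character is lost to a replacement marker, but no exception is raised.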

Note that there are three separate issues of varying severity:

1) UnicodeError known to be thrown for certain notifications.  Florian's fix in this bug's description will solve that.

2) Potential for UnicodeError to be thrown with empty strings.  Removing ``if not value: return value`` from fixEncoding() will fix that.

3) Cosmetically and for debuggability, using replacement characters instead of characters from a hardcoded charset is better.  My fixEncoding() will fix that.
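
To illustrate point 3, here is a small Python 3 sketch comparing the two strategies on bytes whose real encoding is unknown to the code. The sample text and the choice of KOI8-R and ISO-8859-15 are assumptions picked for the demonstration, not taken from koji.

```python
# Hypothetical input: Cyrillic text encoded as KOI8-R, arriving with no
# record of which encoding was used.
original = "\u041f\u0440\u0438\u0432\u0435\u0442"  # "Привет"
raw = original.encode("koi8_r")

# Guessing a hardcoded single-byte charset never fails, but it silently
# produces plausible-looking wrong letters.
guessed = raw.decode("iso8859_15")

# Decoding as UTF-8 with 'replace' marks each undecodable byte with U+FFFD,
# so the damage is obvious to whoever reads the notification.
replaced = raw.decode("utf8", "replace")

print(repr(guessed))
print(repr(replaced))
```

Neither result recovers the original text, but the replacement characters make it clear that the encoding was unknown, whereas the hardcoded guess looks like valid but meaningless data.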

Comment 4 Dennis Gilmore 2017-02-21 16:16:38 UTC
Closing this bug. If issues still persist, please reopen it.

