Bug 887685 - [abrt] setroubleshoot-server-3.1.19-3.fc18: catchall_boolean.py:60:get_if_text:UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: unexpected end of data
Summary: [abrt] setroubleshoot-server-3.1.19-3.fc18: catchall_boolean.py:60:get_if_tex...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: setroubleshoot
Version: 18
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Daniel Walsh
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:bc863cca750158792cbc9bba17a...
Depends On:
Blocks:
Reported: 2012-12-17 02:15 UTC by kitmaxter
Modified: 2013-11-20 14:17 UTC
CC List: 11 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-11-20 14:17:56 UTC
Type: ---
Embargoed:


Attachments
File: backtrace (1.68 KB, text/plain)
2012-12-17 02:15 UTC, kitmaxter
File: core_backtrace (496 bytes, text/plain)
2012-12-17 02:15 UTC, kitmaxter
File: dso_list (79 bytes, text/plain)
2012-12-17 02:15 UTC, kitmaxter
File: environ (588 bytes, text/plain)
2012-12-17 02:15 UTC, kitmaxter
I think this will fix the problem (443 bytes, patch)
2013-01-07 20:59 UTC, Daniel Walsh

Description kitmaxter 2012-12-17 02:15:00 UTC
Version-Release number of selected component:
setroubleshoot-server-3.1.19-3.fc18

Additional info:
cmdline:        /usr/bin/python -Es /usr/bin/sealert -s
executable:     /usr/bin/sealert
kernel:         3.6.10-4.fc18.x86_64
uid:            1000

Truncated backtrace:
catchall_boolean.py:60:get_if_text:UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: unexpected end of data

Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/setroubleshoot/browser.py", line 678, in on_delete_button_clicked
    self.show_current_alert()
  File "/usr/lib64/python2.7/site-packages/setroubleshoot/browser.py", line 764, in show_current_alert
    rb = self.add_row(p, alert, args)
  File "/usr/lib64/python2.7/site-packages/setroubleshoot/browser.py", line 406, in add_row
    if_text = _("If ") + alert.substitute(plugin.get_if_text(avc, args))
  File "/usr/share/setroubleshoot/plugins/catchall_boolean.py", line 60, in get_if_text
    return _("you want to %s") % txt[0].lower() + txt[1:]
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: unexpected end of data

Local variables in innermost frame:
self: <plugins.catchall_boolean.plugin object at 0x790fa90>
txt: '\xd0\x9a\xd0\xb5\xd1\x80\xd1\x83\xd0\xb2\xd0\xb0\xd1\x82\xd0\xb8 \xd0\xbc\xd0\xbe\xd0\xb6\xd0\xbb\xd0\xb8\xd0\xb2\xd1\x96\xd1\x81\xd1\x82\xd1\x8e \xd0\xb2\xd0\xb8\xd0\xba\xd0\xbe\xd1\x80\xd0\xb8\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbd\xd1\x8f mmap \xd1\x83 \xd0\xbd\xd0\xb8\xd0\xb6\xd0\xbd\xd1\x96\xd1\x85 \xd0\xbe\xd0\xb1\xd0\xbb\xd0\xb0\xd1\x81\xd1\x82\xd1\x8f\xd1\x85 \xd0\xbf\xd1\x80\xd0\xbe\xd1\x81\xd1\x82\xd0\xbe\xd1\x80\xd1\x83 \xd0\xb0\xd0\xb4\xd1\x80\xd0\xb5\xd1\x81 \xd1\x83 \xd1\x81\xd0\xbf\xd0\xbe\xd1\x81\xd1\x96\xd0\xb1, \xd0\xb2\xd0\xb8\xd0\xb7\xd0\xbd\xd0\xb0\xd1\x87\xd0\xb5\xd0\xbd\xd0\xb8\xd0\xb9 /proc/sys/kernel/mmap_min_addr.'
args: ('mmap_low_allowed', '1', 'unconfined_selinux')
avc: [<setroubleshoot.audit_data.AuditRecord object at 0x791d6d0>]

Comment 1 kitmaxter 2012-12-17 02:15:03 UTC
Created attachment 664616 [details]
File: backtrace

Comment 2 kitmaxter 2012-12-17 02:15:05 UTC
Created attachment 664617 [details]
File: core_backtrace

Comment 3 kitmaxter 2012-12-17 02:15:07 UTC
Created attachment 664618 [details]
File: dso_list

Comment 4 kitmaxter 2012-12-17 02:15:10 UTC
Created attachment 664619 [details]
File: environ

Comment 5 kitmaxter 2012-12-17 02:18:42 UTC
I was looking through sealert entries when the exception was thrown.

Comment 6 Daniel Walsh 2012-12-17 19:47:07 UTC
What language were you using?

Comment 7 Eugene 2013-01-05 05:11:59 UTC
This happens every time on boot.

Package: setroubleshoot-server-3.1.19-3.fc18
OS Release: Fedora release 18 (Spherical Cow)

Comment 8 kitmaxter 2013-01-05 11:53:37 UTC
(In reply to comment #6)
> What language were you using?

Sorry, missed your reply.

I'm using the Ukrainian language.

$ echo $LANG
uk_UA.utf8

Comment 9 Eugene 2013-01-05 16:28:08 UTC
I'm using Russian and Suomi (Finnish).

Comment 10 Daniel Walsh 2013-01-07 20:59:14 UTC
Created attachment 674340 [details]
I think this will fix the problem

Could someone apply this patch to catchall_boolean.py and see if it fixes the problem?

# cd /usr/share/setroubleshoot/plugins
# patch < /tmp/unicode.patch
# python catchall_boolean.py

Then run sealert.

Comment 11 Dave Malcolm 2013-01-07 21:03:36 UTC
Dan: there are two issues here:
File "/usr/share/setroubleshoot/plugins/catchall_boolean.py", line 60, in get_if_text
    return _("you want to %s") % txt[0].lower() + txt[1:]

Issue 1:
I believe you're missing some parentheses, and that you meant:

    return _("you want to %s") % (txt[0].lower() + txt[1:])

whereas operator precedence (% binds more tightly than +) means that Python interprets the original line as:

    return (_("you want to %s") % txt[0].lower()) + txt[1:]
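
To make the precedence point concrete, here is a minimal sketch written for Python 3 (the plugin itself runs under Python 2.7). The template and text are hypothetical stand-ins, not the plugin's real strings. Note that with the English template, where %s is the last thing in the string, both groupings happen to give the same result; the difference only shows up when a template puts the placeholder somewhere else, as a translation could.

```python
# Python 3 sketch; 'template' and 'txt' are hypothetical stand-ins.
template = "if you want to %s, then"   # placeholder deliberately not at the end
txt = "Allow mmap"

# '%' binds more tightly than '+', so only the first character is formatted
# and the remainder of txt is appended after the whole template.
unparenthesized = template % txt[0].lower() + txt[1:]
explicit_grouping = (template % txt[0].lower()) + txt[1:]

# With parentheses, the whole lowercased-first-letter text is formatted.
intended = template % (txt[0].lower() + txt[1:])

print(unparenthesized)
print(intended)
```

The first print shows the tail of the text stranded after the template, while the second shows it substituted where the placeholder is.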

Issue 2:

FWIW, you're relying on a side effect of importing GTK, which changes the default encoding:

>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> import gtk
>>> sys.getdefaultencoding()
'utf-8'

>>> txt = '\xd0\x9a\xd0\xb5\xd1\x80\xd1\x83\xd0\xb2\xd0\xb0\xd1\x82\xd0\xb8 \xd0\xbc\xd0\xbe\xd0\xb6\xd0\xbb\xd0\xb8\xd0\xb2\xd1\x96\xd1\x81\xd1\x82\xd1\x8e \xd0\xb2\xd0\xb8\xd0\xba\xd0\xbe\xd1\x80\xd0\xb8\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbd\xd1\x8f mmap \xd1\x83 \xd0\xbd\xd0\xb8\xd0\xb6\xd0\xbd\xd1\x96\xd1\x85 \xd0\xbe\xd0\xb1\xd0\xbb\xd0\xb0\xd1\x81\xd1\x82\xd1\x8f\xd1\x85 \xd0\xbf\xd1\x80\xd0\xbe\xd1\x81\xd1\x82\xd0\xbe\xd1\x80\xd1\x83 \xd0\xb0\xd0\xb4\xd1\x80\xd0\xb5\xd1\x81 \xd1\x83 \xd1\x81\xd0\xbf\xd0\xbe\xd1\x81\xd1\x96\xd0\xb1, \xd0\xb2\xd0\xb8\xd0\xb7\xd0\xbd\xd0\xb0\xd1\x87\xd0\xb5\xd0\xbd\xd0\xb8\xd0\xb9 /proc/sys/kernel/mmap_min_addr.'
>>> print(txt)
Керувати можливістю використання mmap у нижніх областях простору адрес у спосіб, визначений /proc/sys/kernel/mmap_min_addr.

>>> print(u'foo: %s' % txt[0].lower())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: unexpected end of data

Given that the format string is unicode, the above code implicitly attempts to convert the %s argument to unicode, using the system default encoding.

The issue is that txt is a string of bytes in UTF-8 encoding, and txt[0] accesses the first *byte*; 0xd0 is only the first byte of the two-byte encoding of the Unicode code point U+041A. A string consisting of the single byte 0xd0 is therefore malformed UTF-8, and Python complains.
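
The byte-versus-code-point distinction can be reproduced outside the plugin. The following is a sketch for Python 3, where a bytes object plays the role that str plays in Python 2 and decoding has to be requested explicitly; the shortened text is a stand-in for the full translated message.

```python
# Python 3 sketch: bytes here correspond to a Python 2 byte string.
txt = "Керувати".encode("utf-8")   # shortened stand-in for the translated text

first_byte = txt[:1]         # only the lead byte of a two-byte sequence
first_char_bytes = txt[:2]   # the complete encoding of the first character

try:
    first_byte.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as exc:
    # A lone lead byte is truncated UTF-8, so decoding is rejected.
    decode_failed = True
    print(exc)

first_char = first_char_bytes.decode("utf-8")
print(hex(ord(first_char)))
```

Slicing one byte off the front cuts the first character in half, which is exactly the state the implicit conversion in the traceback was asked to decode.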

To do it correctly, you need to work at the level of unicode code points, not bytes:
>>> unitxt = unicode(txt, encoding='utf8')
>>> print(unitxt)
Керувати можливістю використання mmap у нижніх областях простору адрес у спосіб, визначений /proc/sys/kernel/mmap_min_addr.
>>> unitxt[0]
u'\u041a'

So you want something like:

unitxt = unicode(txt, encoding='utf8')
return _("you want to %s") % (unitxt[0].lower() + unitxt[1:])

so that it's working on Unicode code points rather than on individual bytes, and so that the precedence is definitely correct.
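
As a rough illustration of that approach, here is a sketch for Python 3, where bytes.decode() takes the place of Python 2's unicode(). The helper name lowercase_first and the plain-string template are assumptions made for the example; they are not taken from the plugin or from the attached patch.

```python
def lowercase_first(raw, encoding="utf-8"):
    """Hypothetical helper: return the text with its first character lowercased."""
    # Decode first, so that indexing works on code points rather than bytes.
    text = raw.decode(encoding) if isinstance(raw, bytes) else raw
    # Slicing with [:1] also copes with an empty string.
    return text[:1].lower() + text[1:]


template = "you want to %s"               # stands in for the _("...") lookup
raw = "Керувати mmap".encode("utf-8")     # shortened stand-in for the translation

# The helper returns one complete string, so '%' formats all of it.
message = template % lowercase_first(raw)
print(message)
```

Doing the decode in one place means the lowercase step and the formatting step both see text, and passing a single value to % sidesteps the precedence question entirely.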

Comment 12 Daniel Walsh 2013-01-07 21:41:02 UTC
Fixed in setroubleshoot-plugins-3.0.47-1.fc18

