Bug 922149 - platform.platform() can throw Unicode error
Summary: platform.platform() can throw Unicode error
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: python3
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Dave Malcolm
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 889784
 
Reported: 2013-03-15 15:13 UTC by Toshio Ernie Kuratomi
Modified: 2013-03-15 22:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-03-15 22:20:50 UTC
Type: Bug
Embargoed:


Attachments
Use surrogateescape when reading from /etc/fedora-release (524 bytes, patch)
2013-03-15 15:13 UTC, Toshio Ernie Kuratomi


Links
System ID Private Priority Status Summary Last Updated
Python 17429 0 None None None Never

Description Toshio Ernie Kuratomi 2013-03-15 15:13:14 UTC
Created attachment 710720
Use surrogateescape when reading from /etc/fedora-release

Tested on python-3.2 and python-3.3.  platform.platform() looks in /etc/ for a file that appears to contain the name of the Linux distribution that python3 is running on.  Once found, it reads the contents of that file to obtain the distribution name.  Most Linux distributions do create a file in /etc/ whose single line is the distribution name, so this is a good heuristic.  However, these files are created by the operating system vendor, so their encoding can differ from the encoding of the user's locale.  This means that if the file contains non-ASCII characters, user code that invokes platform.platform() may raise a UnicodeDecodeError.

Test:

$ echo ' Café' | sudo tee -a /etc/fedora-release
$ LC_ALL=C python3
>>> import platform
>>> platform.platform()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.2/platform.py", line 1538, in platform
    distname,distversion,distid = dist('')
  File "/usr/lib64/python3.2/platform.py", line 358, in dist
    full_distribution_name=0)
  File "/usr/lib64/python3.2/platform.py", line 329, in linux_distribution
    firstline = f.readline()
  File "/usr/lib64/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 22: ordinal not in range(128)

The standard fix we're promoting in Python 3 for this class of problem seems to be the surrogateescape error handler.  I'll provide a patch that does that.
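
For illustration, here is a minimal sketch of what reading with surrogateescape looks like. It is not the attached patch: read_first_line is a hypothetical helper, the file is a temporary stand-in for /etc/fedora-release, and the ASCII encoding is chosen only to mimic the LC_ALL=C locale from the test above.

```python
import os
import tempfile


def read_first_line(path, encoding="ascii"):
    """Hypothetical helper: read the first line of a file, mapping any
    undecodable byte to a lone surrogate instead of raising."""
    with open(path, "r", encoding=encoding, errors="surrogateescape") as f:
        return f.readline()


# Stand-in for /etc/fedora-release: a UTF-8 encoded line with a non-ASCII char.
raw = "Fedora release 19 (Rawhide) Café\n".encode("utf-8")
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    line = read_first_line(path)  # strict ASCII decoding would raise here
finally:
    os.remove(path)

# The two bytes of the UTF-8 "é" (0xc3 0xa9) come back as lone surrogates,
# and re-encoding with the same error handler restores the original bytes.
round_tripped = line.encode("ascii", errors="surrogateescape")
```

The decoded string is not pretty to display, but no exception is raised and the original bytes remain recoverable.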

Comment 1 Toshio Ernie Kuratomi 2013-03-15 22:20:50 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=5129382
http://koji.fedoraproject.org/koji/taskinfo?taskID=5129380

This is the implementation recommended by MvL upstream (default to UTF-8 everywhere and fall back to surrogateescape when that is not available).
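
As a rough sketch of that approach (my reading of the recommendation, not the code from the builds above): decode as UTF-8 and let surrogateescape absorb any bytes that are not valid UTF-8. decode_release_line is a hypothetical name used only for this example.

```python
def decode_release_line(raw: bytes) -> str:
    """Hypothetical helper: decode as UTF-8, escaping invalid bytes
    to lone surrogates rather than raising UnicodeDecodeError."""
    return raw.decode("utf-8", errors="surrogateescape")


# A vendor file that really is UTF-8 decodes to the expected text,
# regardless of the user's locale.
utf8_line = "Fedora release 19 Café".encode("utf-8")
decoded_utf8 = decode_release_line(utf8_line)

# A vendor file in some other encoding (Latin-1 here) still decodes
# without an exception; the stray byte 0xe9 becomes the surrogate U+DCE9.
latin1_line = "Fedora release 19 Café".encode("latin-1")
decoded_latin1 = decode_release_line(latin1_line)

# The original bytes can be recovered with the same error handler.
restored = decoded_latin1.encode("utf-8", errors="surrogateescape")
```

With this arrangement the common case (UTF-8 content) yields correct text, and the uncommon case degrades to escaped bytes instead of a traceback.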

