Bug 922149

Summary: platform.platform() can throw Unicode error
Product: [Fedora] Fedora
Reporter: Toshio Ernie Kuratomi <a.badger>
Component: python3
Assignee: Dave Malcolm <dmalcolm>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: amcnabb, bkabrda, dmalcolm, tomspur
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2013-03-15 22:20:50 UTC
Type: Bug
Bug Blocks: 889784    
Attachments:
Use surrogateescape when reading from /etc/fedora-release (flags: none)

Description Toshio Ernie Kuratomi 2013-03-15 15:13:14 UTC
Created attachment 710720
Use surrogateescape when reading from /etc/fedora-release

Tested on python-3.2 and python-3.3. platform.platform() looks in /etc/ for a file that appears to contain the name of the Linux distribution python3 is running on. Once found, it reads the file's contents to get the distribution name. Most Linux distributions do create a file in /etc/ whose single line is the distribution name, so this is a good heuristic. However, these files are created by the operating system vendor, so their encoding can differ from the encoding of the user's locale. This means that if the file contains non-ASCII characters, user code that calls platform.platform() may fail with a UnicodeDecodeError traceback.

Test:

$ LC_ALL=en_US.utf8 echo ' Café' | sudo tee -a /etc/fedora-release > /dev/null
$ LC_ALL=C python3
>>> import platform
>>> platform.platform()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.2/platform.py", line 1538, in platform
    distname,distversion,distid = dist('')
  File "/usr/lib64/python3.2/platform.py", line 358, in dist
    full_distribution_name=0)
  File "/usr/lib64/python3.2/platform.py", line 329, in linux_distribution
    firstline = f.readline()
  File "/usr/lib64/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 22: ordinal not in range(128)
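
For anyone who wants to see the same failure without modifying /etc/fedora-release, here is a minimal sketch. It assumes that opening a temporary file with encoding="ascii" is a fair stand-in for what open() picks under LC_ALL=C on the affected versions. The file contents and variable names are made up for illustration and are not taken from platform.py.

```python
import os
import tempfile

# Illustrative stand-in for /etc/fedora-release, stored as UTF-8 bytes.
data = "Fedora release 18 (Spherical Cow) Café\n".encode("utf-8")

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

error = None
try:
    # Assumption: encoding="ascii" mimics the locale encoding under LC_ALL=C.
    with open(path, "r", encoding="ascii") as f:
        f.readline()
except UnicodeDecodeError as exc:
    error = exc
finally:
    os.remove(path)

print(type(error).__name__, error)
```

The read fails on the first byte of the UTF-8 encoded "é", which matches the 0xc3 byte named in the traceback above.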

The standard fix we're promoting in python3 for this kind of problem seems to be the surrogateescape error handler.  I'll provide a patch that does that.
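
As a rough illustration of what surrogateescape does (this is not the attached patch), the sketch below decodes UTF-8 bytes as ASCII with that error handler. Each undecodable byte is mapped to a lone surrogate code point instead of raising, and encoding with the same handler restores the original bytes.

```python
# UTF-8 bytes for "Café": b'Caf\xc3\xa9'
raw = "Café".encode("utf-8")

# Decoding as ASCII would normally raise; surrogateescape maps each
# undecodable byte 0xNN to the lone surrogate U+DCNN instead.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into the bytes.
round_tripped = text.encode("ascii", errors="surrogateescape")

print(ascii(text))
print(round_tripped == raw)
```

The decoded string is not pretty to display, but no data is lost and no exception is raised.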

Comment 1 Toshio Ernie Kuratomi 2013-03-15 22:20:50 UTC
http://koji.fedoraproject.org/koji/taskinfo?taskID=5129382
http://koji.fedoraproject.org/koji/taskinfo?taskID=5129380

This is the implementation recommended by MvL upstream (default to UTF-8 everywhere and fall back to surrogateescape when that is not available).
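
A minimal sketch of that approach, assuming it amounts to opening the release file as UTF-8 with the surrogateescape error handler. The helper name read_first_line and the sample files are hypothetical and are not taken from the patch or from platform.py.

```python
import os
import tempfile


def read_first_line(path):
    """Hypothetical helper: read one line as UTF-8, never raising on bad bytes."""
    with open(path, "r", encoding="utf-8", errors="surrogateescape") as f:
        return f.readline()


def write_temp(data):
    """Write bytes to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        return tmp.name


utf8_path = write_temp("Café\n".encode("utf-8"))
latin1_path = write_temp("Café\n".encode("latin-1"))
try:
    utf8_line = read_first_line(utf8_path)      # valid UTF-8 decodes normally
    latin1_line = read_first_line(latin1_path)  # stray byte 0xe9 is escaped
finally:
    os.remove(utf8_path)
    os.remove(latin1_path)

print(ascii(utf8_line), ascii(latin1_line))
```

With this combination a UTF-8 file is decoded correctly regardless of the user's locale, and a file in some other encoding still yields a string rather than a traceback.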