Bug 1917613

Summary: glibc: When LANG is set to es_ES@euro ellipsis can't be encoded (ISO-8859-15 vs. UTF-8)
Product: Red Hat Enterprise Linux 8 Reporter: Ryan Blakley <rblakley>
Component: glibc Assignee: glibc team <glibc-bugzilla>
Status: CLOSED NOTABUG QA Contact: qe-baseos-tools-bugs
Severity: medium Docs Contact:
Priority: medium    
Version: 8.3 CC: ashankar, codonell, dj, fweimer, mnewsome, pfrankli, sipoyare
Target Milestone: rc   
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-01-19 01:56:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ryan Blakley 2021-01-18 23:07:10 UTC
Description of problem: When LANG is set to es_ES@euro, python3 cannot encode an ellipsis (U+2026) and raises "UnicodeEncodeError: 'charmap' codec can't encode character '\u2026' in position X: character maps to <undefined>".


Version-Release number of selected component (if applicable): glibc-langpack-es-2.28-127.el8.x86_64. I also reproduced the issue on the latest Fedora version, with glibc-langpack-es-2.32-3.fc33.x86_64.


Steps to Reproduce:
1. Install glibc-langpack-es.
2. # export LANG=es_ES@euro
3. # cat > /tmp/test.py <<EOF
#!/usr/bin/python3
import sys
sys.stdout.write("Test…")
EOF
4. # python3 /tmp/test.py


Actual results:
root@ryan-rhel8 ~ # python3 /tmp/test.py 
Traceback (most recent call last):
  File "/tmp/test.py", line 3, in <module>
    sys.stdout.write("Test\u2026")
  File "/usr/lib64/python3.6/encodings/iso8859_15.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2026' in position 4: character maps to <undefined>


Expected results:
root@ryan-rhel8 ~ # python3 /tmp/test.py 
Test…


Additional info: I'm not sure whether this belongs in glibc or python3, but the issue doesn't occur when LANG is set to en_US. I also reproduced this with the latest packages on Fedora 33.
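
To help tell whether the behaviour comes from the locale or from Python, the sketch below prints the encodings Python derives from the current environment. The helper name describe_encoding is hypothetical. Under LANG=es_ES@euro it would be expected to name ISO-8859-15, which matches the iso8859_15.py frame in the traceback above, but the exact strings depend on the platform and Python version.

```python
import locale
import sys


def describe_encoding():
    """Report the encodings Python picked up from the environment (hypothetical helper)."""
    return {
        # Encoding implied by the current locale settings (LANG / LC_*).
        "preferred": locale.getpreferredencoding(False),
        # Encoding actually attached to the stdout text stream, if any.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for key, value in describe_encoding().items():
    print(f"{key}: {value}")
```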

Comment 1 Carlos O'Donell 2021-01-19 01:56:16 UTC
The es_ES@euro locale is an ISO-8859-15-based locale and, as such, has no representation for the Unicode character U+2026 Horizontal Ellipsis.

If you want to be able to use U+2026 Horizontal Ellipsis, I suggest using the es_ES.UTF-8 locale, which includes the euro symbol and all the other Unicode characters (up to Unicode 11.0) supported with UTF-8.
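
The difference between the two charsets can be checked without changing the locale at all, using Python's own codecs. This is a minimal sketch; try_encode is a hypothetical helper, and it exercises only the codec tables, not glibc's locale data.

```python
EURO = "\u20ac"      # EURO SIGN
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec has no mapping for the text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


for codec in ("iso8859_15", "utf-8"):
    for name, char in (("euro", EURO), ("ellipsis", ELLIPSIS)):
        print(codec, name, try_encode(char, codec))
```

ISO-8859-15 maps the euro sign to the single byte 0xA4 but has no slot for the ellipsis, so only the UTF-8 codec returns bytes for both characters.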

I'm marking this as CLOSED / NOTABUG.

If you have any more questions please don't hesitate to reopen the issue.

If sosreport requires a UTF-8 locale, then it should set the locale to C.UTF-8 (a UTF-8 locale that is always provided), but that is a distinct issue in the design of sosreport.
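
As a rough illustration of that suggestion, a tool could force a UTF-8 locale for the processes it launches instead of inheriting the user's LANG. The sketch below is not sosreport's actual code; utf8_environment is a hypothetical helper, and it assumes C.UTF-8 is available on the system, as stated above.

```python
import os


def utf8_environment(base=None):
    """Return a copy of an environment with a UTF-8 locale forced (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    # LC_ALL takes precedence over LANG and the individual LC_* variables.
    env["LC_ALL"] = "C.UTF-8"
    return env


# Example: the user's LANG is left in place, but LC_ALL overrides it.
user_env = {"LANG": "es_ES@euro"}
print(utf8_environment(user_env))
```

The returned mapping could then be passed as the env= argument of subprocess.run so that a child process encodes its output as UTF-8 regardless of the caller's locale.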