Bug 1313818 - Have setlocale fall back to C.UTF-8 before C/POSIX.
Status: NEW
Product: Fedora
Classification: Fedora
Component: glibc
Version: 26
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Assigned To: glibc team
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-03-02 07:07 EST by Carlos O'Donell
Modified: 2017-02-28 04:55 EST

Doc Type: Bug Fix
Type: Bug


Description Carlos O'Donell 2016-03-02 07:07:15 EST
In Fedora we guarantee a C.UTF-8 locale.

To support more applications that require C.UTF-8 and may be installed on systems with few or no locales, we must support the case where an application calls setlocale (LC_ALL, ""); and doesn't check the return value to make sure it has a UTF-8 capable locale.
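
For reference, a minimal sketch of the check such applications skip. The function names are hypothetical, and the comparison against the exact string "UTF-8" assumes glibc's spelling of the codeset name.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Returns 1 if the codeset of the currently active locale is UTF-8,
   else 0.  Assumes the codeset is reported as exactly "UTF-8". */
int current_codeset_is_utf8(void)
{
    const char *codeset = nl_langinfo(CODESET);
    return codeset != NULL && strcmp(codeset, "UTF-8") == 0;
}

/* Hypothetical helper: initialise the locale from the environment and
   report whether the result is UTF-8 capable.  Returns 0 if setlocale
   failed or the resulting codeset is not UTF-8. */
int init_locale_checked(void)
{
    if (setlocale(LC_ALL, "") == NULL) {
        /* The requested locale could not be loaded; the locale in
           effect before the call is left unchanged. */
        return 0;
    }
    return current_codeset_is_utf8();
}
```

An application that requires UTF-8 would call init_locale_checked() at startup and handle a 0 result explicitly instead of continuing in the C locale.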

Steps required:
* Review the differences between C.UTF-8 and C/POSIX
* Change the setlocale code to try loading C.UTF-8 before falling back to the builtin C/POSIX locale.
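
The proposed order can be emulated at the application level roughly as follows. This is only a sketch of the behaviour, not the glibc-internal change; the function name is hypothetical, and it falls back unconditionally without looking at which codeset the environment asked for.

```c
#include <locale.h>
#include <stddef.h>

/* Application-level emulation of the proposed order:
   environment locale first, then C.UTF-8, then C. */
const char *setlocale_with_utf8_fallback(void)
{
    const char *result = setlocale(LC_ALL, "");
    if (result != NULL)
        return result;

    /* The environment's locale could not be loaded; try C.UTF-8. */
    result = setlocale(LC_ALL, "C.UTF-8");
    if (result != NULL)
        return result;

    /* Last resort: the builtin C locale. */
    return setlocale(LC_ALL, "C");
}
```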

This would fix several of the failure modes we saw during the glibc language pack split up.
Comment 1 Jakub Jelinek 2016-03-02 07:13:54 EST
But do you really want to do that unconditionally?  What if the LC_ALL env var (or others) requests some 8-bit or other non-UTF-8 locale instead?  Getting C.UTF-8 instead of C would certainly be surprising and undesirable.
Comment 2 Florian Weimer 2016-03-02 09:52:29 EST
I think we should have a small table of well-known UTF-8 locale names, and use C.UTF-8 only if the locale is known as a UTF-8 locale.  This would help secondary locale implementations which implement their own charset conversion, but rely on nl_langinfo (CODESET) to get the current charset, too.
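
A table-based check along these lines might look like the sketch below. The entries are illustrative only and are not a list taken from glibc; the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative entries only; this is not a list taken from glibc. */
static const char *const known_utf8_locales[] = {
    "C.UTF-8",
    "de_DE.UTF-8",
    "en_US.UTF-8",
    "ja_JP.UTF-8",
};

/* Returns 1 if name matches a table entry exactly, else 0. */
int is_known_utf8_locale(const char *name)
{
    if (name == NULL)
        return 0;

    size_t count = sizeof known_utf8_locales / sizeof known_utf8_locales[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, known_utf8_locales[i]) == 0)
            return 1;
    }
    return 0;
}
```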
Comment 3 Jakub Jelinek 2016-03-02 09:56:14 EST
We already parse the names of the locales, canonicalizing UTF-8 vs. utf-8 etc., don't we?  Thus perhaps we could just recognize that and handle the *.UTF-8/*.utf-8 (perhaps with suffixes) locales that way; not sure if we have any UTF-8 locales without that suffix, those would need to be special cased.
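
As a rough illustration of recognizing the codeset suffix, the sketch below extracts the part between '.' and an optional '@modifier', drops non-alphanumeric characters, lowercases the rest, and compares it with "utf8". The normalization is a simplified assumption for this example, not glibc's own parsing code, and the function name is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the codeset part of a locale name such as
   "en_US.UTF-8" or "de_DE.utf8@euro" normalises to "utf8", else 0. */
int locale_name_requests_utf8(const char *name)
{
    if (name == NULL)
        return 0;

    const char *dot = strchr(name, '.');
    if (dot == NULL)
        return 0;               /* no explicit codeset in the name */

    char normalized[16];
    size_t len = 0;
    for (const char *p = dot + 1; *p != '\0' && *p != '@'; p++) {
        if (!isalnum((unsigned char)*p))
            continue;           /* drop '-', '_' and similar */
        if (len + 1 >= sizeof normalized)
            return 0;           /* too long to be "utf8" */
        normalized[len++] = (char)tolower((unsigned char)*p);
    }
    normalized[len] = '\0';

    return strcmp(normalized, "utf8") == 0;
}
```

Names without a codeset part are reported as not UTF-8 here, so any UTF-8 locale lacking the suffix would still need to be special-cased.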
Comment 4 Carlos O'Donell 2016-03-02 11:22:12 EST
(In reply to Jakub Jelinek from comment #3)
> We already parse the names of the locales, canonicalizing UTF-8 vs. utf-8
> etc., don't we?  Thus perhaps we could just recognize that and handle the
> *.UTF-8/*.utf-8 (perhaps with suffixes) locales that way; not sure if we
> have any UTF-8 locales without that suffix, those would need to be special
> cased.

This is a great idea. My initial idea was simply a strawman proposal to start the discussion of what we should and should not do.

We could and should certainly start with something like this since the motivating use-case is likely *_*.UTF-8 locales that don't exist and then cause gnome-terminal to fail to start (gnome-terminal won't start without a UTF-8 locale, see bug 1312960, and this is intended behaviour).
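
Put together, the conditional behaviour could be emulated outside glibc roughly as below: fall back to C.UTF-8 only when the name requested through the environment carries a UTF-8 codeset, otherwise go straight to C. All function names are hypothetical, the codeset check is deliberately simplified, and it only looks at LC_ALL, LC_CTYPE and LANG (in that order) to find the requested name.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Name the environment requests for LC_CTYPE, using the precedence
   LC_ALL, then LC_CTYPE, then LANG.  NULL if none is set. */
const char *requested_ctype_name(void)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    for (size_t i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *value = getenv(vars[i]);
        if (value != NULL && value[0] != '\0')
            return value;
    }
    return NULL;
}

/* Simplified check: codeset spelled "UTF-8" or "utf8" in any letter
   case, optionally followed by an "@modifier". */
int name_has_utf8_codeset(const char *name)
{
    const char *dot = name != NULL ? strchr(name, '.') : NULL;
    if (dot == NULL)
        return 0;
    size_t len = strcspn(dot + 1, "@");
    return (len == 5 && strncasecmp(dot + 1, "UTF-8", 5) == 0)
        || (len == 4 && strncasecmp(dot + 1, "utf8", 4) == 0);
}

/* Try the environment's locale; on failure use C.UTF-8 only if a
   UTF-8 locale was requested, and the builtin C locale otherwise. */
const char *setlocale_conditional_fallback(void)
{
    const char *result = setlocale(LC_ALL, "");
    if (result != NULL)
        return result;

    if (name_has_utf8_codeset(requested_ctype_name())) {
        result = setlocale(LC_ALL, "C.UTF-8");
        if (result != NULL)
            return result;
    }
    return setlocale(LC_ALL, "C");
}
```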
Comment 5 Jan Kurik 2016-07-26 00:09:36 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle.
Changing version to '25'.
Comment 6 Fedora End Of Life 2017-02-28 04:55:26 EST
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.
