In Fedora we guarantee a C.UTF-8 locale.
To support more applications that require C.UTF-8 and may be installed on systems with few or no locales, we must handle the case where the application calls setlocale (LC_ALL, "") and doesn't check the return value to make sure it has a UTF-8 capable locale.
* Review the differences between C.UTF-8 and C/POSIX
* Change the setlocale code to try loading C.UTF-8 before falling back to the builtin C/POSIX locale.
This would fix several of the failure modes we saw during the glibc language pack split up.
But do you really want to do that unconditionally? What if the LC_ALL env var (or others) instead requests some 8-bit or other non-UTF-8 locale? Getting C.UTF-8 instead of C would certainly be surprising and undesirable.
I think we should have a small table of well-known UTF-8 locale names, and use C.UTF-8 only if the requested locale is known to be a UTF-8 locale. This would also help secondary locale implementations which implement their own charset conversion but rely on nl_langinfo (CODESET) to get the current charset.
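For reference, this is roughly how such a consumer queries the charset. It is only a sketch: current_codeset_is_utf8 is a made-up name for illustration, and the exact codeset string reported for the plain C locale depends on the libc (glibc reports an ASCII name, not "UTF-8").

```c
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: report whether the locale
   currently in effect advertises UTF-8 as its character set.  */
static bool
current_codeset_is_utf8 (void)
{
  /* nl_langinfo (CODESET) returns the charset name of the LC_CTYPE
     category that is active at the time of the call.  */
  return strcmp (nl_langinfo (CODESET), "UTF-8") == 0;
}
```

If setlocale silently ended up in C instead of the requested UTF-8 locale, a check like this is where the application would notice.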
We already parse the names of the locales, canonicalizing UTF-8 vs. utf-8 etc., don't we? Thus perhaps we could just recognize that and handle the *.UTF-8/*.utf-8 (perhaps with suffixes) locales that way; not sure if we have any UTF-8 locales without that suffix, those would need to be special cased.
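A rough sketch of the name check, assuming locale names of the form language[_territory][.codeset][@modifier]. This is not the glibc parsing code; locale_name_is_utf8 is a hypothetical helper, and it only looks at the spelling of the codeset part, so UTF-8 locales without a codeset suffix would still need special casing.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not glibc code: return true if the codeset part
   of a locale name such as "en_US.UTF-8" or "de_DE.utf8@euro" is some
   spelling of UTF-8.  */
static bool
locale_name_is_utf8 (const char *name)
{
  const char *dot = strchr (name, '.');
  if (dot == NULL)
    return false;               /* No codeset given, e.g. "C" or "en_US".  */

  const char *codeset = dot + 1;
  /* The codeset ends at an optional "@modifier" or at end of string.  */
  size_t len = strcspn (codeset, "@");

  /* Accept "UTF-8" and "utf8" style spellings, ignoring case.  */
  return (len == 5 && strncasecmp (codeset, "UTF-8", 5) == 0)
         || (len == 4 && strncasecmp (codeset, "UTF8", 4) == 0);
}
```

With something like this, only a request for a missing *.UTF-8 locale would be a candidate for the C.UTF-8 fallback, and a request for a missing 8-bit locale would still fail as it does today.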
(In reply to Jakub Jelinek from comment #3)
> We already parse the names of the locales, canonicalizing UTF-8 vs. utf-8
> etc., don't we? Thus perhaps we could just recognize that and handle the
> *.UTF-8/*.utf-8 (perhaps with suffixes) locales that way; not sure if we
> have any UTF-8 locales without that suffix, those would need to be special
This is a great idea. My initial idea was simply a strawman proposal to start the discussion of what we should and should not do.
We could and should certainly start with something like this since the motivating use-case is likely *_*.UTF-8 locales that don't exist and then cause gnome-terminal to fail to start (gnome-terminal won't start without a UTF-8 locale, see bug 1312960, and this is intended behaviour).
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle.
Changing version to '25'.
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The change needs to happen upstream first and will need a built-in C.UTF-8 locale.
(In reply to Carlos O'Donell from comment #4)
> UTF-8 locale, see bug 1312960, and this is intended behaviour).
Should be bug 1312690 (noted in see also).
This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.
We want to get C.UTF-8 upstream so this depends on upstream sourceware bug:
We are going to close this bug and keep tracking the upstream bug here:
Florian and I have discussed this internally, and we don't want setlocale to ever hide a failure. The most likely scenario is that applications will need to add code to handle an initial setlocale failure and then attempt a second setlocale with C.UTF-8. That second setlocale should always succeed, either because upstream has a builtin C.UTF-8 or because the distro provides a C.UTF-8 that can't be removed (depending on your vintage of glibc). Therefore the end-user experience should be the same, and the code is ready and correct.
Closing as CLOSED/UPSTREAM where we will track this.
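For application authors, the two-step approach described above could look something like the following. It is only a sketch: init_locale is a made-up name, and whether the second call succeeds depends on the glibc in use providing C.UTF-8 as discussed.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical application helper: select the locale from the
   environment and fall back to C.UTF-8 if that locale is unavailable.
   Returns the string from the setlocale call that succeeded, or NULL
   if both attempts failed.  */
static const char *
init_locale (void)
{
  /* First attempt: honour LC_ALL, LC_* and LANG from the environment.  */
  const char *loc = setlocale (LC_ALL, "");
  if (loc != NULL)
    return loc;

  /* The requested locale is not installed.  setlocale reported the
     failure, so the application chooses the fallback explicitly.  */
  return setlocale (LC_ALL, "C.UTF-8");
}
```

The caller should still check for NULL on systems that provide no C.UTF-8 at all; in that case the process simply stays in the C locale it started in.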