1313818 – Have setlocale fallback to C.UTF-8 before C/POSIX.

Keywords:
Status:	CLOSED UPSTREAM
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	glibc
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Assignee:	Carlos O'Donell
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-03-02 12:07 UTC by Carlos O'Donell
Modified:	2020-02-13 15:49 UTC
CC List:	12 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2019-11-12 14:20:56 UTC
Type:	Bug
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1312690	unspecified	CLOSED	gnome-terminal crashes on launch if only C locale is available	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1565718	unspecified	CLOSED	After upgrading to F28, my Czech locale is gone	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1574222	unspecified	CLOSED	dnf system upgrade forgot langpacks-de	2021-02-22 00:41:40 UTC
Sourceware	17318	P2	NEW	[RFE] Provide a C.UTF-8 locale by default	2020-02-13 15:44:56 UTC

Internal Links: 1312690 1565718 1574222

Description Carlos O'Donell 2016-03-02 12:07:15 UTC

In Fedora we guarantee a C.UTF-8 locale.

To support more applications that require C.UTF-8 and may be installed on systems with few or no locales, we must handle the case where an application calls setlocale (LC_ALL, "") and does not check the return value to confirm that it actually got a UTF-8-capable locale.

Steps required:
* Review the differences between C.UTF-8 and C/POSIX
* Change the setlocale code to try loading C.UTF-8 before falling back to the builtin C/POSIX locale.

This would fix several of the failure modes we saw during the glibc language pack split up.
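
For illustration, here is a minimal sketch of the check that such applications skip. The helper name check_locale_codeset is hypothetical (it is not part of glibc); it only uses the standard setlocale and nl_langinfo calls. Passing "" selects the locale from the environment, as in the call above.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc.
   Tries to activate the locale NAME ("" means "take it from the
   environment") and reports what the process ended up with:
     -1  setlocale failed; the previous locale is left unchanged
      0  the locale was loaded, but its codeset is not UTF-8
      1  the locale was loaded and its codeset is UTF-8  */
int
check_locale_codeset (const char *name)
{
  if (setlocale (LC_ALL, name) == NULL)
    return -1;

  /* nl_langinfo (CODESET) names the character encoding of the
     current LC_CTYPE locale.  */
  return strcmp (nl_langinfo (CODESET), "UTF-8") == 0 ? 1 : 0;
}
```

An application that needs UTF-8 would call check_locale_codeset ("") at startup and treat any result other than 1 as a condition to handle, rather than continuing silently in the C/POSIX locale.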

Comment 1 Jakub Jelinek 2016-03-02 12:13:54 UTC

But do you really want to do that unconditionally?  What if the LC_ALL env var (or others) requests some 8-bit or other non-UTF-8 locale instead?  Getting C.UTF-8 instead of C would certainly be surprising and undesirable.

Comment 2 Florian Weimer 2016-03-02 14:52:29 UTC

I think we should have a small table of well-known UTF-8 locale names, and use C.UTF-8 only if the locale is known as a UTF-8 locale.  This would help secondary locale implementations which implement their own charset conversion, but rely on nl_langinfo (CODESET) to get the current charset, too.

Comment 3 Jakub Jelinek 2016-03-02 14:56:14 UTC

We already parse the names of the locales, canonicalizing UTF-8 vs. utf-8 etc., don't we?  Thus perhaps we could just recognize that and handle the *.UTF-8/*.utf-8 (perhaps with suffixes) locales that way; not sure if we have any UTF-8 locales without that suffix, those would need to be special cased.
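
A rough sketch of the name-based check discussed in this comment. This is not glibc's locale-name parser: the function name locale_name_requests_utf8 is hypothetical, and accepting the spelling "utf8" alongside "UTF-8"/"utf-8" is an assumption made for the example. It looks at the codeset part after the '.' and ignores a trailing @modifier.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not glibc code.
   Returns 1 if the locale name NAME explicitly asks for a UTF-8
   codeset (for example "en_US.UTF-8" or "de_DE.utf-8@euro"),
   and 0 otherwise.  */
int
locale_name_requests_utf8 (const char *name)
{
  if (name == NULL)
    return 0;

  /* The codeset, if present, follows the first '.'.  */
  const char *dot = strchr (name, '.');
  if (dot == NULL)
    return 0;

  const char *codeset = dot + 1;

  /* The codeset ends at an optional "@modifier" suffix.  */
  size_t len = strcspn (codeset, "@");

  if (len == 5 && strncasecmp (codeset, "UTF-8", 5) == 0)
    return 1;
  /* Assumption: also accept the spelling without the hyphen.  */
  if (len == 4 && strncasecmp (codeset, "UTF8", 4) == 0)
    return 1;
  return 0;
}
```

A name without an explicit codeset, such as "en_US" or "C", is reported as not requesting UTF-8, which is where the special-casing mentioned above would come in.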

Comment 4 Carlos O'Donell 2016-03-02 16:22:12 UTC

(In reply to Jakub Jelinek from comment #3)
> We already parse the names of the locales, canonicalizing UTF-8 vs. utf-8
> etc., don't we?  Thus perhaps we could just recognize that and handle the
> *.UTF-8/*.utf-8 (perhaps with suffixes) locales that way; not sure if we
> have any UTF-8 locales without that suffix, those would need to be special
> cased.

This is a great idea. My initial idea was simply a strawman proposal to start the discussion of what we should and should not do.

We could and should certainly start with something like this since the motivating use-case is likely *_*.UTF-8 locales that don't exist and then cause gnome-terminal to fail to start (gnome-terminal won't start without a UTF-8 locale, see bug 1312960, and this is intended behaviour).

Comment 5 Jan Kurik 2016-07-26 04:09:36 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle.
Changing version to '25'.

Comment 6 Fedora End Of Life 2017-02-28 09:55:26 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 7 Fedora End Of Life 2018-05-03 08:49:04 UTC

This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 8 Florian Weimer 2018-05-03 09:11:28 UTC

The change needs to happen upstream first, and it will need a built-in C.UTF-8 locale.

Comment 9 Carlos O'Donell 2018-05-10 15:14:55 UTC

(In reply to Carlos O'Donell from comment #4)
...
> UTF-8 locale, see bug 1312960, and this is intended behaviour).

Should be bug 1312690 (noted in see also).

Comment 10 Jan Kurik 2018-08-14 10:58:03 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 29 development cycle.
Changing version to '29'.

Comment 11 Carlos O'Donell 2019-10-30 13:31:49 UTC

We want to get C.UTF-8 upstream, so this depends on the upstream Sourceware bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=17318

Comment 12 Carlos O'Donell 2019-11-12 14:20:56 UTC

We are going to close this bug and keep tracking the upstream bug here:
https://sourceware.org/bugzilla/show_bug.cgi?id=17318

Florian and I have discussed this internally, and we don't want setlocale to ever hide a failure. The most likely scenario is that applications will need to add code to handle an initial setlocale failure and then attempt a second setlocale with C.UTF-8. That second setlocale should always succeed, either because upstream has a builtin C.UTF-8 or because the distro provides a C.UTF-8 that can't be removed (depending on your vintage of glibc). The end-user experience should therefore be the same, and the code is ready and correct.
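
The application-side pattern described above could look roughly like the following sketch. The function name set_locale_or_fallback and its two-parameter shape are hypothetical and chosen only for this example; the only library call used is the standard setlocale.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper for application code, not part of glibc.
   Tries REQUESTED first ("" means "take the locale from the
   environment").  If that fails, makes a second, explicit attempt
   with FALLBACK.  Returns the string from the successful setlocale
   call, or NULL if both attempts failed.  */
const char *
set_locale_or_fallback (const char *requested, const char *fallback)
{
  const char *result = setlocale (LC_ALL, requested);
  if (result != NULL)
    return result;

  /* The first attempt failed and left the locale unchanged.
     The failure is not hidden: the caller can still detect it by
     comparing the returned name with what it asked for.  */
  return setlocale (LC_ALL, fallback);
}
```

An application would call set_locale_or_fallback ("", "C.UTF-8") at startup. On a system that guarantees C.UTF-8, the second attempt is expected to succeed, so a NULL result would indicate a broken installation rather than a missing language pack.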

Closing as CLOSED/UPSTREAM where we will track this.