Bug 624158

Summary: [patch] locale fallback for LANGUAGE
Product: [Fedora] Fedora
Reporter: ncfiedler
Component: systemd
Assignee: Zbigniew Jędrzejewski-Szmek <zbyszek>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: abelcheung, awilliam, bochecha, dwayne, harald, i18n-bugs, iarlyy, initscripts-maint-list, johannbg, jonathan, lnykryn, metherid, mfabian, msekleta, nkumar, piotrdrag, plautrba, pnemade, psatpute, rvokal, systemd-maint, vpavlin, zbyszek
Keywords: FutureFeature, i18n
Hardware: All
OS: Linux
Fixed In Version: systemd-219-3.fc22
Doc Type: Enhancement
Last Closed: 2015-02-18 21:42:10 UTC
Attachments:
  patch for initscripts language-fallback (flags: none)
  initscripts-9.21-3-LANGUAGE-fallback.patch (flags: none)

Description ncfiedler 2010-08-13 21:11:49 UTC
Description of problem:
system-config-language only sets one language and does not let the user choose an individual fallback language. A configurable fallback would be appreciated, so that the system does not fall back to English by default. In the case of Low German (nds), for example, it does not make any sense to fall back to English; the fallback should be German (de).
An example of this functionality is Ubuntu's language-selector for GNOME desktops. (https://launchpad.net/language-selector)

Expected results:
It would be nice if system-config-language were able to set individual fallback languages. (Compare Ubuntu's language-selector.)


Comment 1 Jens Petersen 2010-08-19 05:45:16 UTC
Some thoughts:

- anaconda/gdm/system-config-language could set fallback
for gettext apps by setting LANGUAGE
- or maybe glibc could be improved to handle this fallback.
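
A minimal sketch of how a gettext-style lookup treats LANGUAGE as a priority list. It uses Python's standard gettext module rather than the GNU gettext implementation in glibc, so it only illustrates the idea; the "demo" domain and the placeholder catalogs are hypothetical.

```python
import gettext
import os
import tempfile
from typing import List, Optional


def first_catalog_language(language_value: str, shipped: List[str]) -> Optional[str]:
    """Return the catalog directory chosen for a given LANGUAGE value.

    `shipped` lists the languages a hypothetical app has translations for.
    """
    with tempfile.TemporaryDirectory() as localedir:
        # Empty placeholder catalogs are enough: gettext.find() only
        # checks that the .mo file exists, it does not parse it.
        for lang in shipped:
            msgdir = os.path.join(localedir, lang, "LC_MESSAGES")
            os.makedirs(msgdir)
            open(os.path.join(msgdir, "demo.mo"), "wb").close()

        saved = os.environ.get("LANGUAGE")
        os.environ["LANGUAGE"] = language_value
        try:
            # With no explicit language list, gettext.find() consults
            # LANGUAGE first and splits it on ':' into a priority list.
            path = gettext.find("demo", localedir)
        finally:
            # Restore the caller's environment.
            if saved is None:
                del os.environ["LANGUAGE"]
            else:
                os.environ["LANGUAGE"] = saved

        if path is None:
            return None  # nothing matched: messages stay untranslated
        return os.path.relpath(path, localedir).split(os.sep)[0]
```

With only a German catalog shipped, first_catalog_language("nds_DE:de_DE", ["de"]) picks "de", while first_catalog_language("nds_DE", ["de"]) finds nothing and the app would show its untranslated (usually English) strings.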

Comment 2 Naveen Kumar 2010-10-05 10:35:15 UTC
After a lot of discussion within i18n, we came to the conclusion that this could be handled better by initscripts and more specifically by lang.sh. So reassigning the bug to initscripts. We also came up with a mechanism to do that using lang.sh. I will submit that as a patch after this comment. I hope it gets accepted.

Comment 3 Naveen Kumar 2010-10-05 10:37:23 UTC
Created attachment 451633 [details]
patch for initscripts language-fallback

the patch should apply to a pkg git checkout

Comment 4 ncfiedler 2010-10-05 12:00:12 UTC
Could the patch itself already include the Low German fallback to German (nds to de), e.g. nds_DE:de_DE?
At the moment the patch does not mention Low German (nds_DE) at all.

Comment 5 Bill Nottingham 2010-10-05 15:51:59 UTC
Some comments...

1) Having fallbacks listed that are just an identity transformation is wasteful; eliding those is best
2) Fallbacks like:

ar_AE ar_AE:ar_SA
ar_BH ar_BH:ar_SA
ar_DZ ar_DZ:ar_SA
ar_EG ar_EG:ar_SA
ar_IN ar_IN:ar_SA:en_IN
ar_IQ ar_IQ:ar_SA
ar_JO ar_JO:ar_SA
ar_KW ar_KW:ar_SA
ar_LB ar_LB:ar_SA
ar_LY ar_LY:ar_SA
ar_MA ar_MA:ar_SA
ar_OM ar_OM:ar_SA
ar_QA ar_QA:ar_SA
ar_SA ar_SA:ar_SA
ar_SD ar_SD:ar_SA
ar_SY ar_SY:ar_SA
ar_TN ar_TN:ar_SA
ar_YE ar_YE:ar_SA

seem pointless, given that *nothing* ships country-specific Arabic translations.

The approach could have some use, but I'm very concerned about the data population for it, and how it would stay maintained.
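
A rough sketch of point 1, assuming the table uses the two-column layout quoted above (locale, whitespace, colon-separated fallback chain). The function names are hypothetical, and skipping blank lines and '#' comments is an assumption rather than something the patch is known to do.

```python
from typing import Dict, List

SAMPLE = """
ar_AE ar_AE:ar_SA
ar_IN ar_IN:ar_SA:en_IN
ar_SA ar_SA:ar_SA
"""


def parse_fallback_table(text: str) -> Dict[str, List[str]]:
    """Parse 'locale chain' lines into {locale: [chain entries]}."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumed convention: ignore blanks and comments
        locale_name, chain = line.split(None, 1)
        table[locale_name] = chain.strip().split(":")
    return table


def elide_identity(table: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Drop rows whose chain names nothing except the locale itself."""
    return {
        locale_name: chain
        for locale_name, chain in table.items()
        if any(lang != locale_name for lang in chain)
    }
```

Applied to SAMPLE, elide_identity(parse_fallback_table(SAMPLE)) keeps the ar_AE and ar_IN rows and drops ar_SA, whose chain is only itself. Whether the remaining ar_* rows are worth shipping is the separate question raised in point 2.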

Comment 6 ncfiedler 2010-10-05 16:06:38 UTC
OK, so what about implementing only the really necessary/useful fallbacks by default? For Low German (nds) in particular, it never makes sense to fall back to English rather than (standard) German (de).
Every Low German user currently has to set this fallback manually, because every Low German speaker naturally speaks German, but probably not English.
The only exception is people who speak a dialect of Low German called Plautdietsch. But that is not the same as the translations I do for Low German, which are called Plattdeutsch, Plattdüütsch or Niederdeutsch.
So you see, an automated fallback would be appreciated, as long as a single fixed fallback does not cause a data-size explosion.

Comment 7 ncfiedler 2010-10-05 16:09:27 UTC
Either way, I would like to see an entry for Low German (nds_DE) in that list, whether or not it comes with an automated fallback.

Comment 8 Naveen Kumar 2010-10-06 05:49:45 UTC
(In reply to comment #5)
> Some comments...
> 
> 1) Having fallbacks listed that are just an identity transformation is
> wasteful; eliding those is best

I agree to that. These can be removed...

> 2) Fallbacks like:
> 
> ar_AE ar_AE:ar_SA
> ar_BH ar_BH:ar_SA
> .................
> .................
> ar_SY ar_SY:ar_SA
> ar_TN ar_TN:ar_SA
> ar_YE ar_YE:ar_SA
> 
> seem pointless, given that *nothing* ships country-specific Arabic
> translations.

I agree, and these can be removed. Only "ar" or "ar_SA" could exist.

> 
> The approach could have some use, but I'm very concerned about the data
> population for it, and how it would stay maintained.

The language fallback order can be requested by individual language teams, so an entry could be introduced through a bug/feature request. Alternatively, we could start with one bulk entry for the fallback orders that are obvious and refine it later on request.

Comment 9 Naveen Kumar 2010-10-06 08:50:00 UTC
Created attachment 451834 [details]
initscripts-9.21-3-LANGUAGE-fallback.patch

the patch should apply to a pkg git checkout

Changelog
- modified language-fallback
- add nds_DE

Comment 11 Jens Petersen 2010-12-09 05:19:58 UTC
Any chance of including this in F15? :)

Comment 12 Bill Nottingham 2010-12-09 18:13:39 UTC
My concern is I'd really prefer this be some sort of standard upstream data source (like iso-codes).

Comment 13 Bill Nottingham 2010-12-09 18:43:39 UTC
See https://bugzilla.redhat.com/show_bug.cgi?id=636290 for a similar usage case.

Comment 14 Jens Petersen 2010-12-21 05:58:47 UTC
*** Bug 246325 has been marked as a duplicate of this bug. ***

Comment 15 Abel Cheung 2012-11-06 16:50:53 UTC
(In reply to comment #12)
> My concern is I'd really prefer this be some sort of standard upstream data
> source (like iso-codes).

I beg to differ partially. Personally I'd also like this fallback list to be maintained by a single entity and made usable across all distros, if somebody pushes it hard enough; but the nature of such a language fallback is to show how much i18n love the distro has for local users of various languages, rather than to be some international standardization effort.

Comment 16 Adam Williamson 2012-11-10 01:27:30 UTC
Bill - the revival of this bug is due to https://bugzilla.redhat.com/show_bug.cgi?id=872282 , where we've found another obvious case, Chinese-speaking locales.

There are details in that bug, but to give a summary, there are four official Chinese locales: zh_CN (China itself, the People's Republic), zh_SG (Singapore), zh_TW (Taiwan/ROC), zh_HK (Hong Kong SAR).

We also noted that the case really exists in English too, except we don't notice it, because the _default_ fallback is to English (well, C, which in practice is English). So if you set en_GB and whatever you're reading doesn't have a 'UK English' translation, you get English anyway.

Anyhow, the Chinese case is much like the Arabic or Low German cases, or UK English vs. U.S. English. There are really only two major variants of written Chinese, traditional and simplified. China and Chinese-speaking Singaporeans usually use simplified, and Hong Kong and Taiwan usually use traditional. zh_CN is used as the universal 'simplified Chinese' locale - so Singaporean users usually use the zh_CN locale - and zh_TW is used as the universal 'traditional Chinese' locale - so HK users usually use the zh_TW locale. Translations usually don't exist for the zh_HK and zh_SG locales. In the past we actually labelled zh_CN as 'Chinese (Simplified)' and zh_TW as 'Chinese (Traditional)' in the installer, with no reference to the actual place, and only offered those choices.

Still, there may be other locale stuff that differs between Hong Kong and Taiwan and between Singapore and China, and there _are_ a few cases where written usage differs between the locales, so it would be good if we could allow users to pick the actual place they live, and if they pick Hong Kong or Singapore, use the locale info for those places and translations if actually available, but fall back on zh_TW or zh_CN (respectively) translations if 'native' translations aren't available. Setting the LANGUAGE variable would allow this.

It almost goes without saying, of course, that Chinese-speaking people are at least _potentially_ a gigantic user base, so covering their use cases well would be a good move.

I don't know if anything's changed in terms of there being a central canonical upstream reference for providing the necessary relationship data in the last couple of years, but it may be worth checking, and if there isn't, even trying to kickstart one. Ubuntu is still using the LANGUAGE var, apparently, so they're presumably maintaining the data somewhere.

Comment 17 Mathieu Bridon 2013-01-08 12:18:16 UTC
(In reply to comment #16)
> I don't know if anything's changed in terms of there being a central
> canonical upstream reference for providing the necessary relationship data
> in the last couple of years, but it may be worth checking, and if there
> isn't, even trying to kickstart one. Ubuntu is still using the LANGUAGE var,
> apparently, so they're presumably maintaining the data somewhere.

Didn't systemd take over this with localed?

This file gets set by localectl:
 # cat /etc/locale.conf 
 LANG=en_HK.utf8

It even allows specifying the values following the fallback syntax from above:
 # localectl set-locale "LANG=en_HK.utf8:en_GB.utf8"
 # cat /etc/locale.conf 
 LANG=en_HK.utf8:en_GB.utf8

I'm not sure whether or not this works though...

Comment 18 Harald Hoyer 2013-03-18 10:29:03 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > I don't know if anything's changed in terms of there being a central
> > canonical upstream reference for providing the necessary relationship data
> > in the last couple of years, but it may be worth checking, and if there
> > isn't, even trying to kickstart one. Ubuntu is still using the LANGUAGE var,
> > apparently, so they're presumably maintaining the data somewhere.
> 
> Didn't systemd take over this with localed?
> 
> This file gets set by localectl:
>  # cat /etc/locale.conf 
>  LANG=en_HK.utf8
> 
> It even allows specifying the values following the fallback syntax from
> above:
>  # localectl set-locale "LANG=en_HK.utf8:en_GB.utf8"
>  # cat /etc/locale.conf 
>  LANG=en_HK.utf8:en_GB.utf8
> 
> I'm not sure whether or not this works though...

Yes, initscripts is not involved in this anymore.

Comment 19 Lukáš Nykrýn 2013-03-26 13:53:16 UTC
This has nothing to do with initscripts anymore. Maybe we can do something about it in systemd.

Comment 20 Mike FABIAN 2013-06-18 07:34:03 UTC
(In reply to Mathieu Bridon from comment #17)
> (In reply to comment #16)
> > I don't know if anything's changed in terms of there being a central
> > canonical upstream reference for providing the necessary relationship data
> > in the last couple of years, but it may be worth checking, and if there
> > isn't, even trying to kickstart one. Ubuntu is still using the LANGUAGE var,
> > apparently, so they're presumably maintaining the data somewhere.
> 
> Didn't systemd take over this with localed?
> 
> This file gets set by localectl:
>  # cat /etc/locale.conf 
>  LANG=en_HK.utf8
> 
> It even allows specifying the values following the fallback syntax from
> above:
>  # localectl set-locale "LANG=en_HK.utf8:en_GB.utf8"
>  # cat /etc/locale.conf 
>  LANG=en_HK.utf8:en_GB.utf8
> 
> I'm not sure whether or not this works though...

No, that does *not* work!:

mfabian@ari:~
$ LANG=en_HK.utf8:en_GB.utf8 locale charmap
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
ANSI_X3.4-1968
mfabian@ari:~
$ 

The correct way to specify a translation fallback from en_HK to en_GB is:

$ LANGUAGE=en_HK:en_GB LANG=en_HK.UTF-8 locale charmap  
UTF-8
mfabian@ari:~
$ 

(assuming all LC_* variables are unset here)
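
To illustrate the difference: LANG names exactly one locale, while the colon-separated list belongs in LANGUAGE. A small sketch that splits a mistaken colon-separated LANG value into the two variables; the helper names are hypothetical, and dropping the codeset from LANGUAGE entries simply follows the example above.

```python
from typing import Dict


def strip_codeset(name: str) -> str:
    """'en_HK.utf8' -> 'en_HK'; an @modifier, if present, is kept."""
    base, _, modifier = name.partition("@")
    base = base.split(".", 1)[0]
    return base + ("@" + modifier if modifier else "")


def split_lang_list(value: str) -> Dict[str, str]:
    """Split 'a.utf8:b.utf8' into a single LANG plus a LANGUAGE list."""
    entries = [entry for entry in value.split(":") if entry]
    if not entries:
        raise ValueError("empty locale list")
    env = {"LANG": entries[0]}  # LANG holds a single locale name
    if len(entries) > 1:
        # The priority list for translations goes into LANGUAGE.
        env["LANGUAGE"] = ":".join(strip_codeset(e) for e in entries)
    return env
```

For the value from comment 17, split_lang_list("en_HK.utf8:en_GB.utf8") gives LANG=en_HK.utf8 and LANGUAGE=en_HK:en_GB.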

Comment 21 Zbigniew Jędrzejewski-Szmek 2014-10-12 22:19:06 UTC
Let's revive this thread...

tl;dr: I propose to add a new D-Bus method. Would this work for gnome/other desktops and other interested parties?

localectl and localed allow setting multiple values for LANGUAGE. We could do that, but I don't think that the current interface should be modified to automatically expand language tags. After all, if someone tells localectl to set LANGUAGE=x, localed should obey. Instead, we could move the logic to all consumers of the API, e.g. gnome-control-center, and have those consumers append the fallbacks themselves. I don't like this because it would require a significant extension in every consumer. So in the end I think this should be implemented in localed, but as a new D-Bus method:

org.freedesktop.locale1.SetLocaleWithFallback(
                in  as locale,
                in  b add_fallback,
                in  b user_interaction);

This would be like SetLocale, but if LANGUAGE is set, it would be extended with fallback from a table carried by systemd-localed. If LANG was set, but not LANGUAGE, LANGUAGE including fallbacks would be added.
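
The intended semantics could be sketched roughly as below. This is not localed code: the function, the in-memory table and the decision to append fallbacks after the caller's own entries are all assumptions made for illustration, and the user_interaction argument is left out.

```python
from typing import Dict, List


def set_locale_with_fallback(
    assignments: List[str],
    table: Dict[str, List[str]],
    add_fallback: bool = True,
) -> Dict[str, str]:
    """Model of the proposal: take 'KEY=value' strings, return settings."""
    settings = dict(item.split("=", 1) for item in assignments)
    if not add_fallback:
        return settings  # behaves like plain SetLocale

    if settings.get("LANGUAGE"):
        chain = settings["LANGUAGE"].split(":")
    elif settings.get("LANG"):
        # Derive the language from LANG, e.g. 'zh_HK.UTF-8' -> 'zh_HK'.
        chain = [settings["LANG"].split(".", 1)[0]]
    else:
        return settings

    # Assumed ordering: explicit entries first, then unseen fallbacks.
    expanded = list(chain)
    for lang in chain:
        for candidate in table.get(lang, []):
            if candidate not in expanded:
                expanded.append(candidate)

    # Only write LANGUAGE if it was given or a fallback was found.
    if "LANGUAGE" in settings or len(expanded) > len(chain):
        settings["LANGUAGE"] = ":".join(expanded)
    return settings
```

With a table of {"zh_HK": ["zh_TW"], "zh_SG": ["zh_CN"]}, passing ["LANG=zh_HK.UTF-8"] yields LANGUAGE=zh_HK:zh_TW, and the same call with add_fallback=False leaves LANGUAGE unset.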

All the clients would have to be modified. In the case of localectl this would be a relatively simple patch to add --no-fallback and to flip set-locale to use SetLocaleWithFallback by default, and SetLocale with the option.

Higher-level clients such as desktop environments would simply switch over to SetLocaleWithFallback.

I'd be happy to implement this scheme, but I'd like to have some confirmation that this would be useful from interested/knowledgeable people.

Comment 22 Zbigniew Jędrzejewski-Szmek 2015-01-31 17:23:21 UTC
Looking at the overwhelming response, I think that people don't actually care that much and I was overcomplicating things. So I'll modify org.freedesktop.locale1.SetLocale() to provide LANGUAGE including fallbacks if it was not provided in the args specified by the caller.

Comment 23 Zbigniew Jędrzejewski-Szmek 2015-02-06 14:20:07 UTC
Applied upstream, will be in systemd 219.

Comment 24 Zbigniew Jędrzejewski-Szmek 2015-02-18 21:42:10 UTC
The table of fallbacks is maintained upstream at http://cgit.freedesktop.org/systemd/systemd/tree/src/locale/language-fallback-map?id=HEAD. We intend to import the table from Ubuntu, but this hasn't happened yet. Bugs can be filed here or at freedesktop.org to add new languages or change the table.