Bug 1707640 - Request for new sinitic languages
Summary: Request for new sinitic languages
Keywords:
Status: NEW
Alias: None
Product: Fedora Localization
Classification: Fedora
Component: l10n-requests
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: noriko
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-08 02:00 UTC by Wei-Lun Chao
Modified: 2019-08-13 07:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments
Proof of concept for translation in yue (247.80 KB, application/zip)
2019-05-20 03:34 UTC, Wei-Lun Chao

Description Wei-Lun Chao 2019-05-08 02:00:38 UTC
Please add following languages to zanata team:
nan, hak, yue, lzh, cmn

Thanks!

Comment 1 Zamir SUN 2019-05-08 02:32:16 UTC
Not all users are linguists.

I am strongly AGAINST the language code *cmn*.

We already have all the Simplified Chinese translations under the language zh_Hans_CN.
The 'cmn' entry shown in the Fedora 30 installation media has already caused confusion about why there is something called '官话' listed.

I don't quite agree with lzh either. It's no longer actively used in the modern world.

Comment 2 Alick Zhao 2019-05-08 02:35:19 UTC
What's the rationale for adding these languages? nan, hak, yue, and cmn are all spoken languages, not writing systems. (yue can be written, but it is not really standardized.) lzh is the classical language. Who will use it as their computer interface every day?

Besides, do we have communities to support each of them? It'll be far more challenging to provide up-to-date translations for Fedora, which has two releases per year, than for Wikipedia.

Comment 3 TianShixiong 2019-05-10 03:48:19 UTC
I don't think it's necessary.
1. Even major domestic applications and software in the Chinese market don't include these languages.
2. Currently, Simplified Chinese localisation already costs the contributors too much time and energy. I am quite sure the contributors wouldn't and couldn't put effort into these languages.
3. Under these circumstances, adding support for these languages would also waste the maintainers' time.
4. Furthermore, as far as I know, these languages appear in Wikipedia, but the Fedora Project is not like Wikipedia.
In my humble opinion, localisation is work for written languages. Today, the most widely used and accepted written language in mainland China is Simplified Chinese, which is the official language.

Comment 4 Mike FABIAN 2019-05-10 10:36:16 UTC
https://github.com/mike-fabian/langtable/releases/tag/0.0.43

In this release, I reduced the rank of cmn_TW.UTF-8 and zh_SG.UTF-8 to 0 for languageId="zh".

So Anaconda will not show these anymore in future.
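
A minimal sketch of the effect described above, assuming that an installer only lists locales whose rank is greater than 0. The table `ZH_LOCALE_RANKS`, the function `visible_locales`, and the non-zero rank values are hypothetical and are not taken from langtable's data files; only the two zero ranks reflect this comment.

```python
# Hypothetical rank table for languageId="zh". The non-zero numbers are
# invented for illustration; only the two zeros mirror the comment above.
ZH_LOCALE_RANKS = {
    "zh_CN.UTF-8": 100,
    "zh_TW.UTF-8": 100,
    "zh_HK.UTF-8": 100,
    "cmn_TW.UTF-8": 0,
    "zh_SG.UTF-8": 0,
}


def visible_locales(ranks):
    """Return locales with rank > 0, highest rank first, ties by name."""
    shown = [(locale, rank) for locale, rank in ranks.items() if rank > 0]
    shown.sort(key=lambda item: (-item[1], item[0]))
    return [locale for locale, _ in shown]


print(visible_locales(ZH_LOCALE_RANKS))
```

With these assumed values, cmn_TW.UTF-8 and zh_SG.UTF-8 drop out of the list while the other zh locales remain.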

Comment 5 Wei-Lun Chao 2019-05-20 03:34:21 UTC
Created attachment 1571068 [details]
Proof of concept for translation in yue

28 machine-translated, unreviewed and unusable po-files in yue language for projects in Zanata.

Comment 6 Zamir SUN 2019-05-21 05:11:35 UTC
(In reply to Wei-Lun Chao from comment #5)
> 28 machine-translated, unreviewed and unusable po-files in yue language for
> projects in Zanata.

IMO Fedora does not allow machine translation, so this won't help at all.

Comment 7 jibecfed 2019-05-28 21:18:36 UTC
from this list: nan, hak, yue, lzh, cmn

I could only find cmn in appstream translations[0]:

project                   type     url                                  name  summary  description  package stats
xed                       desktop  http://www.github.com/linuxmint/xed  yes   yes      0            1
mate-calc                 desktop  http://www.mate-desktop.org          yes   yes      0            0.98
xreader                   desktop  http://github.com/linuxmint/xreader  yes   yes      0            0.95
pluma                     desktop  http://www.mate-desktop.org          0     yes      0            0.95
mate-screenshot           desktop  http://www.mate-desktop.org          yes   yes      0            0.92
mate-terminal             desktop  http://www.mate-desktop.org          yes   yes      0            0.92
mate-dictionary           desktop  http://www.mate-desktop.org          0     yes      0            0.92
mate-disk-usage-analyzer  desktop  http://www.mate-desktop.org          0     yes      0            0.92
mate-search-tool          desktop  http://www.mate-desktop.org          0     yes      0            0.92
eom                       desktop  http://mate-desktop.org/             0     yes      0            0.9
caja                      desktop  http://www.mate-desktop.org          0     0        0            0.88
atril                     desktop  http://www.mate-desktop.org          yes   yes      0            0.83
engrampa                  desktop  http://www.mate-desktop.org          0     yes      0            0.75
mate-system-monitor       desktop  http://www.mate-desktop.org          0     yes      0            0.59
audacious                 desktop  https://audacious-media-player.org   0     0        0            0.56

Meaning, out of 1600 items in appstream:
* 6 desktop applications have a translated name in cmn
* 13 desktop applications have a translated summary in cmn
* none have a translated description in cmn
* 12 desktop applications have a translation progress > 80% (based on words, as far as I remember)
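
The counts in this list can be re-derived from the table rows. A minimal sketch, assuming the rows are transcribed by hand as (project, name, summary, description, stats) tuples; `ROWS` and `summarize` are names made up for this illustration and are not part of the statistics scripts.

```python
# Rows transcribed by hand from the table in this comment:
# (project, name, summary, description, stats)
ROWS = [
    ("xed", "yes", "yes", "0", 1.0),
    ("mate-calc", "yes", "yes", "0", 0.98),
    ("xreader", "yes", "yes", "0", 0.95),
    ("pluma", "0", "yes", "0", 0.95),
    ("mate-screenshot", "yes", "yes", "0", 0.92),
    ("mate-terminal", "yes", "yes", "0", 0.92),
    ("mate-dictionary", "0", "yes", "0", 0.92),
    ("mate-disk-usage-analyzer", "0", "yes", "0", 0.92),
    ("mate-search-tool", "0", "yes", "0", 0.92),
    ("eom", "0", "yes", "0", 0.9),
    ("caja", "0", "0", "0", 0.88),
    ("atril", "yes", "yes", "0", 0.83),
    ("engrampa", "0", "yes", "0", 0.75),
    ("mate-system-monitor", "0", "yes", "0", 0.59),
    ("audacious", "0", "0", "0", 0.56),
]


def summarize(rows, threshold=0.8):
    """Count translated fields and rows whose stats exceed the threshold."""
    return {
        "name": sum(1 for r in rows if r[1] == "yes"),
        "summary": sum(1 for r in rows if r[2] == "yes"),
        "description": sum(1 for r in rows if r[3] == "yes"),
        "above_threshold": sum(1 for r in rows if r[4] > threshold),
    }


print(summarize(ROWS))
# {'name': 6, 'summary': 13, 'description': 0, 'above_threshold': 12}
```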

[0] https://github.com/Jibec/fedora-translation-statistics/blob/master/history/appdata/f30_AppData_Detailed_cmn.csv



In the Workstation LiveCD packages, I could only find cmn in one place:
the RPM package is the only one containing a translation in cmn: https://pagure.io/fedora-localization-statistics/blob/master/f/results/f30/rpm-4.14.2.1-4.fc30.1.src.rpm.stats.csv#_26



On the Weblate translation platform, cmn is an alias for zh_Hans:
https://github.com/WeblateOrg/language-data/blob/master/aliases.csv#L65

It shows:
cmn;zh_Hans
zh_hans_cn;zh_Hans
zh_cmn_hans;zh_Hans

These aliases are generated from iso_639-2 and iso_639-3:
https://github.com/WeblateOrg/language-data/blob/master/gen-iso-aliases


Does it mean that for ISO, cmn = zh_Hans = zh_hans_cn = zh_cmn_hans?
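
To make the question concrete, here is a minimal sketch of how an alias table like the three lines quoted above collapses several codes into one. It only models those three lines. The helper names `load_aliases` and `resolve` are made up for this illustration, and the case-insensitive lookup is an assumption, not necessarily how Weblate does it. It shows what the alias file does, not what ISO itself defines.

```python
# The three alias lines quoted above, in "alias;target" form.
ALIASES_CSV = """\
cmn;zh_Hans
zh_hans_cn;zh_Hans
zh_cmn_hans;zh_Hans
"""


def load_aliases(text):
    """Parse 'alias;target' lines into a dict keyed by lower-cased alias."""
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        alias, target = line.split(";", 1)
        aliases[alias.lower()] = target
    return aliases


def resolve(code, aliases):
    """Return the alias target for a code, or the code itself if unknown."""
    # Assumption: lookups ignore case.
    return aliases.get(code.lower(), code)


aliases = load_aliases(ALIASES_CSV)
for code in ("cmn", "zh_Hans_CN", "zh_cmn_Hans", "yue"):
    print(code, "->", resolve(code, aliases))
```

In this sketch the first three codes all resolve to zh_Hans, while yue has no alias and is returned unchanged.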

Comment 8 Zamir SUN 2019-05-29 05:47:56 UTC
As a Mandarin Chinese speaker from mainland China: for us, cmn in ISO 639-3 is Mandarin Chinese, which is effectively the same thing as zh_Hans. We call it 'Mandarin' (普通话) only when we refer to the spoken language. When referring to the writing system, we say 简体中文, which is zh_Hans_CN. So from my point of view, cmn is zh_Hans. I can't speak for the people in China Taiwan, since they use a different writing system, which is probably called zh_Hant_TW. I don't know about Singapore.

Comment 9 Wei-Lun Chao 2019-08-13 07:01:18 UTC
(In reply to Mike FABIAN from comment #4)
> https://github.com/mike-fabian/langtable/releases/tag/0.0.43
> 
> In this release, I reduced the rank of cmn_TW.UTF-8 and zh_SG.UTF-8 to 0 for
> languageId="zh".
> 
> So Anaconda will not show these anymore in future.

cmn_TW uses Traditional Chinese, so "官话" should be "官話".

