Bug 882233

Summary: Chinese (Taiwan) shows Simplified Chinese instead of Traditional Chinese at left (native language view) side
Product: [Fedora] Fedora
Reporter: pizza306
Component: anaconda
Assignee: Anaconda Maintenance Team <anaconda-maint-list>
Status: CLOSED CANTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 18
CC: anaconda-maint-list, awilliam, dshea, g.kaviyarasu, jonathan, pswo10680, sbueno, stephent98, vanmeeuwen+fedora
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-13 20:46:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 752665    
Attachments (description, flags):
anaconda screenshot (flags: none)
screenshot showing Chinese language menu items for zh_Hans_CN and zh_Hant_TW (flags: none)
test patch to add zh_Hans_CN and zh_Hant_TW to mangleMap in localization.py (flags: none)

Description pizza306 2012-11-30 13:08:19 UTC
Created attachment 655011 [details]
anaconda screenshot

Description of problem:


Version-Release number of selected component (if applicable):
The anaconda installer in Fedora 18 Beta.
Tried on the x86 Live CD and the x86_64 netinst image.

How reproducible:
No steps needed; just scroll down the language list and check whether Chinese (Taiwan) is shown in Simplified Chinese.
  
Actual results:
It shows 中文(台湾) instead of 中文(台灣).
>> 湾 (Simplified) and 灣 (Traditional) are different characters, and Taiwan uses Traditional Chinese.

Expected results:
It shows 中文(台灣).

Additional info:
none

Comment 1 Steve Tyler 2012-11-30 17:52:45 UTC
Thanks for pointing that out.[1] Those strings come from Babel[2], and the Chinese name for Taiwan appears to be inconsistent:[3]

$ python
...
>>> import babel
>>> print babel.Locale.parse('zh_TW').display_name
中文 (台湾)
>>> print babel.Locale.parse('zh_Hant_TW').display_name
中文 (繁體中文, 臺灣)
>>> print babel.Locale.parse('zh_TW').english_name
Chinese (Taiwan)
>>> print babel.Locale.parse('zh_Hant_TW').english_name
Chinese (Traditional Han, Taiwan)

[1] See also some of the comments here:
Bug 872282 - Add Chinese (Hong Kong) - zh_HK - locale to anaconda 

[2] The strings from Babel come from the CLDR. These pages have more about the CLDR "process for data collection, resolution, public feedback and release":

CLDR Process
http://cldr.unicode.org/index/process

CLDR Change Requests
http://cldr.unicode.org/index/bug-reports

[3] $ rpm -q python-babel
python-babel-0.9.6-3.fc17.noarch

Comment 2 Chris Lumens 2012-11-30 21:15:05 UTC
So, this is something anaconda needs to deal with, or what?

Comment 3 Steve Tyler 2012-11-30 21:25:36 UTC
Basically, it is an upstream bug (at http://cldr.unicode.org), but it could be reassigned to python-babel, so they can close it for the same reason. :-)

The best resolution would probably be for one of the Chinese translators to file a second bug against CLDR similar to this one:

繁體 is now officially called as 正體 in Taiwan
http://unicode.org/cldr/trac/ticket/5433

Comment 4 Steve Tyler 2012-11-30 22:30:20 UTC
(In reply to comment #3)
> Basically, it is an upstream bug (at http://cldr.unicode.org), but it could
> be reassigned to python-babel, so they can close it for the same reason. :-)
> 
> The best resolution would probably be for one of the Chinese translators to
> file a second bug against CLDR similar to this one:
> 
> 繁體 is now officially called as 正體 in Taiwan
> http://unicode.org/cldr/trac/ticket/5433

There is a Traditional Chinese component:

Product: 	Fedora Localization
Component: 	Chinese Traditional [zh_TW]

https://bugzilla.redhat.com/buglist.cgi?component=Chinese Traditional [zh_TW]&product=Fedora Localization

Comment 5 Cheng-Chia Tseng 2012-12-01 01:47:43 UTC
(In reply to comment #3)
> Basically, it is an upstream bug (at http://cldr.unicode.org), but it could
> be reassigned to python-babel, so they can close it for the same reason. :-)
> 
> The best resolution would probably be for one of the Chinese translators to
> file a second bug against CLDR similar to this one:
> 
> 繁體 is now officially called as 正體 in Taiwan
> http://unicode.org/cldr/trac/ticket/5433

Actually, I don't think Unicode will do anything about these bugs, because some of the people there tend to regard changes that make translations better fit the situation in Taiwan as "political" issues. You can see comment 1 under the bug I opened in CLDR.

Well, the mainland China government always has some influence on things related to Taiwan because it wants the whole world to believe that Taiwan is part of China. For example, the ISO standard uses "Taiwan, Province of China" instead of simply "Taiwan". Maybe that's the reason why "中文 (台湾)" is in Simplified Chinese instead of Traditional Chinese.

So I think they will just leave the bugs pending and never change them. We have no power over CLDR to make them change.

The best way to avoid this kind of political issue is to let Anaconda handle the translated names of each language and script style itself, the way we did before.

Comment 6 Cheng-Chia Tseng 2012-12-01 01:58:35 UTC
The second way is to use

>>> print babel.Locale.parse('zh_Hant_TW').display_name
中文 (繁體中文, 臺灣)

instead of 

>>> print babel.Locale.parse('zh_TW').display_name
中文 (台湾)

Comment 7 Steve Tyler 2012-12-04 03:31:44 UTC
Created attachment 657206 [details]
screenshot showing Chinese language menu items for zh_Hans_CN and zh_Hant_TW

(In reply to comment #6)
> The second way is to use
> 
> >>> print babel.Locale.parse('zh_Hant_TW').display_name
> 中文 (繁體中文, 臺灣)
> 
> instead of 
> 
> >>> print babel.Locale.parse('zh_TW').display_name
> 中文 (台湾)

Thanks for your comments and suggestion.

The mangleMap in localization.py already uses "sr_Latn_RS", so adding "zh_Hant_TW" would be a continuation of what already works. I patched anaconda-18.34-1 by adding these entries to mangleMap:
+                 "zh_CN":  "zh_Hans_CN",
+                 "zh_TW":  "zh_Hant_TW"

The attached screenshot shows the resulting language menu.

A test install succeeded. Both /etc/locale.conf and /etc/sysconfig/i18n had the expected contents:
LANG="zh_TW.UTF-8"

Of course the developers will need to be persuaded that this bug is easy to fix ... :-)
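
For readers unfamiliar with the mechanism, the effect of the two added entries can be sketched roughly as follows. This is an illustrative stand-in, not anaconda's actual code: the dict mirrors the idea of mangleMap in localization.py, but the names mangle_map, display_locale() and lang_setting() are hypothetical, and only the two entries from the patch are included. The assumption that the mangled name affects only the menu label is based on the test install above, which still produced LANG="zh_TW.UTF-8".

```python
# Illustrative sketch only; this is not the real pyanaconda/localization.py.
# The two entries are copied from the test patch described in this comment.
mangle_map = {
    "zh_CN": "zh_Hans_CN",
    "zh_TW": "zh_Hant_TW",
}


def display_locale(locale_code):
    """Return the locale name used to look up the menu label (hypothetical helper).

    Locales without an entry in the map are passed through unchanged.
    """
    return mangle_map.get(locale_code, locale_code)


def lang_setting(locale_code):
    """Return the LANG line for the installed system (hypothetical helper).

    Assumption: the mangled name is used for display only, so the original
    locale code is what ends up in the configuration file.
    """
    return 'LANG="%s.UTF-8"' % locale_code


if __name__ == "__main__":
    for code in ("zh_CN", "zh_TW", "fr_FR"):
        print(code, "->", display_locale(code), "|", lang_setting(code))
```

The point of keeping the two lookups separate is that a script-qualified name such as zh_Hant_TW is only needed to pick the right display string; the installed system keeps the plain zh_TW locale.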

Comment 8 Steve Tyler 2012-12-04 04:25:10 UTC
Created attachment 657211 [details]
test patch to add zh_Hans_CN and zh_Hant_TW to mangleMap in localization.py

This patch is against anaconda-18.34-1.

Tested with:
Patched anaconda-18.34-1
$ qemu-kvm -m 2048 -hda f18-test-2.img -cdrom ~/xfr/fedora/F18/F18-Beta/Final/Fedora-18-Beta-x86_64-Live-Desktop.iso -usb -vga qxl -boot menu=on -usbdevice mouse

Comment 9 Cheng-Chia Tseng 2012-12-04 14:51:14 UTC
Thank you Steve! That will be good.

By the way, the status is "closed as can't fix" now, should we change that or not?

Comment 10 Steve Tyler 2012-12-04 17:12:14 UTC
If the developers say it can't be fixed, they must know something I don't. I'm sorry, but I have done all that I can.

Comment 11 Cheng-Chia Tseng 2012-12-05 01:45:22 UTC
Well, the bug was closed before your patch. :S

(In reply to comment #2)
> So, this is something anaconda needs to deal with, or what?
Chris Lumens, could you consider Steve's patch? That result is better than doing nothing for this issue.

Comment 12 Cheng-Chia Tseng 2012-12-06 15:54:37 UTC
So what can I do for this bug now?

Steve, since I don't have the power to change the bug status, should I file another bug so we can submit your patch again there?

Comment 13 Steve Tyler 2012-12-06 18:12:15 UTC
pizza306: Could you reopen this bug? We have new information that can fix it.

BTW, a similar patch was previously proposed in Bug 872282, Comment 18:
Bug 872282 - Add Chinese (Hong Kong) - zh_HK - locale to anaconda 

This commit added sr_Latn_RS to mangleMap:

Use sr_Latn_RS instead of sr_RS@latin in mangleMap (#872786)
http://git.fedorahosted.org/cgit/anaconda.git/commit/pyanaconda/localization.py?id=ffdb22846c87c8e09b0f3c5718f7b337771ca95c

Cheng-Chia: You can test the patch yourself by booting the F18-Beta Live CD and patching localization.py there:

# yum update anaconda
# yum install patch
# patch -b /usr/lib64/python2.7/site-packages/pyanaconda/localization.py < localization-zh-Hant-TW-1.patch

Comment 14 pizza306 2012-12-07 05:00:04 UTC
Steve Tyler: status changed to assigned.

---
So if we have a patch, how fast can it be applied to the new version of anaconda?
It seems there are still more test versions coming on the FTP (-smoke, I think) before the final release.

Comment 15 Steve Tyler 2012-12-07 17:43:11 UTC
Thanks for reopening.

The F18 release schedule is here:
https://fedoraproject.org/wiki/Releases/18/Schedule
And there is a lot to be done:
http://qa.fedoraproject.org/blockerbugs/milestone/18/final/buglist

Comment 16 Cheng-Chia Tseng 2012-12-08 14:37:57 UTC
(In reply to comment #13)
> pizza306: Could you reopen this bug? We have new information that can fix it.
> 
> BTW, a similar patch was previously proposed in Bug 872282, Comment 18:
> Bug 872282 - Add Chinese (Hong Kong) - zh_HK - locale to anaconda 
> 
> This commit added sr_Latn_RS to mangleMap:
> 
> Use sr_Latn_RS instead of sr_RS@latin in mangleMap (#872786)
> http://git.fedorahosted.org/cgit/anaconda.git/commit/pyanaconda/localization.py?id=ffdb22846c87c8e09b0f3c5718f7b337771ca95c
> 
> Cheng-Chia: You can test the patch yourself by booting the F18-Beta Live CD
> and patching localization.py there:
> 
> # yum update anaconda
> # yum install patch
> # patch -b /usr/lib64/python2.7/site-packages/pyanaconda/localization.py < localization-zh-Hant-TW-1.patch

Good! I tested the patch and found that it works well with both zh-CN and zh-TW, as expected. :)

The later steps worked fine, and no regressions occurred either.

Comment 17 Chris Lumens 2012-12-10 15:27:41 UTC
This is not a blocker for F18, and for F19 I want to get rid of mangleMap.  So I do not want to take any fixes that simply change things there.  I want to get rid of it entirely and just have anaconda use someone else's knowledge.  If you guys want to spend time working on this, working on it from that angle would be much more productive.

Comment 18 Cheng-Chia Tseng 2012-12-11 13:25:22 UTC
So we won't see any improvement even with the patch. All right, I accept that.

Now the obstacle is that we might not get CLDR changed. :S However, I will try filing a bug about zh_TW being displayed as "中文 (台湾)" instead of "中文 (臺灣)". I hope they will fix it.

If anaconda in F19 does not use python-babel or CLDR but still has this problem, I will file a bug against it at that time.

Comment 19 Steve Tyler 2012-12-12 18:58:38 UTC
pizza306: You can propose this bug as an F18 Blocker NTH by adding the "F18-accepted" bug alias[1] to the Blocks field for this bug. That will give QA a chance to review this bug for possible inclusion in F18.[2]

When proposing a blocker, there needs to be a rationale based on the release criteria:[3]
1. The installer looks unpolished to users who read Chinese.
2. Cannot be fixed with an update.

Cheng-Chia: Thanks for testing the patch. If you open a bug against CLDR with http://unicode.org, could you post a link here?

[1] http://fedoraproject.org/wiki/BugZappers/HouseKeeping/Trackers

[2] QA:SOP blocker bug process
https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process

[3] Fedora Release Criteria
https://fedoraproject.org/wiki/Fedora_Release_Criteria

Comment 20 Cheng-Chia Tseng 2012-12-13 05:13:50 UTC
(In reply to comment #19)
> pizza306: You can propose this bug as an F18 Blocker NTH by adding the
> "F18-accepted" bug alias[1] to the Blocks field for this bug. That will give
> QA a chance to review this bug for possible inclusion in F18.[2]
> 
> When proposing a blocker, there needs to be a rationale based on the release
> criteria:[3]
> 1. The installer looks unpolished to users who read Chinese.
> 2. Cannot be fixed with an update.
> 
> Cheng-Chia: Thanks for testing the patch. If you open a bug against CLDR
> with http://unicode.org, could you post a link here?

I don't know much about programming, so I searched the CLDR charts to see whether "台灣" appears as "台湾" in zh_TW.
However, I could not find zh_TW; there are only zh_Hant[0] and zh_Hant_TW. In zh_Hant the string is "台灣", so there is no problem there.

My guess is that python-babel prints "台湾" from the zh data rather than from zh_Hant, so I filed a bug against Babel[1].

However, further investigation is needed to find the source of the problem. I don't know whether my guess is right.

0. http://www.unicode.org/cldr/charts/summary/zh_Hant.html 
1. http://babel.edgewall.org/ticket/315
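
The guess above can be illustrated with a toy model of locale fallback. This is only a sketch under stated assumptions: it is not Babel's implementation, the data table contains nothing but the two Taiwan strings mentioned in this bug, and the names TERRITORY_NAMES, LIKELY_SCRIPT, naive_chain(), script_aware_chain() and territory_name() are all hypothetical. It shows how truncating zh_TW straight to zh would pick up the Simplified string, while inserting the script first would reach the zh_Hant data.

```python
# Toy data: only the two Taiwan strings discussed in this bug.
# Real CLDR data is far larger; this is not how Babel is implemented.
TERRITORY_NAMES = {
    "zh": {"TW": "台湾"},
    "zh_Hant": {"TW": "台灣"},
}

# Assumed default script per language/territory pair (hypothetical table).
LIKELY_SCRIPT = {("zh", "TW"): "Hant", ("zh", "CN"): "Hans"}


def _truncations(parts):
    """Build a fallback chain by dropping trailing subtags one at a time."""
    return ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]


def naive_chain(locale_code):
    """Fallback by truncation only: zh_TW -> zh."""
    return _truncations(locale_code.split("_"))


def script_aware_chain(locale_code):
    """Insert the likely script first: zh_TW -> zh_Hant_TW -> zh_Hant -> zh."""
    parts = locale_code.split("_")
    if len(parts) == 2:
        script = LIKELY_SCRIPT.get((parts[0], parts[1]))
        if script:
            parts = [parts[0], script, parts[1]]
    return _truncations(parts)


def territory_name(chain, territory):
    """Return the first name found for the territory along the chain."""
    for code in chain:
        names = TERRITORY_NAMES.get(code, {})
        if territory in names:
            return names[territory]
    return None


if __name__ == "__main__":
    print(territory_name(naive_chain("zh_TW"), "TW"))
    print(territory_name(script_aware_chain("zh_TW"), "TW"))
```

Whether python-babel 0.9.6 actually resolves zh_TW this way is exactly what still needs to be confirmed.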

Comment 21 pizza306 2012-12-13 14:10:18 UTC
Steve Tyler: blocks alias added. Thank you for all the guidance!

---
To Bug Reviewers:

I think this problem is worth checking again because:
>> 1. The installer looks unpolished to users who read Chinese.
This is important to Chinese/Taiwanese users; it may even make some new users think that Fedora doesn't have complete Traditional Chinese support.

>> 2. Cannot be fixed with an update.
Even if python-babel or CLDR can be fixed, we still have to wait until F19.
If they don't or can't change zh_TW to 中文 (台灣), we'll have a release with the wrong string, and it will probably persist in later releases.

Besides, we already have a patch that solves this problem, so is it possible to review this again? If not, I still hope this will be fixed in F19.

Comment 22 Adam Williamson 2012-12-19 19:18:17 UTC
Discussed at 2012-12-19 NTH review meeting: http://meetbot.fedoraproject.org/fedora-bugzappers/2012-12-19/f18final-blocker-review-6.2012-12-19-17.02.log.txt . As a bunch of North Americans / Europeans who don't know nothin' about nothin', we were pretty reluctant to make a call on this one, with the political considerations. We agreed to delay a decision on NTH here to see if we can get more clarity on this, jreznik will look into it.

Note that in oldUI, the analogous screen did not refer to territories by name: it simply had 'Chinese (Simplified)' and 'Chinese (Traditional)' (the first gave you zh_CN and the second gave you zh_TW, and if you were Singaporean or from Hong Kong you just dealt with it). This neatly avoided most of the political stuff as, no matter where you come from and what you think the status of Taiwan is and what language ought to be used to refer to it, you probably at least agree that both simplified and traditional Chinese *exist*.

But now we're referring to actual places, we get to deal with the messy political stuff.

Comment 23 Cheng-Chia Tseng 2012-12-20 13:35:25 UTC
I found that the CLDR charts provide "[zh] Chinese" and "[zh_Hant] Traditional Chinese", and the string for Taiwan in "[zh_Hant] Traditional Chinese" is in Traditional Chinese form, "台灣", which is what we would like to see.

I suspect that python-babel, or something behind it, prints the Taiwan string from the zh data instead of the zh_Hant data. This might be the source of the problem, but I'm not sure. I don't know much about programming or the output mechanism behind this issue.

In my opinion, "Simplified Chinese" and "Traditional Chinese" are a good way to describe the writing styles of Chinese. These could be extended to "Simplified Chinese (China)", "Simplified Chinese (Singapore)", "Traditional Chinese (Taiwan)" and "Traditional Chinese (Hong Kong)" to indicate which territory the translation terms are meant to fit. This wording is more neutral and more inclusive.
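
The naming scheme proposed here could be expressed as a small lookup that puts the script first and the territory second. This is a minimal sketch, not existing anaconda code: the tables and the menu_label() helper are hypothetical, and the pairing of zh_SG with Simplified and zh_HK with Traditional is an assumption based on common usage.

```python
# Hypothetical tables following the naming proposed in this comment.
SCRIPT_NAMES = {"Hans": "Simplified Chinese", "Hant": "Traditional Chinese"}
TERRITORY_NAMES = {
    "CN": "China",
    "SG": "Singapore",
    "TW": "Taiwan",
    "HK": "Hong Kong",
}
# Assumed script for each locale (not taken from any real data source).
LOCALE_SCRIPTS = {
    "zh_CN": "Hans",
    "zh_SG": "Hans",
    "zh_TW": "Hant",
    "zh_HK": "Hant",
}


def menu_label(locale_code):
    """Build a script-first label such as 'Traditional Chinese (Taiwan)'."""
    script = LOCALE_SCRIPTS[locale_code]
    territory = locale_code.split("_")[1]
    return "%s (%s)" % (SCRIPT_NAMES[script], TERRITORY_NAMES[territory])


if __name__ == "__main__":
    for code in sorted(LOCALE_SCRIPTS):
        print(code, "->", menu_label(code))
```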

Comment 24 Cheng-Chia Tseng 2012-12-20 13:47:56 UTC
Indeed, "Chinese (China)" and "Chinese (Taiwan)" are not appropriate. They do not show the writing style of each. And since we do not yet have mature support for Singapore, Hong Kong and Macau SAR, naming the options that way will confuse users there and leave them not knowing what to choose.

Comment 25 Cheng-Chia Tseng 2012-12-20 14:01:22 UTC
With the patch Steve provided applied, Anaconda will show "Chinese (Simplified Han, China)" and "Chinese (Traditional Han, Taiwan)" for zh_CN and zh_TW via the mangleMap mechanism. In addition, the native name is printed in Simplified Chinese for the former (China) and in Traditional Chinese for the latter (Taiwan).

This is not as good as what I proposed in Comment 23, but it does show the writing style as Simplified or Traditional. Chinese users in Singapore, Hong Kong and Macau SAR will be able to choose according to the writing style they use, knowing that the option will give them translation terms that fit China or Taiwan rather than their own location.

It might serve as a quick compromise workaround.

Comment 26 Chris Lumens 2013-01-02 15:49:21 UTC
Please see comment #17, though.  I don't want workarounds (this is not an approved blocker) and I don't want any changes to mangleMap except for deleting it entirely.  anaconda should not own this knowledge.  We should be able to use something else on the system.

Comment 27 Adam Williamson 2013-01-04 03:39:29 UTC
and I should have a flying magic golden pony, but right now there is no magic source of this information, so it's kind of irresponsible to just drop it from anaconda.

Comment 28 David Shea 2013-09-13 20:46:33 UTC
Anaconda is still not the place for this