Description of problem:
During the F19 L10n QA Test Day event, I found that in the Timezone tab many cities are placed under Region without country names, while "America" and "Australia" do appear. Omitting the country names does not look good.

Version-Release number of selected component (if applicable):
system-config-date-1.10.5-2.fc19.noarch

How reproducible:
always

Steps to Reproduce:
1. Run system-config-date

Actual results:
Beijing, Tokyo, Seoul, etc. are placed under 'Asia' with no country name.
Monaco, Paris, Madrid, London, etc. are placed under 'Europe' with no country name.

Expected results:
Country names should appear, such as Japan, China, Korea, France, Germany, Spain, etc.

Additional info:
System-config-date orders cities exactly as listed in /usr/share/zoneinfo/zone.tab and "America" and "Australia" are used as continent names here. This file comes from the tzdata package and any changes would need to be acceptable upstream... I suspect that a proposed change to introduce a "country level" wouldn't be accepted because there are disputes about e.g. to which country some regions belong and so forth, but I'll change the component to tzdata regardless, perhaps its maintainer can provide some more details.
The logic behind this is that cities are more stable than country borders. The border disputes that Nils mentions are also a good reason. Where there are individual countries mentioned, they are often for backward compatibility with ancient systems. Sometimes there's grouping where that's practical, e.g. we have a bunch of zones under America/Indiana/ and America/Argentina/. On the other hand we don't have America/Mexico/ even though there's a bunch of Mexican zones. It's really quite arbitrary. The reason this can be arbitrary is that individual users shouldn't need to care about the zone naming at all. Those are just internal identifiers, much like names of ABI symbols in a library. We might just as well name those things with UUID strings and it would be perfectly OK. What users should care about is what's in zone.tab, which is what system-config-date should present to the user.
(Sorry for the delay, was on vacation.)

(In reply to comment #2)
> what users should care about is what's in zone.tab, which is what
> system-config-date should present to the user.

Which is what it does, the arbitrary identifiers are in zone.tab (as well) ;-).
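For readers unfamiliar with the file, here is a rough Python sketch of how zone.tab rows map to the Region/city grouping discussed above. It assumes the usual tab-separated layout (country code, coordinates, TZ identifier, optional comment); the sample rows and the helper names `parse_zone_tab` and `group_by_region` are illustrative only and are not taken from system-config-date.

```python
from collections import defaultdict

# Illustrative sample in zone.tab layout: code, coordinates, TZ, optional comment.
SAMPLE_ZONE_TAB = """\
# country-code\tcoordinates\tTZ\tcomments
JP\t+353916+1394441\tAsia/Tokyo
FR\t+4852+00220\tEurope/Paris
US\t+394606-0860929\tAmerica/Indiana/Indianapolis\tEastern - IN (most areas)
"""

def parse_zone_tab(text):
    """Return (country_code, coordinates, tz, comment) tuples, skipping '#' lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        code, coords, tz = fields[:3]
        comment = fields[3] if len(fields) > 3 else ""
        rows.append((code, coords, tz, comment))
    return rows

def group_by_region(rows):
    """Group zones by the part of the TZ identifier before the first '/'."""
    regions = defaultdict(list)
    for code, _coords, tz, _comment in rows:
        region, _, rest = tz.partition("/")
        regions[region].append((rest, code))
    return dict(regions)

if __name__ == "__main__":
    for region, zones in group_by_region(parse_zone_tab(SAMPLE_ZONE_TAB)).items():
        print(region, zones)
```

The country code is present in each row, but the first path component of the TZ identifier is a continent or ocean, which is why a list built from the identifier alone shows no country level.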
Created attachment 756617 [details]
Cities localized in Kanji appear at the bottom of the list

OK, I can understand the border disputes for some regions.
Let me report just one thing I noticed as a Japanese locale user. With the 'Japanese' locale, cities localized in Kanji always come at the bottom of the list, while most cities localized in Katakana appear in 50-on order as expected. This may be a Japanese-specific issue though.

The following cities are localized in Kanji and appear at the bottom of each continent:
Asia - Hong Kong, Chungking, Shanghai, Taipei, Tokyo and Pyongyang (see attached screenshot)
America - Southern Prince'
Antarctica - Showa station and south pole
I'm sorry, but I don't seem to follow. Are you saying that the ordering of Katakana words should be different with respect to Kanji words (i.e. they should appear at the top of the list, or should be intermixed according to pronunciation, etc.)? Or that those locations have an existing Kanji rendering, and shouldn't be transcribed using Katakana?
(In reply to Noriko Mizumoto from comment #4)
> Created attachment 756617 [details]
> cities localized in Kanji are appeared at the bottom of the list
>
> Ok, I can understand the border disputes for some regions.
> Let me report just one thing I noticed for Japanese locale user.
> Using locale 'Japanese', Cities localized in Kanji always come at the bottom
> of the list, while most of cities localized in Katakana are appeared in
> 50-on order as expected. This may be Japanese specific issue though.

The sorting (within geographic time zones) is done by the translated name of the time zone, and depends on the collation rules of the locale set -- apparently the rules for Japanese are to sort Kanji after Katakana.
How would you sort words written in Kanji? If you want to sort them phonetically, it would be necessary to know the pronunciation of the city names written in Kanji. But Japanese pronunciation is not very regular. For example, to sort a phonebook with person names phonetically, it is necessary to add the readings as well. Is 河野 pronounced かわの or こうの or こおの? How should the sort algorithm know that if the correct reading is not given as well? In my Android phone, the Japanese names are only sorted correctly if I enter not only the Kanji but also the readings in Hiragana. Without that it is not really possible to sort Japanese well, I think.
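To make the point about readings concrete, here is a minimal Python sketch. It assumes every name comes with a hand-entered kana reading; the `READINGS` table and the helpers `to_hiragana` and `sort_by_reading` are hypothetical, and plain code point order of the folded kana only approximates real Japanese collation.

```python
# Hypothetical data: each display name paired with a hand-entered kana reading.
READINGS = {
    "東京": "とうきょう",
    "上海": "しゃんはい",
    "香港": "ほんこん",
    "台北": "たいぺい",
    "リヤド": "りやど",
}

def to_hiragana(text):
    """Fold katakana (U+30A1..U+30F6) to hiragana so both scripts compare alike."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )

def sort_by_reading(names, readings):
    """Sort by reading; a name without a known reading is compared as written."""
    return sorted(names, key=lambda name: to_hiragana(readings.get(name, name)))

if __name__ == "__main__":
    print(sort_by_reading(list(READINGS), READINGS))
```

Without the `READINGS` table there is nothing to compute the key from, which is the same limitation as in the phonebook example.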
(In reply to Noriko Mizumoto from comment #4)
> Created attachment 756617 [details]
> cities localized in Kanji are appeared at the bottom of the list
>
> Ok, I can understand the border disputes for some regions.
> Let me report just one thing I noticed for Japanese locale user.
> Using locale 'Japanese', Cities localized in Kanji always come at the bottom
> of the list, while most of cities localized in Katakana are appeared in
> 50-on order as expected. This may be Japanese specific issue though.
>
> The following are cities localized in Kanji and appeared at the bottom of
> each continent.
> Asia - Hong Kong, Chungking, Shanghai, Taipei, Tokyo and Pyongyang (see
> attached screenshot)
> America - Southern Prince'
> Antarctica - Showa station and south pole

$ echo -e "上海\nラングーン\n香港\nリヤド\n重慶\n平壌\n台北\n東京\nヴィエンチャン\n" | LC_ALL=ja_JP.UTF-8 sort
ラングーン
リヤド
ヴィエンチャン
香港
重慶
上海
台北
東京
平壌
mfabian@ari:~

So that is just the way glibc sorts this in ja_JP.UTF-8 locale.

Of course this is not nice for the names written in Kanji, but how to do this better without knowing the readings?

But glibc doesn’t even sort kana only nicely:

mfabian@ari:~
$ echo -e "ウ\nう\nた\nカ\nラ\nら\nワ\nわ\nヲ\nを\nヴ\nゔ\nヤ\nや\nか\nモ\nも\nヒ\nひ\nナ\\nな\nア\nあ\nエ\nえ\nイ\nい\nサ\nさ\n" | LC_ALL=ja_JP.UTF-8 sort
ゔ
あ
い
う
え
か
さ
た
な
ひ
も
や
ら
わ
を
ア
イ
ウ
エ
カ
サ
ナ
ヒ
モ
ヤ
ラ
ワ
ヲ
ヴ
mfabian@ari:~
$

All the hiragana are above all the katakana, this is not very nice. And in hiragana, ゔ is at the top above あ, in katakana ヴ is at the end after ヲ. So not even the kana are sorted nicely.

Correct order would be:

 1 あ   2 ア   3 い   4 イ   5 う   6 ウ   7 ゔ   8 ヴ
 9 え  10 エ  11 か  12 カ  13 さ  14 サ  15 た  16 な
17 ナ  18 ひ  19 ヒ  20 も  21 モ  22 や  23 ヤ  24 ら
25 ラ  26 わ  27 ワ  28 を  29 ヲ

(Sorted with libicu using http://minaret.info/test/sort.msp)

So glibc does not even sort the kana correctly. libicu does a better job sorting Japanese, but to sort words in Kanji correctly, I think one needs to know the correct readings.
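As an illustration of why the interleaved order is the expected one, here is a small Python sketch of a three-level sort key (base kana first, then voicing, then script). It only approximates the idea behind multi-level collation as done by libicu; it is not ICU's algorithm, and the helper name `kana_key` is made up for this example.

```python
import unicodedata

DAKUTEN = "\u3099"     # combining voiced sound mark
HANDAKUTEN = "\u309a"  # combining semi-voiced sound mark

def kana_key(text):
    """Build a key comparing base kana first, then voicing, then script."""
    primary, secondary, tertiary = [], [], []
    # NFD splits e.g. ゔ into う + dakuten and ヴ into ウ + dakuten.
    for ch in unicodedata.normalize("NFD", text):
        if ch in (DAKUTEN, HANDAKUTEN):
            if secondary:
                # Record the mark on the preceding base character.
                secondary[-1] = 1 if ch == DAKUTEN else 2
            continue
        is_katakana = "\u30a1" <= ch <= "\u30f6"
        # Fold katakana onto hiragana so both scripts share a primary weight.
        primary.append(chr(ord(ch) - 0x60) if is_katakana else ch)
        secondary.append(0)
        tertiary.append(1 if is_katakana else 0)
    return (primary, secondary, tertiary)

if __name__ == "__main__":
    kana = list("ウうたカラらワわヲをヴゔヤやかモもヒひナなアあエえイいサさ")
    print(" ".join(sorted(kana, key=kana_key)))
```

With this key, ゔ and ヴ sort directly after う and ウ instead of at the extremes, and each hiragana is followed by its katakana counterpart.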
You can also try this to see how libicu sorts Japanese: http://demo.icu-project.org/icu-bin/locexp?_=ja&d_=en&x=col
Created attachment 756966 [details] icu-japanese-sorting.png
The strings in the list are eventually sorted using g_utf8_collate() from glib (I don't use a custom sorting function), which in turn uses wcscoll() from glibc on current Linux systems (i.e. here). I'm not sure if there is a viable way to fix collation in glibc, even just for kana. An alternative would be to use libicu for collating if it is present (i.e. try to dlopen() it), but that would result in different sorting depending on whether libicu is installed.
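A rough Python analogue of the "use libicu if present" idea, for illustration only: it assumes the optional PyICU binding (`icu` module) stands in for the dlopen() of libicu and falls back to the C library's collation via locale.strxfrm(). The function name `make_collation_key` is hypothetical, and none of this is what system-config-date actually does.

```python
import locale

def make_collation_key(locale_name="ja_JP"):
    """Return (key_function, backend_name), preferring ICU when it is importable."""
    try:
        import icu  # PyICU; optional and possibly not installed
    except ImportError:
        try:
            # Use the collation rules of the user's environment, if available.
            locale.setlocale(locale.LC_COLLATE, "")
        except locale.Error:
            pass  # keep the default "C" collation
        return locale.strxfrm, "libc"
    collator = icu.Collator.createInstance(icu.Locale(locale_name))
    return collator.getSortKey, "icu"

if __name__ == "__main__":
    key, backend = make_collation_key()
    print(backend, sorted(["東京", "リヤド", "ラングーン"], key=key))
```

The drawback mentioned above shows up directly: the same list can come out in a different order depending on which backend gets picked.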
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
Ping. Is there any progress on this problem?
Which issue do you mean exactly? Talking about the original issue (no country names), this is not a bug as I've explained in comment #1. This is basically just waiting for confirmation from the tzdata maintainer to be closed IMO. Patsy, welcome aboard BTW and what's your take? If you refer to the Kanji vs. Katakana sorting issue from comment #4 ff., this is a different bug that should be opened (or cloned from this one for reference) against glibc (I guess).
(In reply to Nils Philippsen from comment #15)
> Which issue do you mean exactly?
[...]
> If you refer to the Kanji vs. Katakana sorting issue from comment #4 ff.,
> this is a different bug that should be opened (or cloned from this one for
> reference) against glibc (I guess).

Yes, a different bug against glibc for that problem would be nice.
Thanks for the input folks - and for the "welcome"! I'm going to close this and open a glibc bug based on comment 4. -Patsy
Here's a link to the glibc BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1042896