+++ This bug was initially created as a clone of Bug #227473 +++

Description of problem:

Version-Release number of selected component (if applicable):
scim-chewing-0.3.1-10.el5

How reproducible:
always

Steps to Reproduce:
1. Log in with the zh_TW locale.
2. Start gedit.
3. Press Ctrl-Space to bring up the Chewing IM and type 'gji'.
4. Press SPACE SPACE to bring up the candidate window.

Actual results:
There are two 說 in the candidate list.

Expected results:

Additional info:

-- Additional comment from xwang on 2007-02-06 03:31 EST --
Created an attachment (id=147448)
screenshot

-- Additional comment from zhu on 2007-02-06 04:05 EST --
Well, these two characters look the same, but their values are in fact different: one is the Simplified Chinese version and the other is the Traditional Chinese version. If you type this in a Simplified Chinese environment, the second one shows as garbage, which indicates that the two are different. So this should not be a bug.

-- Additional comment from cchance on 2007-02-06 18:57 EST --
I agree with Hu. They are two different characters with different codepoints:
說: U+8AAA
説: U+8AAC
There are many similar cases in which Unicode treats characters with different strokes, or from different cultures (China/Taiwan/Japan/Korea/Hong Kong/etc.), as different characters. So this is probably not a bug.

-- Additional comment from tagoh on 2007-02-07 05:20 EST --
Well, just to add a comment from the usability aspect: what is actually the target language for chewing? If it is only Traditional Chinese, one of them should be removed from the dictionary. If both are targets, does chewing really want to support both in one input style or layout? Since similar glyphs have different codepoints, offering a character that does not fit the current language is not good usability. For example, one assumes one is inputting U+8AAA but it is actually U+8AAC. One may not realize that, since they look similar, but it might cause trouble sooner or later.
How does that sound to you? Is it still NOTABUG? We should prevent confusion as far as possible.

-- Additional comment from cchance on 2007-02-07 17:44 EST --
This behavior exists not only in chewing but also in Changjie, at least AFAIK. Chinese input involves more than just deciding whether Traditional or Simplified Chinese is to be used.
Firstly, what about users who do business in both China and Taiwan and who only know the Chewing input method? Picking either language to support is obviously not the full solution. I am not sure whether using a hotkey or a preference option to let the user switch between code ranges is a good idea.
Secondly, some characters are shared by both Simplified and Traditional Chinese. If we need to hide the characters that are not Simplified/Traditional Chinese, we have to have the list of which characters are used in each language/locale.
Thirdly, some characters are not Simplified/Traditional Chinese but are used in both of them, such as Japanese Hiragana and Katakana, Hangul, etc. Furthermore, Kanji (Japanese Chinese characters) and Hanja (Korean Chinese characters) are needed by those who need Chinese together with those languages.
Fourthly, if we create two tables for the two Chinese languages, any change to the input key combinations may double the maintenance cost.
Wider coverage should be the fundamental direction of improvement for all input methods (IMEs). Generally, character frequency is the current solution developed by upstream, which tries to be more flexible. I need to consult with upstream. This should be recognized as a new feature rather than an issue resolution. It should go in the devel branch but not in RHEL, IMHO.

-- Additional comment from cchance on 2007-02-07 17:49 EST --
If we analyze this and IF the result is positive for us to go for it, it might be a good idea to clone this feature request to scim-tables and other IMEs.

-- Additional comment from tagoh on 2007-02-07 22:46 EST --
OK, dealing with this as a feature would be good too. As a suggestion, if both characters appear in the candidate list at the same time, how about indicating in the candidate list whether each character is Traditional Chinese or Simplified Chinese? That should cause less confusion, and users could choose one easily. Anyway, just treating this as NOTABUG, on the grounds that it is likely to happen with that usage, doesn't make sense to me. Or do I just feel that way because I'm not a native speaker?

-- Additional comment from cchance on 2007-02-07 23:21 EST --
Apart from how much we need to invest to achieve the feature, all I am trying to express is: 'Many Chinese/Kanji/Hanja codepoints are actually the same word.' Though they might even have the same meaning, they are treated by the Unicode organization as different characters only because the strokes have minor differences. If we want Chewing to be smart, it has to be smart in these ways:
1. Learn from the user through frequency of usage.
2. Base itself on a charset standard for which characters are in which locale. (Question: should we refer to the Unicode standard's categories of characters, or to localized encodings, e.g. big5/gbk/gb18030, ISO2022-JP, JIS, etc.?)
3. We might let users customize their preferred characters (e.g. some might prefer typing Traditional Chinese but with specific academic words in Simplified Chinese).
4. Putting a note next to the characters in the candidate list is a good idea. OS X has some icons to indicate to the user that certain characters are not in the current locale's code range.
We just need to spend time researching how to implement this, especially to discover exactly which characters currently in the Chewing candidate list are only in either Simplified Chinese or Traditional Chinese, or are even foreign characters such as Japanese or Hangul. (i.e. we might need to analyze all Chinese characters to see whether each appears in just one language/locale, or is shared by both Traditional and Simplified Chinese, or even Japanese/Korean.)

-- Additional comment from cchance on 2007-02-08 00:13 EST --
Bug# 227466 is related to this bug: the range of characters covered by chewing is wider than the font glyph range.

-- Additional comment from cchance on 2007-02-08 00:38 EST --
Cloned this feature request to a new devel bug; please follow up there.
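The codepoint distinction discussed in the cloned comments can be checked with a few lines of Python. This is only a minimal sketch using the standard `unicodedata` module; the helper names `describe` and `same_after_normalization` are made up for illustration and are not part of chewing or SCIM.

```python
import unicodedata

# The two look-alike candidates named in the comments above.
TRAD = "\u8aaa"     # 說
VARIANT = "\u8aac"  # 説


def describe(ch):
    """Return the codepoint of a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


def same_after_normalization(a, b, form="NFKC"):
    """Check whether Unicode normalization would fold a and b together."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    for ch in (TRAD, VARIANT):
        print(ch, describe(ch))
    # Both are separate unified ideographs, so normalization keeps them apart.
    print(same_after_normalization(TRAD, VARIANT))
```

Since normalization does not merge the two, any filtering or labeling would have to come from the input method's own data rather than from a generic Unicode text operation.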
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it. If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.) Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.
IMHO we should first split the table file into smaller ones, one per charset. We need some documentation about the charset ranges. I am not sure whether the repertoires of charsets such as big5, gbk/gb18030, and iso-2022-jp/shift-jis are pure subsets of Unicode (UTF-8 being just one encoding of it). If so, it would be a simpler case.
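One rough way to explore the charset question is to ask Python's built-in codecs whether a given character can be encoded in each legacy charset. The sketch below assumes that Python's codec tables are an acceptable stand-in for the charset definitions, which may not hold for every vendor variant; `in_charset` and `charsets_of` are hypothetical helper names.

```python
# Codec names as registered in the Python standard library.
CHARSETS = ("big5", "gb2312", "gbk", "gb18030", "shift_jis", "iso2022_jp")


def in_charset(ch, codec):
    """Return True if Python's codec can encode the character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def charsets_of(ch, codecs=CHARSETS):
    """List the codecs (in the given order) that can encode the character."""
    return [c for c in codecs if in_charset(ch, c)]


if __name__ == "__main__":
    for ch in ("\u8aaa", "\u8aac", "\u8bf4"):  # 說, 説, 说
        print(ch, f"U+{ord(ch):04X}", charsets_of(ch))
```

A per-charset table could then be produced by running every character of the existing table through `charsets_of` and grouping the results, assuming the table can be read as Unicode text.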
Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
FYI, the table currently lives in libchewing-[v-r]/data/phone.cin*. Also, a TC <-> SC conversion filter is available in SCIM at the moment.
Hi,
I am thinking of a more generalized solution. We can label each character with the locales in which it appears. For example:
說: U+8AAA (zh_TW, kr) variant 説,说
説: U+8AAC (zh_CN, jp) variant 說,说
说: U+8BF4 (zh_CN) variant 說,説
In the SCIM settings, there should be a set of check boxes for the user to toggle the output he/she desires. What say you?
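The labeling idea could be modeled roughly as follows. This is a sketch only: the `LabeledChar` type, the `TABLE` contents, and `filter_candidates` are hypothetical, the locale labels are copied from the example in this comment rather than from any authoritative source, and the variant lists simply name the other two forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledChar:
    """A candidate character tagged with the locales it is used in."""
    char: str
    locales: frozenset
    variants: tuple = ()


# Labels taken from the example in the comment; not verified data.
TABLE = [
    LabeledChar("\u8aaa", frozenset({"zh_TW", "kr"}), ("\u8aac", "\u8bf4")),
    LabeledChar("\u8aac", frozenset({"zh_CN", "jp"}), ("\u8aaa", "\u8bf4")),
    LabeledChar("\u8bf4", frozenset({"zh_CN"}), ("\u8aaa", "\u8aac")),
]


def filter_candidates(candidates, enabled_locales):
    """Keep candidates whose labels overlap the locales the user ticked."""
    enabled = set(enabled_locales)
    return [c.char for c in candidates if c.locales & enabled]


if __name__ == "__main__":
    # A zh_TW-only user would see a single candidate instead of three.
    print(filter_candidates(TABLE, {"zh_TW"}))
```

Each check box in the settings dialog would correspond to one entry in `enabled_locales`; ticking more boxes widens the candidate list without needing a separate table per locale.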
Exactly.
If there is any requirement to see characters that look the same at the same time, you could offer that as one of the options in the IME's preferences. I'm not sure there is such a requirement, and there does not seem to be one for the above case at least. Well, I'd rather the IME itself deal with this based on the current locale or input layout. Is there any layout that both zh_CN and zh_TW use? If not, I don't think that option really helps with this kind of problem. Speaking of the above characters, do people in the zh_TW locale really want to see U+8AAC and U+8BF4 in their input, as the option you suggest would allow?
I agree with Tagoh's view, as the functionality only benefits Hanzi users. Anyway, what I will do is this: the IME enables the common IRG sources as the locale default. For example, the default for CN is G0; for TW it is T1 and T2; for HK it is T1, T2, and H; for JP it is J0. There will also be a GUI in the IME to enable the rest of the common sources. An "Advanced settings" button in this GUI will show the advanced IRG sources.
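The per-locale defaults described here might look like the sketch below. The region-to-source mapping is copied from this comment; everything else is an assumption for illustration, including the function names and the `SAMPLE` source tags, which are placeholders and not looked up in the Unihan database.

```python
# Default IRG sources per region, as listed in the comment above.
DEFAULT_IRG_SOURCES = {
    "CN": {"G0"},
    "TW": {"T1", "T2"},
    "HK": {"T1", "T2", "H"},
    "JP": {"J0"},
}

# Placeholder (character, sources) pairs; the tags are illustrative only.
SAMPLE = [
    ("\u8aaa", {"T1"}),
    ("\u8aac", {"J0"}),
    ("\u8bf4", {"G0"}),
]


def enabled_sources(region, extra=()):
    """Combine a region's defaults with sources the user enabled in the GUI."""
    return set(DEFAULT_IRG_SOURCES.get(region, set())) | set(extra)


def visible(candidates, region, extra=()):
    """Keep candidates that belong to at least one enabled IRG source."""
    active = enabled_sources(region, extra)
    return [ch for ch, sources in candidates if active & set(sources)]


if __name__ == "__main__":
    print(visible(SAMPLE, "TW"))                # defaults only
    print(visible(SAMPLE, "TW", extra={"G0"}))  # user also enabled G0
```

Keeping the defaults in one mapping and treating the GUI choices as additions means a user who needs both Traditional and Simplified forms only has to enable one extra source.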
requested by Jens Petersen (#27995)
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Moving to ibus-chewing.
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Any progress on this? I guess we should add the FutureFeature tag so that housekeeping does not close it.