Bug 438662 - ordering of candidates is wrong in Zhu Yin
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: scim-tables
Version: 9
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Caius Chance
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-03-24 10:00 UTC by Allison Lee
Modified: 2008-06-19 09:19 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-06-19 06:39:52 UTC
Type: ---
Embargoed:


Attachments
bad candidates on first page (27.02 KB, image/jpeg)
2008-03-24 10:00 UTC, Allison Lee
Zhu Yin table in FC-5. (483.95 KB, application/octet-stream)
2008-06-19 06:20 UTC, Caius Chance
Zhu Yin table (Big) in FC-5. (1.19 MB, application/octet-stream)
2008-06-19 06:21 UTC, Caius Chance

Description Allison Lee 2008-03-24 10:00:32 UTC
Description of problem:
When SCIM is switched to the Zhu Yin IM, the ordering of candidates is
different compared to SCIM in FC5, FC6, and F7. Infrequently used characters
appear earlier instead of later.  FC5 is reasonable, but FC6 and F7 are "wrong".

For example, when I type the keys for the character "今", it should be the
first candidate on the first page, but it is not there anymore. It looks like
the positions of all candidates have been shuffled (see attachment).

Based on FC5 and Windows, users expect candidate ordering prioritized by: 
1. characters commonly used 
2. number of strokes


Below are the scim packages I have:
  scim-1.4.7-7.fc8
  scim-m17n-0.2.2-2.fc8
  scim-tables-0.5.7-3.fc7
  scim-libs-1.4.7-7.fc8
  scim-tables-chinese-0.5.7-3.fc7
  scim-bridge-0.4.14-1.fc8
  scim-chewing-0.3.1-10.fc8

This is a serious problem for Zhu Yin users, who are major IM users of
Traditional Chinese. They are not able to type in Chinese well because they will
waste time looking around for suitable candidates.  For this reason, it is
unreasonable to upgrade away from FC5.

Version-Release number of selected component (if applicable):


How reproducible:
Always. When I type the keys for the character "今", it should be the first
candidate on the first page, but it is not there anymore. It looks like the
positions of all candidates have been shuffled.

Steps to Reproduce:
1. Switch to Zhu Yin
2. Type "rup"
  
Actual results:
Character "今" is not the first candidate on the first page.  It is on a later page.

Expected results:
It should be on the first page.

Additional info:
LANG=zh_TW.UTF-8
GDM_LANG=zh_TW.UTF-8

Comment 1 Allison Lee 2008-03-24 10:00:32 UTC
Created attachment 298881
bad candidates on first page

Comment 2 Caius Chance 2008-05-01 00:04:37 UTC
Hi Peng, do you know how Zhu Yin orders the candidates it loads? Does it rely
on the order in the table files?

Comment 3 Peng Huang 2008-05-03 12:19:24 UTC
I remember a frequency can be specified for every phrase in the table file.
The table engine will sort the candidates by frequency. Does the Zhu Yin table
provide the correct frequencies for phrases or chars?
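
A minimal sketch of the behaviour described above, written in Python rather than taken from the table engine's actual code. It assumes each table entry is a (key, phrase, frequency) triple and that entries with equal frequency keep their table-file order; the entries and frequency values are made up for illustration.

```python
from typing import List, Tuple

# (key, phrase, frequency): a hypothetical in-memory layout, not the engine's.
Entry = Tuple[str, str, int]


def candidates_for(key: str, table: List[Entry]) -> List[str]:
    """Return the phrases for `key`, highest frequency first.

    sorted() is stable, so entries with equal frequency keep the order
    they had in the table.
    """
    matches = [entry for entry in table if entry[0] == key]
    ranked = sorted(matches, key=lambda entry: entry[2], reverse=True)
    return [phrase for _, phrase, _ in ranked]


# Illustrative data only: the frequencies are invented.
table_with_freq = [("rup", "巾", 10), ("rup", "今", 900), ("rup", "金", 500)]
table_without_freq = [("rup", "巾", 0), ("rup", "今", 0), ("rup", "金", 0)]

with_freq = candidates_for("rup", table_with_freq)        # frequency decides
without_freq = candidates_for("rup", table_without_freq)  # table order decides
```

If the table carries no usable frequencies, the second call shows why the order of lines in the table file ends up deciding what the user sees first.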


Comment 4 Bug Zapper 2008-05-14 06:48:37 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 5 Caius Chance 2008-06-18 05:33:56 UTC
I compared the Zhu Yin table files in rawhide and FC-4. In rawhide, "今" is
ranked lower than those characters, but it is the first candidate for "rup" in
FC-4. The order in the table file should be the default order in which the
candidates are shown in the look-up window.

I need to confirm whether the latest Zhu Yin has a character/phrase frequency
mechanism that could adjust the candidate order through learning. Besides,
frequently used chars like "今" should generally rank higher initially. We
should have a more practical table file.

I wonder what the order of characters in the latest table file is based on.
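
The comparison above can be sketched as a small Python helper that reports where a character sits among the entries for one key. It assumes the table entries sit between BEGIN_TABLE and END_TABLE markers, one per line, as tab-separated key, phrase and optional frequency; the function name and the sample lines are hypothetical.

```python
from typing import Iterable, Optional


def rank_in_table(lines: Iterable[str], key: str, phrase: str) -> Optional[int]:
    """Return the 1-based position of `phrase` among the entries for `key`.

    Assumed layout: entries between BEGIN_TABLE and END_TABLE, each line
    "key<TAB>phrase[<TAB>frequency]". Returns None if the pair is absent.
    """
    in_table = False
    position = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "BEGIN_TABLE":
            in_table = True
            continue
        if line == "END_TABLE":
            break
        if not in_table or not line:
            continue
        fields = line.split("\t")
        if len(fields) < 2 or fields[0] != key:
            continue
        position += 1
        if fields[1] == phrase:
            return position
    return None


# Made-up sample standing in for a table file's contents.
sample = [
    "BEGIN_TABLE\n",
    "rup\t巾\t0\n",
    "rup\t今\t0\n",
    "ru\t幾\t0\n",
    "END_TABLE\n",
]
rank = rank_in_table(sample, "rup", "今")
```

Passing an open file object for each release's ZhuYin.txt.in instead of `sample` would give the two ranks to compare.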

Comment 6 Caius Chance 2008-06-18 06:15:10 UTC
The character order was changed in this revision:

Revision 1.9
Mon Jan 16 07:06:14 2006 UTC (2 years, 5 months ago) by suzhe
Branch: MAIN
Changes since 1.8: +43763 -43501 lines

Update ZhuYin tables according to the latest table from CMEX.

(http://scim.cvs.sourceforge.net/scim/scim-tables/tables/zh/ZhuYin.txt.in?view=log#rev1.12)

I haven't estimated the workload of reverting to the version before this
change with all the later fixes cleanly merged.

Comment 7 Caius Chance 2008-06-18 07:16:12 UTC
Hi Allison,

I could rearrange the candidate order based on the table in FC-5. However,
upstream has made changes since then that might have minor effects on specific
characters.

For example, "兒" was moved from "-" to "-6", so the candidates after 兒 move
up towards the head of the candidate list. To ensure every character is at
exactly the same position in your current Fedora release as it was in FC-5, I
suggest you recompile scim-tables with the ZhuYin.txt.in and ZhuYin-Big.txt.in
copied over from the FC-5 sources (tarball/src.rpm).
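
To illustrate the shift described above, here is a small Python sketch. The candidate list for "-" and its order are invented for the example, and `move_entry` is a hypothetical helper, not part of scim-tables.

```python
from typing import Dict, List


def move_entry(table: Dict[str, List[str]], phrase: str,
               old_key: str, new_key: str) -> Dict[str, List[str]]:
    """Return a copy of `table` with `phrase` moved from old_key to new_key."""
    updated = {key: list(phrases) for key, phrases in table.items()}
    updated[old_key].remove(phrase)
    updated.setdefault(new_key, []).append(phrase)
    return updated


# Hypothetical candidate order under "-"; not the real table contents.
before = {"-": ["而", "兒", "耳", "二"]}
after = move_entry(before, "兒", "-", "-6")

# Every candidate that followed 兒 under "-" is now one position earlier.
shift = before["-"].index("耳") - after["-"].index("耳")
```

Candidates ahead of 兒 keep their positions; only the ones after it shift, which is why the effect is limited to specific characters.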

Thank you very much.

Comment 8 Caius Chance 2008-06-19 06:19:41 UTC
I decided to revert the Zhu Yin table to the one in FC-5. Because of the huge
amount of data (~58k entries), fixing the latest table manually on my own isn't
feasible in a short period.

Comment 9 Caius Chance 2008-06-19 06:20:42 UTC
Created attachment 309816
Zhu Yin table in FC-5.

Comment 10 Caius Chance 2008-06-19 06:21:11 UTC
Created attachment 309817
Zhu Yin table (Big) in FC-5.

Comment 11 Caius Chance 2008-06-19 06:39:52 UTC
Built to rawhide:

http://koji.fedoraproject.org/koji/buildinfo?buildID=53197

Comment 12 Allison Lee 2008-06-19 09:19:06 UTC
Hi Caius,

I compiled ZhuYin.txt.in and ZhuYin-Big.txt.in from FC-5 and Zhu Yin works fine
now. Candidate orders are back to how they were. The position of "兒" is also
correct.

Thank you very much! You have been great in trying to get this problem fixed. 

