Bug 54087
| Field | Value |
|---|---|
| Summary | ttmkfdir error with CJK fonts |
| Product | [Retired] Red Hat Linux |
| Component | freetype |
| Version | 9 |
| Hardware | i386 |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Reporter | Won-kyu Park <wkpark> |
| Assignee | Carl Worth (Ampere) <cworth> |
| QA Contact | Brock Organ <borgan> |
| CC | llch, pcormier, tagoh, ynakai |
| Doc Type | Bug Fix |
| Last Closed | 2003-08-30 10:38:51 UTC |
| Bug Blocks | 57818, 61769 |
Description
Won-kyu Park
2001-09-27 10:07:25 UTC
*** Bug 54089 has been marked as a duplicate of this bug. ***

Leon, can you reproduce and confirm this problem?

The patch I submitted above is incorrect. Please change

    + cur_enc->size = (startptr == endptr) ? i1 - 1: (i1 - 1 << 8) + i2;

to

    + cur_enc->size = (startptr == endptr) ? i1: (i1 - 1 << 8) + i2;

It works for me.

The bug exists in the current version of ttmkfdir for ttfonts-{ko,ja,zh,..}. ttmkfdir uses character matching to recognize the encoding. Because CJK fonts (like gulim) usually don't have the full set of characters, ttmkfdir will not recognize the encoding. There is an option, --max-missing, that sets the maximum number of missing characters allowed per encoding (default is 5). If you set it high enough, ttmkfdir is able to recognize the encoding. yshao has made a patch that adds one more option to ttmkfdir, for setting the maximum percentage of missing characters allowed per encoding. Because it is ratio based, it will work better.

Ok. Currently Gulim.ttf (in ttfonts-ko) has full set of KSC5601(KSX1001) and does not have full set of iso10646. So ttmkfdir has to emit 'ksc5601.1987'. (The same is true of Wadalab in ttfonts-ja for jisx-20??.) But ttmkfdir overestimates the number of characters needed for each charset and always fails against CJK TTFs. (The above patch corrects this problem.)

I made a patch. With it applied, ttmkfdir generates fonts.dir by checking the CodePage from the OS/2 table. As far as I have checked, it seems to be ok for all of CJK. Any comments?

Created attachment 42311
fix patch
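
The difference between the absolute --max-missing limit and the ratio-based limit mentioned above can be sketched in C. This is only an illustration and not ttmkfdir's actual code: both function names and the max_percent parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: accept an encoding when at most max_missing of its
 * characters are absent from the font (absolute limit). */
bool accept_by_count(size_t missing, size_t max_missing)
{
    return missing <= max_missing;
}

/* Hypothetical helper: accept an encoding when at most max_percent percent
 * of its characters are absent from the font (ratio-based limit). */
bool accept_by_ratio(size_t missing, size_t encoding_size, unsigned max_percent)
{
    assert(encoding_size > 0);
    assert(missing <= encoding_size);
    /* missing / encoding_size <= max_percent / 100, kept in integers. */
    return missing * 100 <= (size_t)max_percent * encoding_size;
}
```

With made-up numbers, a font missing 40 characters of an 8000-character encoding fails an absolute limit of 5 but passes a 1% ratio limit, while a font missing 40 characters of a 200-character encoding fails both.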
Ok, I've made the change to gbk-0.enc in XFree86 4.2.0-6.21 in rawhide, and also this change:

    cur_enc->size = (startptr == endptr) ? i1: (i1 - 1 << 8) + i2;

Please note that ttmkfdir is now part of the XFree86-font-utils package and no longer in the freetype package. tagoh - your patch is not applied yet. I'd like to see if the first two fixes fix this if possible. I'll need you guys to test this stuff out to ensure whatever encodings are needed are there, yada yada. Thanks.

If encodings.dir has jisx entries only, ttmkfdir outputs jisx entries, but not jisx0201.1976-0. However, when encodings.dir is made by mkfontdir in the xfs initscript, it doesn't. I will make ttfonts-ja without fonts.{dir,scale} in the meantime. BTW, kochi fonts don't support the microsoft-cp1251 encoding, but it is output. This problem is exactly the same as Bug#57818.

Patch does not fix ttfonts-{zh_CN,zh_TW}. Fixing the formula is good, but the main reason is that some of the TTFs we have are missing many more glyphs than is allowed for a single encoding. Tagoh's patch (codepage detection from the OS/2 table) does the work. However, it will have some problems with newer and older encodings for a single language. For example:

gb18030.2000-1 has over 20000 glyphs
gbk-0 has over 10000 glyphs
gb2312.1980-0 has over 2000 glyphs

If we assign all three charsets to a zh_CN TTF font which only has 2000 glyphs, that is not sane. The same goes for the kr encodings and probably jp. We should at least check how many glyphs a font has as well as doing codepage detection (tagoh's patch).

> Currently Gulim.ttf (in ttfonts-ko) has full set of KSC5601(KSX1001)
> and does not have full set of iso10646.
No single font has all the glyphs necessary for ISO 10646
(the number of glyphs necessary is larger than the number of code
points in ISO 10646). Anyway, does gulim.ttf in ttfonts-ko
have the full set of 11,172 precomposed Hangul syllables as
defined in ISO 10646/Unicode? If it does, ttmkfdir
should emit not only a ksc5601.1987-0 entry but also
entries for ksc5601.1992-3 (JOHAB as used on Solaris 8 and
supported by Mozilla), the encoding for which is included
in XF86 4.2, and iso10646-1.
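
The check asked about above (whether a font has all 11,172 precomposed Hangul syllables) can be sketched in C. The syllables occupy U+AC00 through U+D7A3 in Unicode. The coverage bitmap (one bit per BMP code point) and all function names are assumptions made for illustration. A real tool would query the font's character map instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BMP_BYTES    (0x10000 / 8)      /* one bit per BMP code point */
#define HANGUL_FIRST 0xAC00u            /* first precomposed syllable */
#define HANGUL_LAST  0xD7A3u            /* last precomposed syllable */
#define HANGUL_COUNT (19u * 21u * 28u)  /* initials * medials * finals = 11172 */

/* Hypothetical coverage model: mark a code point as having a glyph. */
void coverage_set(uint8_t *bmp, uint32_t cp)
{
    assert(cp < 0x10000);
    bmp[cp >> 3] |= (uint8_t)(1u << (cp & 7u));
}

/* Hypothetical coverage model: test whether a code point has a glyph. */
bool coverage_has(const uint8_t *bmp, uint32_t cp)
{
    assert(cp < 0x10000);
    return ((bmp[cp >> 3] >> (cp & 7u)) & 1u) != 0;
}

/* Count the precomposed Hangul syllables the font lacks. */
size_t missing_hangul_syllables(const uint8_t *bmp)
{
    size_t missing = 0;
    for (uint32_t cp = HANGUL_FIRST; cp <= HANGUL_LAST; cp++)
        if (!coverage_has(bmp, cp))
            missing++;
    return missing;
}
```

A result of 0 would mean the font has the full precomposed set.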
IMHO, having only a fraction of the glyphs for a given
character set should not always prevent ttmkfdir from producing an
entry for that character set for a given font. For instance,
a TrueType font with only the Latin, Cyrillic, and Greek alphabets
and other characters (e.g. genuine punctuation marks, as opposed to
their crippled ASCII counterparts) used in European writing systems
is very useful for European users and should be
presented as an iso10646-1 font although it may have only 10% or less
of the glyphs for the full iso10646-1. My point is that
ttmkfdir needs to have a kind of 'useful Unicode subset' concept
and produce an iso10646-1 entry for a font which has all the
glyphs for that subset even though its repertoire is limited
to a small fraction of iso10646-1.
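
One way to picture the 'useful Unicode subset' idea is the C sketch below. It is not how ttmkfdir works. The range list is an illustrative guess at a European subset, and the coverage bitmap and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of Unicode code points. */
struct cp_range { uint32_t first, last; };

/* Illustrative subset only (an assumption, not taken from ttmkfdir). */
static const struct cp_range EUROPEAN_SUBSET[] = {
    { 0x0020, 0x007E },  /* printable ASCII */
    { 0x00A0, 0x00FF },  /* Latin-1 supplement */
    { 0x0391, 0x03A1 },  /* Greek capitals Alpha..Rho */
    { 0x03A3, 0x03A9 },  /* Greek capitals Sigma..Omega */
    { 0x03B1, 0x03C9 },  /* Greek small alpha..omega */
    { 0x0410, 0x044F },  /* basic Cyrillic capitals and small letters */
    { 0x2018, 0x201F },  /* typographic quotation marks */
};
#define EUROPEAN_SUBSET_LEN (sizeof EUROPEAN_SUBSET / sizeof EUROPEAN_SUBSET[0])

/* Coverage is modelled as one bit per BMP code point (8192 bytes). */
void glyph_mark(uint8_t *coverage, uint32_t cp)
{
    assert(cp < 0x10000);
    coverage[cp >> 3] |= (uint8_t)(1u << (cp & 7u));
}

bool glyph_present(const uint8_t *coverage, uint32_t cp)
{
    assert(cp < 0x10000);
    return ((coverage[cp >> 3] >> (cp & 7u)) & 1u) != 0;
}

/* True when every code point of every range has a glyph. */
bool covers_subset(const uint8_t *coverage,
                   const struct cp_range *ranges, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (uint32_t cp = ranges[i].first; cp <= ranges[i].last; cp++)
            if (!glyph_present(coverage, cp))
                return false;
    return true;
}
```

Under this sketch, a font would get an iso10646-1 entry when covers_subset() returns true for the chosen subset, regardless of how small a share of the full repertoire it covers.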
I'm very sorry for my comment, which turned out to be all but irrelevant, especially because the iso10646-1 patch has already been applied. I should have tried the newest ttmkfdir from rawhide.

However, there's one problem that needs to be addressed. Even the newest ttmkfdir doesn't produce a ksc5601.1992-3 entry for the TTF fonts in the ttfonts-ko package unless I use a very high value (~30000) for the maximum number of missing glyphs. After looking at the ttmkfdir source a little, I found that the problem can be solved by changing the UNDEFINE line in the ksc5601.1992-3.enc file from

    UNDEFINE 0x8431 0xF9FE

to

    UNDEFINE 0 0xF9FE

When I made and submitted the file to XF86, I didn't know how ttmkfdir interprets the UNDEFINE entry. With this change, ttmkfdir generates ksc5601.1992-3 entries as well as ksc5601.1987-0 and iso10646-1 entries for all three fonts in the ttfonts-ko package. Note that there's no need to specify the '-m' option because all three fonts in question have all the necessary glyphs for ksc5601.1992-3 (11,172 Hangul syllables, ~4800 Hanjas and ~950 symbols). I'll submit the patch to XF86.

Closing bug as CURRENTRELEASE