From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031009

Description of problem:
When I open a Japanese UTF-8 page, Mozilla shows unbelievably dirty glyphs. Users will feel ill every time they see UTF-8.

Sample UTF-8 page: http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html

None of the Kanji glyphs are Japanese; they are Chinese. The hiragana/katakana glyphs are also taken from Chinese fonts, and they are very dirty. The same thing happens when running GTK+2 apps with en_US.UTF-8. Using ja_JP.UTF-8 fixes the problem in GTK+2 apps. So I suppose Mozilla is set to en_US.UTF-8 somewhere internally. If so, that is the mistake.

Version-Release number of selected component (if applicable):
mozilla 1.4.1-7

How reproducible:
Always

Steps to Reproduce:
1. Run mozilla in a Japanese environment
2. Open http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html
3. Oh-no

Actual Results:
Japanese glyphs in Chinese fonts

Expected Results:
Japanese glyphs in Japanese fonts

Additional info:
Once before, Russian people complained that Russian glyphs in Japanese fonts are dirty. Font designers design well for their native language, but not for non-native languages. So applications must always use locale-aware native fonts, rather than collecting missing glyphs randomly. Developers are happy with locale independence, but people need locale respect now.
How do other apps look? This might sound silly to ask, but do you have the ttfonts-ja package installed?
Of course. I installed all the rpms, including ttfonts-ja. Japanese text shows up very well in other GTK+2/GNOME2 apps (even in metacity window titles).
Created attachment 95370 LANG=en_US.UTF-8 gedit
Created attachment 95371 LANG=ja_JP.UTF-8 gedit
Created attachment 95372 sample test text in Japanese UTF-8
The two shots above show the difference between normal GTK+2 apps in two different UTF-8 locales.
#3 attachment - LANG=en_US.UTF-8
#4 attachment - LANG=ja_JP.UTF-8
In #4 (ja_JP.UTF-8), gedit works well with the Japanese ttfonts-ja fonts. Now look at #3, the evil en_US.UTF-8 shot.
On the first line, only the last character is a correct Japanese glyph.
The second line is a Japanese Hiragana string. Unbelievably dirty.
The third line is a Japanese Katakana string. Extremely dirty. Even my handwriting is much better.
On the fourth line, the glyph of the first character is the Chinese form, not the correct Japanese one. That means it is broken.
And Mozilla with UTF-8 looks like the #3 shot, the mad one.
Over to fontconfig
If Japanese text doesn't have a Japanese language tag on it, then you simply can't expect GTK+ / fontconfig to know that the text is in Japanese. (Pango-1.4 might do a bit better if the text includes Hiragana/Katakana, but plenty of Japanese text strings have no Hiragana/Katakana.) If Mozilla is rendering wrong, it is likely not putting the right language tag on the page. (I think "ugly" would be a more appropriate word here than "dirty")
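To illustrate why the language tag matters, here is a minimal Python sketch of script-based guessing. It is not Pango's actual algorithm, and the function names are hypothetical. It relies only on the Unicode block ranges for Hiragana (U+3040 to U+309F) and Katakana (U+30A0 to U+30FF): text containing kana can be guessed as Japanese, but a kanji-only string gives no hint, because those code points are shared with Chinese.

```python
def contains_kana(text):
    """True if any character falls in the Hiragana or Katakana Unicode blocks."""
    return any(0x3040 <= ord(ch) <= 0x30FF for ch in text)


def guess_lang_from_script(text):
    """Hypothetical helper: return "ja" when kana is present, else None (undecidable)."""
    return "ja" if contains_kana(text) else None


print(guess_lang_from_script("日本語のテキスト"))  # contains kana → ja
print(guess_lang_from_script("日本語"))            # kanji only → None
```

The second call is the case described above: without an explicit language tag, nothing in the string itself says whether Japanese or Chinese glyph forms are wanted.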
When I install Epiphany 1.0.3, it works well. So I suppose the embedded Mozilla component is able to show Japanese UTF-8 correctly, and the problem is in the upper layers of Mozilla.
This problem seems to be fixed in FC3 when Pango is turned on with MOZ_ENABLE_PANGO=1 for mozilla, firefox and thunderbird.
Nakai, the sample URL you gave is not a good example. Did you check the source of http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html ?

In the header, the following attribute is set even though the page is written in UTF-8:
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
And in the body:
    <div class="book" lang="en">

In this case, Mozilla recognizes the document as written in ISO-8859-1 with English as its language. Mozilla shows the page exactly according to the information in the HTML source.

I don't know how Mozilla decides which fonts to use for display, but I guess it works as follows. Note that this is only my impression; I have never checked the Mozilla source.
1. Mozilla gets the charset and language information from the HTML source.
2. Mozilla tries to show the page using the font settings for the language given in the HTML source.
3. If some glyphs are not included in the fonts selected in step 2, Mozilla searches for fonts that have those glyphs, going through the charsets in alphabetical order.

http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=2231 (Japanese)
That page discusses Windows, but the issue is similar to this bug.

The gedit problem in this bug is a separate problem.
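The check described above can be sketched in Python with the standard library's html.parser. This is only an illustration of reading the declared charset and language from a page; the class and function names are hypothetical, and it says nothing about how Mozilla actually parses these values.

```python
from html.parser import HTMLParser


class DeclaredInfo(HTMLParser):
    """Collects the charset from <meta http-equiv="Content-Type"> and the first lang attribute."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.lang = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-type":
            # content looks like "text/html; charset=ISO-8859-1"
            for part in (attrs.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset":
                    self.charset = value.strip()
        if self.lang is None and attrs.get("lang"):
            self.lang = attrs["lang"]


def declared_info(html_text):
    parser = DeclaredInfo()
    parser.feed(html_text)
    return parser.charset, parser.lang


# A reduced stand-in for the sample page, using the two tags quoted above.
sample = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
    '</head><body><div class="book" lang="en"></div></body></html>'
)
print(declared_info(sample))  # → ('ISO-8859-1', 'en')
```

A page that declares ISO-8859-1 and lang="en" while actually containing Japanese UTF-8 text gives the browser no reason to prefer Japanese fonts.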
This is the same problem as in #107952. The problem is that the Kochi Gothic and Kochi Mincho fonts don't support English ("en") according to fontconfig, since they don't contain accented characters sometimes used in English to write loanwords. When in an English Unicode locale, Mozilla assumes that the document is English, and prefers fonts which support English (according to fontconfig) to all those which don't. Hence, it picks MiscFixed (which supports lots of languages, including both ja and en) over the Kochi fonts for rendering Japanese glyphs in a webpage that it assumes is English.
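The selection behaviour described in this comment can be modelled with a toy Python sketch. It is not fontconfig or Mozilla code: the coverage table and the `pick_font` function are assumptions made up for illustration, based only on the coverage claims in the comment.

```python
# Hypothetical coverage table: font name -> languages it is reported to support.
FONTS = [
    ("Kochi Gothic", {"ja"}),       # no "en" because accented letters are missing
    ("MiscFixed", {"en", "ja"}),
]


def pick_font(doc_lang, glyph_lang, fonts=FONTS):
    """Prefer fonts that cover doc_lang, then take the first one able to draw glyph_lang."""
    preferred = [f for f in fonts if doc_lang in f[1]]
    others = [f for f in fonts if doc_lang not in f[1]]
    for name, langs in preferred + others:
        if glyph_lang in langs:
            return name
    return None


print(pick_font("en", "ja"))  # document assumed English → MiscFixed
print(pick_font("ja", "ja"))  # document tagged Japanese → Kochi Gothic
```

In this model the Japanese font loses only because the document language is assumed to be English, which matches the difference seen between the en_US.UTF-8 and ja_JP.UTF-8 locales.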
I'm just going to mark this as a duplicate of 107952. This was basically that fontconfig problem, and I can't replicate it now. *** This bug has been marked as a duplicate of 107952 ***