Bug 107617

Summary: Mozilla makes Japanese UTF-8 pages look really bad
Product: [Fedora] Fedora
Reporter: Nakai <ynakai>
Component: mozilla
Assignee: Christopher Aillon <caillon>
Status: CLOSED DUPLICATE
QA Contact: Ben Levenson <benl>
Severity: high
Priority: medium
Version: rawhide
CC: fedora-ja-list, johnthacker, otaylor
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2006-10-28 21:00:15 UTC Type: ---
Attachments:
Description                          Flags
LANG=en_US.UTF-8 gedit               none
LANG=ja_JP.UTF-8 gedit               none
sample test text in Japanese UTF-8   none

Description Nakai 2003-10-21 10:49:23 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031009

Description of problem:
When I open a Japanese UTF-8 page, Mozilla displays
unbelievably dirty glyphs.
Users will feel ill every time they see a UTF-8 page.

Sample UTF-8 page:
http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html

None of the kanji glyphs are Japanese; they are all Chinese.
The hiragana/katakana glyphs also come from Chinese fonts,
and they are very dirty.

The same thing happens when GTK+2 apps are run with
en_US.UTF-8; running them with ja_JP.UTF-8 fixes the problem.
So I suppose Mozilla is set to en_US.UTF-8 somewhere
internally. If so, that is the mistake.

Version-Release number of selected component (if applicable):
mozilla 1.4.1-7

How reproducible:
Always

Steps to Reproduce:
1. Run Mozilla in a Japanese environment.
2. Open http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html
3. Oh no: the glyphs are wrong.
    

Actual Results:  Japanese glyphs in Chinese fonts

Expected Results:  Japanese glyphs in Japanese fonts

Additional info:

Once before,
Russian users complained that the Russian glyphs in Japanese fonts are dirty.
Font designers design well for their native language, but not
for other languages. So applications must always use
locale-aware native fonts, rather than collect missing glyphs
at random.

Developers are happy with locale independence, but
users need locale awareness now.

Comment 1 Christopher Blizzard 2003-10-21 13:44:45 UTC
How do other apps look?

This might sound silly to ask, but do you have the ttfonts-ja package installed?

Comment 2 Nakai 2003-10-22 02:40:12 UTC
Of course. I installed all the RPMs, including ttfonts-ja.
Japanese text shows up very well in other GTK+2/GNOME2 apps
(even in Metacity window titles).

Comment 3 Nakai 2003-10-22 02:41:27 UTC
Created attachment 95370 [details]
LANG=en_US.UTF-8 gedit

Comment 4 Nakai 2003-10-22 02:42:07 UTC
Created attachment 95371 [details]
LANG=ja_JP.UTF-8 gedit

Comment 5 Nakai 2003-10-22 02:43:02 UTC
Created attachment 95372 [details]
sample test text in Japanese UTF-8

Comment 6 Nakai 2003-10-22 02:52:48 UTC
The two shots above show the difference in a normal GTK+2 app between two
different UTF-8 locales.

#3 attachment - LANG=en_US.UTF-8
#4 attachment - LANG=ja_JP.UTF-8

In #4 (ja_JP.UTF-8), gedit works well with the Japanese ttfonts-ja fonts.
Now see #3, the evil en_US.UTF-8 shot.
On the first line, only the last character is a correct Japanese glyph.
The second line is a Japanese hiragana string. Unbelievably dirty.
The third line is a Japanese katakana string. Extremely dirty.
  Even my handwriting is much better.
On the fourth line, the glyph of the first character is the Chinese form, not
 the correct Japanese one. That means it is broken.

And Mozilla's UTF-8 rendering looks like the #3 shot, the mad one.

Comment 7 Christopher Blizzard 2003-10-22 18:29:22 UTC
Over to fontconfig

Comment 8 Owen Taylor 2003-10-22 19:04:19 UTC
If Japanese text doesn't have a Japanese language tag on it, then
you simply can't expect GTK+ / fontconfig to know that the text
is in Japanese. (Pango-1.4 might do a bit better if the text
includes Hiragana/Katakana, but plenty of Japanese text strings have
no Hiragana/Katakana.)
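
As a rough illustration of the point about kana, here is a minimal Python sketch of a script-based language guess. It is not how Pango or fontconfig works; the function names (has_kana, has_han, guess_language) are hypothetical, and the ranges are simply the Unicode Hiragana, Katakana and basic CJK Unified Ideographs blocks.

```python
def has_kana(text):
    """True if text contains a character from the Hiragana or Katakana block."""
    # Hiragana is U+3040..U+309F, Katakana is U+30A0..U+30FF (adjacent blocks).
    return any(0x3040 <= ord(ch) <= 0x30FF for ch in text)


def has_han(text):
    """True if text contains a character from the basic CJK Unified Ideographs block."""
    return any(0x4E00 <= ord(ch) <= 0x9FFF for ch in text)


def guess_language(text, declared_lang=None):
    """Return a language tag for text, or None when it cannot be decided."""
    if declared_lang:
        # An explicit language tag always wins over guessing.
        return declared_lang
    if has_kana(text):
        # Kana is only used for Japanese, so this guess is fairly safe.
        return "ja"
    # Han-only text is shared between Japanese and Chinese (among others),
    # so without a tag there is nothing to decide on.
    return None


print(guess_language("ひらがな"))       # kana present
print(guess_language("日本語"))         # kanji only, no tag
print(guess_language("日本語", "ja"))   # kanji only, tagged
```

A kanji-only string gives no result unless a tag is supplied, which is the case the comment above describes.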

If Mozilla is rendering wrong, it is likely not putting the right language
tag on the page.

(I think "ugly" would be a more appropriate word here than "dirty")

Comment 9 Nakai 2003-10-23 12:12:21 UTC
When I installed Epiphany 1.0.3, it worked well.

So I suppose the embedded Mozilla component is able to show
Japanese UTF-8 correctly, and the problem is in the upper layers of Mozilla.

Comment 10 Jens Petersen 2004-11-11 06:16:42 UTC
This problem seems to be fixed when Pango is turned on in FC3
with MOZ_ENABLE_PANGO=1 for mozilla, firefox and thunderbird.

Comment 11 Takanori MATSUURA 2004-11-11 07:06:08 UTC
Nakai,

The sample URL you gave is not a good one.
Did you check the source of
http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html ?

In the header, the following is set even though the page
is written in UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

And in the body:
<div class="book" lang="en">

In this case, Mozilla takes the document to be written in
ISO-8859-1 and its language to be English.

Mozilla displays the page exactly according to the information in the HTML source.
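
To check what a page declares about itself, something like the following Python sketch could be used. The PAGE string is a hypothetical minimal page that contains only the two declarations quoted above, not the real page source, and DeclarationScanner is an illustrative name. The last lines show what UTF-8 bytes turn into if they really are decoded as ISO-8859-1.

```python
from html.parser import HTMLParser


class DeclarationScanner(HTMLParser):
    """Collect the declared charset and every lang attribute in a page."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.langs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        http_equiv = (attrs.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-type":
            # content looks like "text/html; charset=ISO-8859-1"
            for part in (attrs.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset" and value:
                    self.charset = value.strip()
        if attrs.get("lang"):
            self.langs.append((tag, attrs["lang"]))


# Hypothetical stand-in for the page, built from the two quoted lines.
PAGE = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
    "</head><body>"
    '<div class="book" lang="en">日本語</div>'
    "</body></html>"
)

scanner = DeclarationScanner()
scanner.feed(PAGE)
scanner.close()
print(scanner.charset)
print(scanner.langs)

# UTF-8 bytes read as ISO-8859-1: one Latin-1 character per byte.
raw = "日本語".encode("utf-8")
misread = raw.decode("iso-8859-1")
print(len(raw), len(misread))
```

Both declarations disagree with the actual content of the page, so a browser that trusts them has no reason to pick Japanese fonts.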

I don't know how Mozilla decides which fonts to use, but I guess it
works as follows. Note that this is only my impression; I have never
checked the Mozilla source.

1. Mozilla gets the information from the HTML source.
2. Mozilla tries to show the page using the font settings for the
   language given in the HTML source.
3. If some glyphs are missing from the fonts selected in step 2,
   Mozilla searches for fonts that have the glyphs, going through the
   charsets in alphabetical order.

http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=2231 (Japanese)
That page discusses Windows, but the issue is similar to this bug.

The gedit problem in this bug is a separate problem.

Comment 12 John Thacker 2004-12-09 20:47:28 UTC
This is the same problem as in #107952.
The problem is that the Kochi Gothic and Kochi Mincho fonts don't support
English ("en") according to fontconfig, since they don't contain accented
characters sometimes used in English to write loanwords.  When in an English
Unicode locale, Mozilla assumes that the document is English, and prefers fonts
which support English (according to fontconfig) to all those which don't. 
Hence, it picks MiscFixed (which supports lots of languages, including both ja
and en) over the Kochi fonts for rendering Japanese glyphs in a webpage that it
assumes is English.
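
The preference described above can be pictured with a toy model in Python. This is not fontconfig's actual matching algorithm, only a sketch of "fonts that support the assumed language sort first". The Font class, the pick_font function and the language sets are assumptions made up for illustration, with the language sets taken from the description in this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    name: str
    langs: frozenset  # languages the font is considered to support


# Illustrative data only, following the description in the comment above.
FONTS = [
    Font("Kochi Gothic", frozenset({"ja"})),
    Font("Kochi Mincho", frozenset({"ja"})),
    Font("MiscFixed", frozenset({"en", "ja"})),
]


def pick_font(fonts, assumed_lang):
    """Pick the first font after sorting language-supporting fonts to the front."""
    # The key is False for fonts that support assumed_lang, so they come first.
    # sorted() is stable, so the original order breaks ties.
    return sorted(fonts, key=lambda f: assumed_lang not in f.langs)[0]


print(pick_font(FONTS, "en").name)  # document assumed to be English
print(pick_font(FONTS, "ja").name)  # document assumed to be Japanese
```

With the language assumed to be "en" the model picks MiscFixed, and with "ja" it picks a Kochi font, which matches the difference between the two locales reported earlier in this bug.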

Comment 13 John Thacker 2006-10-28 21:00:15 UTC
I'm just going to mark this as a duplicate of 107952.  This was basically that
fontconfig problem, and I can't replicate it now.

*** This bug has been marked as a duplicate of 107952 ***