Bug 107617 - Mozilla makes Japanese UTF-8 page really sucks
Status: CLOSED DUPLICATE of bug 107952
Product: Fedora
Classification: Fedora
Component: mozilla
Version: rawhide
Hardware: All Linux
Priority: medium, Severity: high
Assigned To: Christopher Aillon
QA Contact: Ben Levenson
Reported: 2003-10-21 06:49 EDT by Nakai
Modified: 2007-11-30 17:10 EST

Doc Type: Bug Fix
Last Closed: 2006-10-28 17:00:15 EDT


Attachments
LANG=en_US.UTF-8 gedit (27.63 KB, image/png)
2003-10-21 22:41 EDT, Nakai
LANG=ja_JP.UTF-8 gedit (44.75 KB, image/png)
2003-10-21 22:42 EDT, Nakai
sample test text in Japanese UTF-8 (142 bytes, application/octet-stream)
2003-10-21 22:43 EDT, Nakai

Description Nakai 2003-10-21 06:49:23 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1) Gecko/20031009

Description of problem:
When I open a Japanese UTF-8 page, Mozilla shows
unbelievably dirty glyphs.
Users will feel ill every time they see UTF-8.

Sample UTF-8 page:
http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html

None of the kanji glyphs are Japanese; they are Chinese.
The hiragana/katakana glyphs also come from Chinese fonts,
and those are very dirty.

The same thing happens when running GTK+2 apps under
en_US.UTF-8; switching to ja_JP.UTF-8 fixes it for GTK+2 apps.
So I suppose Mozilla is set to en_US.UTF-8 somewhere
internally. If so, that is the mistake.
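
One way to check this locale dependence outside Mozilla is to ask fontconfig which font it would pick for each language. The Python sketch below builds an `fc-match` query with a `lang` element in the pattern. The `lang_from_locale` helper is a hypothetical simplification (fontconfig's own locale handling is more involved), and the `sans` family is only an example.

```python
import shutil
import subprocess

def lang_from_locale(locale_name):
    """Hypothetical helper: reduce 'ja_JP.UTF-8' to the language code 'ja'."""
    return locale_name.split(".")[0].split("_")[0].lower()

def fc_match_command(locale_name, family="sans"):
    """Build an fc-match query whose pattern carries a lang element."""
    return ["fc-match", f"{family}:lang={lang_from_locale(locale_name)}"]

def best_font(locale_name):
    """Run fc-match if it is installed; return its output line, else None."""
    if shutil.which("fc-match") is None:
        return None
    result = subprocess.run(fc_match_command(locale_name),
                            capture_output=True, text=True, check=False)
    return result.stdout.strip()
```

Calling `best_font("en_US.UTF-8")` and `best_font("ja_JP.UTF-8")` on the affected machine and comparing the two answers would show whether the language alone changes the selected font.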

Version-Release number of selected component (if applicable):
mozilla 1.4.1-7

How reproducible:
Always

Steps to Reproduce:
1. Run mozilla in a Japanese environment
2. Open http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html
3. Oh-no
    

Actual Results:  Japanese glyphs in Chinese fonts

Expected Results:  Japanese glyphs in Japanese fonts

Additional info:

Once before,
Russian users complained that the Russian glyphs in Japanese fonts are dirty.
Font designers design well for their native language, but not
for non-native languages. So applications must always use
locale-aware native fonts, rather than collecting missing glyphs
at random.

Developers are happy with locale independence, but
users need locale awareness now.
Comment 1 Christopher Blizzard 2003-10-21 09:44:45 EDT
How do other apps look?

This might sound silly to ask, but do you have the ttfonts-ja package installed?
Comment 2 Nakai 2003-10-21 22:40:12 EDT
Of course. I installed all RPMs, including ttfonts-ja.
Japanese text shows up very well in other GTK+2/GNOME2 apps
(even in metacity window titles).
Comment 3 Nakai 2003-10-21 22:41:27 EDT
Created attachment 95370
LANG=en_US.UTF-8 gedit
Comment 4 Nakai 2003-10-21 22:42:07 EDT
Created attachment 95371
LANG=ja_JP.UTF-8 gedit
Comment 5 Nakai 2003-10-21 22:43:02 EDT
Created attachment 95372
sample test text in Japanese UTF-8
Comment 6 Nakai 2003-10-21 22:52:48 EDT
The two screenshots above show how a normal GTK+2 app differs between two
UTF-8 locales.

#3 attachment - LANG=en_US.UTF-8
#4 attachment - LANG=ja_JP.UTF-8

In #4 (ja_JP.UTF-8), gedit works well with the Japanese ttfonts-ja fonts.
See the evil #3 en_US.UTF-8 shot.
On the first line, only the last character is a correct Japanese glyph.
The second line is a Japanese hiragana string. Unbelievably dirty.
The third line is a Japanese katakana string. Extremely dirty.
  Even my handwriting is much better.
On the fourth line, the glyph of the first character is the Chinese form,
 not the correct Japanese one. That means it is broken.

And Mozilla with UTF-8 looks like the #3 shot, the mad one.
Comment 7 Christopher Blizzard 2003-10-22 14:29:22 EDT
Over to fontconfig
Comment 8 Owen Taylor 2003-10-22 15:04:19 EDT
If Japanese text doesn't have a Japanese language tag on it, then
you simply can't expect GTK+ / fontconfig to know that the text
is in Japanese. (Pango-1.4 might do a bit better if the text
includes Hiragana/Katakana, but plenty of Japanese text strings have
no Hiragana/Katakana.)
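
A rough illustration of that kana hint, as a Python sketch. This is not Pango's actual algorithm, and the function names are made up for the example. It relies only on Unicode character names: kana are specific to Japanese, while kanji are encoded as CJK unified ideographs shared with Chinese, so a kanji-only string carries no language information by itself.

```python
import unicodedata

def has_kana(text):
    """Return True if the text contains any Hiragana or Katakana letter."""
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith(("HIRAGANA LETTER", "KATAKANA LETTER")):
            return True
    return False

def is_unified_ideograph(ch):
    """Return True for characters in the shared CJK unified ideograph blocks."""
    return unicodedata.name(ch, "").startswith("CJK UNIFIED IDEOGRAPH")

def looks_japanese(text):
    """Heuristic only: kana implies Japanese; kanji-only text stays ambiguous."""
    return has_kana(text)
```

With this heuristic a string such as "ひらがな" is flagged as Japanese, while "日本語" is not, which is exactly the case where an explicit language tag is needed.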

If Mozilla is rendering wrong, it is likely not putting the right language
tag on the page.

(I think "ugly" would be a more appropriate word here than "dirty")
Comment 9 Nakai 2003-10-23 08:12:21 EDT
When I installed Epiphany 1.0.3, it worked well.

So presumably the embedded Mozilla component is able to show
Japanese UTF-8 correctly, and the problem is in the upper layers of Mozilla.
Comment 10 Jens Petersen 2004-11-11 01:16:42 EST
This problem seems to be fixed when Pango is turned on in FC3
with MOZ_ENABLE_PANGO=1 for mozilla, firefox and thunderbird.
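
For anyone scripting the comparison, a minimal Python sketch that starts the browser with that variable set. Only `MOZ_ENABLE_PANGO=1` comes from this comment; the helper names and the default program name are assumptions.

```python
import os
import subprocess

def pango_env(base_env=None):
    """Return a copy of the environment with MOZ_ENABLE_PANGO=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_ENABLE_PANGO"] = "1"
    return env

def launch(program="mozilla", url=None):
    """Start the browser with Pango requested; returns the Popen handle."""
    args = [program] + ([url] if url else [])
    return subprocess.Popen(args, env=pango_env())
```

The environment is copied rather than modified in place, so the calling process keeps its original settings and the same script can start one instance with Pango and one without.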
Comment 11 Takanori MATSUURA 2004-11-11 02:06:08 EST
Nakai,

The sample URL you gave is not a good one.
Did you check the source of
http://www.gnome.gr.jp/docs/glib-2.2.x-refs/glib/index.html ?

In the header, the following is set even though the page
is written in UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

And in the body:
<div class="book" lang="en">

In this case, mozilla concludes that the document is encoded in
ISO-8859-1 and that its language is English.

Mozilla renders the page exactly according to the information in the HTML source.
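
To check what a page declares about itself, here is a minimal Python sketch using the standard `html.parser` module. The class and function names are made up for the example, and it says nothing about how Mozilla itself reads these declarations.

```python
from html.parser import HTMLParser

# The two declarations quoted in this comment.
SAMPLE = (
    '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">\n'
    '<div class="book" lang="en">'
)

class DeclaredMetadata(HTMLParser):
    """Collect the charset from a Content-Type meta tag and all lang attributes."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.langs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "content-type":
            # content looks like "text/html; charset=ISO-8859-1"
            for part in (attrs.get("content") or "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.lower() == "charset" and value:
                    self.charset = value.strip()
        if attrs.get("lang"):
            self.langs.append((tag, attrs["lang"]))

def declared_metadata(html_text):
    """Return (charset, [(tag, lang), ...]) as declared in the markup."""
    parser = DeclaredMetadata()
    parser.feed(html_text)
    return parser.charset, parser.langs
```

Running `declared_metadata` on the page source makes it easy to spot a mismatch between the declared charset or language and the actual content.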

I don't know how mozilla decides which fonts to use, but my guess is the
following. Note that this is only my impression; I have never checked
the mozilla source.

1. Mozilla gets the language information from the HTML source.
2. Mozilla tries to show the page using the font settings for the
   language given in the HTML source.
3. If some glyphs are missing from the fonts selected in step 2,
   mozilla searches the other fonts for those glyphs, going through
   the charsets in alphabetical order.
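
The guessed steps above can be sketched in Python as follows. This models only the guess in this comment, not Mozilla's real code, and every font name, charset key and glyph set in the tables is hypothetical.

```python
# Hypothetical tables: charset name -> configured font, font -> glyphs it has.
FONT_FOR_CHARSET = {
    "gb2312": "ExampleChinese",
    "iso-8859-1": "ExampleLatin",
    "shift_jis": "ExampleJapanese",
}
GLYPHS = {
    "ExampleChinese": set("日本語"),
    "ExampleLatin": set("ABCabc"),
    "ExampleJapanese": set("日本語ひらがな"),
}

def pick_fonts(text, declared_charset,
               font_for_charset=FONT_FOR_CHARSET, glyphs=GLYPHS):
    """Map each character to a font: declared charset first, then the rest A-Z."""
    primary = font_for_charset[declared_charset]
    fallbacks = [font_for_charset[name]
                 for name in sorted(font_for_charset)
                 if name != declared_charset]
    chosen = {}
    for ch in text:
        # First font in the search order that contains the glyph, else None.
        chosen[ch] = next(
            (font for font in [primary] + fallbacks if ch in glyphs[font]), None)
    return chosen
```

Under these made-up tables, a page declared as ISO-8859-1 gets its kanji from the Chinese font simply because "gb2312" sorts before "shift_jis", while kana fall through to the Japanese font.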

http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=2231 (Japanese)
That page discusses Windows, but the issue is similar to this bug.

The gedit problem in this bug is a separate problem.
Comment 12 John Thacker 2004-12-09 15:47:28 EST
This is the same problem as in #107952.
The problem is that the Kochi Gothic and Kochi Mincho fonts don't support
English ("en") according to fontconfig, since they don't contain accented
characters sometimes used in English to write loanwords.  When in an English
Unicode locale, Mozilla assumes that the document is English, and prefers fonts
which support English (according to fontconfig) to all those which don't. 
Hence, it picks MiscFixed (which supports lots of languages, including both ja
and en) over the Kochi fonts for rendering Japanese glyphs in a webpage that it
assumes is English.
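
The preference described here can be modelled with a short Python sketch. The language coverage table simply restates this comment and was not measured from fontconfig; the function name and the preferred order are assumptions.

```python
# Language coverage as described in this comment ("ja" is the ISO 639 code
# for Japanese).
FONT_LANGS = {
    "Kochi Gothic": {"ja"},
    "Kochi Mincho": {"ja"},
    "MiscFixed": {"en", "ja"},
}

# Hypothetical order in which the fonts would otherwise be tried.
PREFERRED_ORDER = ["Kochi Gothic", "Kochi Mincho", "MiscFixed"]

def rank_fonts(assumed_lang, font_langs=FONT_LANGS, preferred=PREFERRED_ORDER):
    """Fonts covering the assumed language come first; ties keep their order."""
    # sorted() is stable, and False sorts before True.
    return sorted(preferred, key=lambda name: assumed_lang not in font_langs[name])
```

With the document assumed to be English, `rank_fonts("en")` moves MiscFixed ahead of both Kochi fonts, whereas `rank_fonts("ja")` leaves the Kochi fonts first.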
Comment 13 John Thacker 2006-10-28 17:00:15 EDT
I'm just going to mark this as a duplicate of 107952.  This was basically that
fontconfig problem, and I can't replicate it now.

*** This bug has been marked as a duplicate of 107952 ***
