Bug 529594

Summary: [CJK] pango uses different fonts for LATIN and COMMON glyphs
Product: [Fedora] Fedora
Reporter: Peng Huang <phuang>
Component: pango
Assignee: Behdad Esfahbod <behdad>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: behdad, dchen, fonts-bugs, jni, K9, masao-takahashi, petersen, tagoh, tfujiwar
Keywords: Reopened, Triaged
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-03-03 09:14:36 UTC
Attachments:
Description                                    Flags
gedit screenshot                               none
screenshot of gnome-terminal                   none
clearer gnome terminal screenshot              none
Always use Latin fonts for ascii characters    none
Patch for pango-script.c                       none
Patch for fontconfig comcharset                none
Patch for pango compcharset                    none
Patch for pango compcharset                    none

Description Peng Huang 2009-10-19 03:57:03 UTC
Description of problem:
Fontconfig cannot pick the correct font file for 'Monospace'.

Version-Release number of selected component (if applicable):
fontconfig-2.7.3-1.fc12.x86_64


Comment 1 Peng Huang 2009-10-19 03:59:42 UTC
Created attachment 365193 [details]
gedit screenshot

The @ characters come from different fonts in the zh_CN.UTF-8 locale.

Comment 2 Peng Huang 2009-10-19 04:04:23 UTC
Created attachment 365194 [details]
screenshot of gnome-terminal

Numbers and punctuation come from bitmap fonts. I think they should use the same font as the letters.

Comment 3 Jens Petersen 2009-10-19 04:34:03 UTC
(In reply to comment #1)
> Created an attachment (id=365193) [details]
> gedit screenshot
> The @ characters come from different fonts in the zh_CN.UTF-8 locale

I think this is a duplicate of pango bug 129541 (I am having trouble locating
the equivalent Fedora bug; there should be one upstream, but it has been
labelled wontfix, for better or worse).  One possible fix would be to default
to the native font for ASCII when available; perhaps we can explore that under
fontconfig?

Comment 4 Jens Petersen 2009-10-19 04:35:30 UTC
Created attachment 365196 [details]
clearer gnome terminal screenshot

Here is a clearer screenshot at a larger size.

Comment 5 Jens Petersen 2009-10-19 04:39:53 UTC
(A workaround of course is to set a Chinese font in the g-t profile.)

Comment 6 Peng Huang 2009-10-19 05:12:36 UTC
BTW, I added a new conf file (99-zh_CN.conf) to make fontconfig pick glyphs from "DejaVu Sans Mono" for Monospace, but it doesn't work.

=============== 99-zh_CN.conf ==============
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
	<match target="pattern">
		<test qual="any" name="family">
			<string>serif</string>
		</test>
		<edit name="family" mode="prepend" binding="strong">
			<string>Bitstream Vera Serif</string>
			<string>DejaVu Serif</string>
			<string>AR PL ShanHeiSun Uni</string>
			<string>WenQuanYi Bitmap Song</string>
			<string>AR PL UMing CN</string>
			<string>AR PL UKai CN</string>
			<string>AR PL ZenKai Uni</string>
		</edit>
	</match> 
	<match target="pattern">
		<test qual="any" name="family">
			<string>sans-serif</string>
		</test>
		<edit name="family" mode="prepend" binding="strong">
			<string>Bitstream Vera Sans</string>
			<string>DejaVu Sans</string>
			<string>WenQuanYi Zen Hei</string>
			<string>AR PL UMing CN</string>
			<string>AR PL ShanHeiSun Uni</string>
			<string>WenQuanYi Bitmap Song</string>
			<string>AR PL UKai CN</string>
			<string>AR PL ZenKai Uni</string>
		</edit>
	</match> 
	<match target="pattern">
		<test qual="any" name="family">
			<string>monospace</string>
		</test>
		<edit name="family" mode="prepend" binding="strong">
			<string>Bitstream Vera Sans Mono</string>
			<string>DejaVu Sans Mono</string>
			<string>WenQuanYi Zen Hei</string>
			<string>AR PL UMing CN</string>
			<string>AR PL ShanHeiSun Uni</string>
			<string>WenQuanYi Bitmap Song</string>
			<string>AR PL UKai CN</string>
			<string>AR PL ZenKai Uni</string>
		</edit>
	</match> 
</fontconfig>

Comment 7 Jens Petersen 2009-10-19 06:56:31 UTC
A few testcases:


LANG=zh_CN.UTF-8 pango-view --font="Monospace" <(echo -e "my@ip-1-2-3 ~$\n  @@ 1 2 3")

LANG=ja_JP.UTF-8 pango-view --font="Monospace" <(echo -e "my@ip-1-2-3 ~$\n  @@ 1 2 3")

LANG=ko_KR.UTF-8 pango-view --font="Monospace" <(echo -e "my@ip-1-2-3 ~$\n  @@ 1 2 3")
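
The three commands above render the same ASCII-only string. As a rough illustration of why it makes a useful test case, the Python sketch below splits that string into runs by Unicode Script property. It hard-codes the ASCII part of that property (letters are Latin, every other ASCII code point is Common) and does not model how Pango itself resolves Common runs; the function names are made up for this example.

```python
from itertools import groupby


def ascii_script(ch: str) -> str:
    """Return the Unicode Script property for an ASCII character.

    In the ASCII range only A-Z and a-z are Latin; digits, spaces and
    punctuation such as '@' are Common.
    """
    if ord(ch) > 0x7F:
        raise ValueError("this sketch only covers ASCII")
    return "Latin" if ch.isalpha() else "Common"


def script_runs(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that share the same script."""
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=ascii_script)]


if __name__ == "__main__":
    # First line of the test string used in the commands above.
    for script, run in script_runs("my@ip-1-2-3 ~$"):
        print(f"{script:6} {run!r}")
```

The second line of the test string ("  @@ 1 2 3") contains no letters at all, so it forms a single Common run.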

Comment 8 fujiwara 2009-10-19 07:12:07 UTC
I think it's better to always use CJK fonts for the CJK locales.
The following is my previous patch.
http://src.opensolaris.org/source/xref/jds/spec-files/trunk/patches/pango-03-solaris-cjk-font-table.diff

I'll try to upstream the patch later.

Comment 9 Ding-Yi Chen 2009-10-19 08:06:10 UTC
Temporary measure:

Use another monospace font, such as "DejaVu Sans Mono".

Comment 10 Jens Petersen 2009-10-19 08:09:59 UTC
I think this is a duplicate of bug 485566.

*** This bug has been marked as a duplicate of bug 485566 ***

Comment 11 James Ni 2009-10-19 08:17:04 UTC
I tried adding the clauses below to my wqy-zenhei.conf under conf.d; WenQuanYi Zen Hei is then used for monospace.

<match>
        <test name="lang" compare="contains">
                <string>zh</string>
        </test>
        <test name="family">
                <string>monospace</string>
        </test>
        <edit name="family" mode="prepend" binding="same">
                <string>WenQuanYi Zen Hei</string>
        </edit>
</match>

Comment 12 Nicolas Mailhot 2009-10-19 21:41:17 UTC
Take a look at the l10n-font-template.conf in fontpackages-devel.

That's the best we have right now, and I wouldn't advise inventing anything else without a blessing from Behdad.

Comment 13 fujiwara 2009-10-20 00:38:13 UTC
If the lang tag works, this is not a duplicate of bug 485566, and fontconfig should be modified.
I am not talking about the fontconfig bug.

Comment 14 Peng Huang 2009-10-20 08:03:25 UTC
Created attachment 365314 [details]
Always use Latin fonts for ascii characters

Comment 15 Nicolas Mailhot 2009-10-20 08:30:36 UTC
(In reply to comment #14)
> Created an attachment (id=365314) [details]
> Always use Latin fonts for ascii characters

Please do not propose fontconfig rules that make assumptions about the system default latin fonts. A CJK font can bump its priority on certain locales but never assume it knows the non-CJK fonts to use.

If you want to change fontconfig rules, start from our default fontconfig templates, and if they are not good enough for you, discuss changes on the fonts list and with the fontconfig maintainer.

Comment 16 Akira TAGOH 2009-10-20 09:03:31 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > Created an attachment (id=365314) [details]
> > Always use Latin fonts for ascii characters
> 
> Please do not propose fontconfig rules that make assumptions about the system
> default latin fonts. A CJK font can bump its priority on certain locales but
> never assume it knows the non-CJK fonts to use.
> 
> If you want to change fontconfig rules, start from our default fontconfig
> templates, and if they are not good enough for you, discuss changes on the
> fonts list and with the fontconfig maintainer.  

FWIW, for Fedora 13 it would be better to review all of the fontconfig files currently available in Fedora and correct them as needed so that they are sane.  cjkuni-*-fonts provides a lot of fontconfig files, and I wonder whether all of them are really accepted; who knows? It may be a good idea to keep an exception list of the fontconfig files for the packages. Having some figures showing how the files relate to each other would be good too: that would make it easier to understand what effect a fontconfig file has when a new one is added.

Comment 17 Peng Huang 2009-10-20 10:42:39 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > Created an attachment (id=365314) [details]
> > Always use Latin fonts for ascii characters
> 
> Please do not propose fontconfig rules that make assumptions about the system
> default latin fonts. A CJK font can bump its priority on certain locales but
> never assume it knows the non-CJK fonts to use.

How do you suggest I modify the patch?

Comment 18 Nicolas Mailhot 2009-10-20 11:40:41 UTC
Try to make the tests for the prepend rule more fine-grained.

Alternatively, if this is a behaviour needed for all CJK fonts, modify fontconfig and/or pango code to make this decision by default for all cjk fonts.

But fontconfig kludges that assume CJK packagers know the right ordering for latin fonts are plain wrong. CJK people have been injecting variants of them into the system for years; they always caused no end of unintended side effects and never actually fixed CJK problems.

As a case in point your patch does not even respect the current Fedora latin font priority order, and even if it did no one will notify you when it will change (as has happened several times in the past), so even if this patch was fixed now it will cause problems one day or another.

And latin is the easy part, you also have cyrillic, greek, and all the other stuff needlessly duplicated in CJK fonts to take into account.

A basic fontconfig sanity rule is to never try to change the ordering of fonts external to the package, because they are the responsibility of other maintainers, not yours.
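
As a rough sketch of how that rule could be checked mechanically, the Python snippet below lists the families a configuration file names in `<edit name="family">` elements that do not belong to the package shipping the file. The function name, the `owned` set and the sample configuration are assumptions made for illustration; this is not an existing Fedora tool.

```python
import xml.etree.ElementTree as ET

# Sample rule in the style of the configurations posted in this bug.
SAMPLE_CONF = """\
<fontconfig>
  <match target="pattern">
    <test qual="any" name="family"><string>monospace</string></test>
    <edit name="family" mode="prepend" binding="strong">
      <string>DejaVu Sans Mono</string>
      <string>WenQuanYi Zen Hei</string>
    </edit>
  </match>
</fontconfig>
"""


def foreign_families(conf_xml: str, owned: set[str]) -> list[str]:
    """List families named in <edit name="family"> that the package does not own."""
    root = ET.fromstring(conf_xml)
    found: list[str] = []
    for edit in root.iter("edit"):
        if edit.get("name") != "family":
            continue
        for string in edit.findall("string"):
            if string.text not in owned and string.text not in found:
                found.append(string.text)
    return found


if __name__ == "__main__":
    # A package that only ships WenQuanYi Zen Hei should not reorder DejaVu.
    print(foreign_families(SAMPLE_CONF, owned={"WenQuanYi Zen Hei"}))
```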

Comment 19 Nicolas Mailhot 2009-10-20 11:54:54 UTC
(In reply to comment #16)
 
> FWIW it would be better looking back all of the fontconfig file available in
> Fedora now for Fedora 13 and correct as needed to be sane.

Sure it would be better; but I don't have the cycles myself. All I can do right now is to try to make sure no new dubious files make it in the distro.

The basic rule is: if a fontconfig pattern is not documented in fontpackages-devel it is wrong for Fedora, and no pattern gets in fontpackages-devel without approval from Behdad and eventually discussion on the fonts list. That's the only way to make everyone converge and avoid different packagers making conflicting decisions.

In an ideal world someone would write the xslt rules to check that existing files conform to documented patterns, and integrate them in repo-font-audit (and packagers would actually act on repo-font-audit reports).

Also, it would be nice if someone wrote an xslt script or another command that actually displayed in human-readable form the priority stack configured on a system (fc-list output unfortunately depends on the fonts installed on-system, so depending on local fonts the same conf will yield different results, which is not a way to do reliable QA).
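
A rough sketch of such a command, in Python rather than XSLT: it reads a fontconfig file and prints, for each tested family, the families that `prepend` edits put in front of it, without consulting the installed fonts. It only understands the simple `<match>` / `<test name="family">` / `<edit name="family" mode="prepend">` shape, and it simply concatenates rules in file order rather than modeling how fontconfig combines several rules. The function name and the sample configuration are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE_CONF = """\
<fontconfig>
  <match target="pattern">
    <test qual="any" name="family"><string>monospace</string></test>
    <edit name="family" mode="prepend" binding="strong">
      <string>DejaVu Sans Mono</string>
      <string>WenQuanYi Zen Hei</string>
    </edit>
  </match>
</fontconfig>
"""


def prepend_stack(conf_xml: str) -> dict[str, list[str]]:
    """Map each tested family to the families prepended for it, in file order."""
    stack: dict[str, list[str]] = {}
    root = ET.fromstring(conf_xml)
    for match in root.iter("match"):
        tested = [s.text for t in match.findall("test")
                  if t.get("name") == "family"
                  for s in t.findall("string")]
        prepended = [s.text for e in match.findall("edit")
                     if e.get("name") == "family" and e.get("mode") == "prepend"
                     for s in e.findall("string")]
        for family in tested:
            stack.setdefault(family, []).extend(prepended)
    return stack


if __name__ == "__main__":
    for family, fonts in prepend_stack(SAMPLE_CONF).items():
        print(family)
        for rank, font in enumerate(fonts, start=1):
            print(f"  {rank}. {font}")
```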

Interested people can send me fontpackages patches or ask for fontpackages git access.

http://fedoraproject.org/wiki/Fedora_fonts_policy_package

Comment 20 Behdad Esfahbod 2009-10-22 19:04:59 UTC
Maybe someone can first describe what the problem is, without jumping to patching...   My best guess so far is:

"In CJK locales, the Latin glyphs should be selected from the same font as the CJK glyphs, if the font supports that."

If that's the problem, it's an old one, and one that we don't have a solution for just yet.  Requires this before it can be fixed:
https://bugs.freedesktop.org/show_bug.cgi?id=17311

Comment 21 Peng Huang 2009-10-23 02:16:22 UTC
Hi Behdad,

Originally, I reported this bug because pango renders the common script with different fonts within one document. In particular, on the Chinese Fedora desktop, pango mixes TrueType and bitmap common-script characters (please see the attachment in comment #1). It looks very odd.

I created a fontconfig file, http://phuang.fedorapeople.org/63-zh_CN.conf. It forces pango to always use Latin (DejaVu) fonts to render the common script on the Chinese desktop. It works very well and does not affect other languages. However, it overrides the system's Latin font priority for Chinese, so the conf seems to break the Fedora font config rules.

I also tried another way to fix this problem, in pango itself. The modified pango is at http://github.com/phuang/pango/commits/master. It treats PANGO_SCRIPT_COMMON as a real script. We can then modify pango_script_for_lang to decide which languages' fonts can support the common script and which cannot. For zh-*, we could just put PANGO_SCRIPT_HAN in the table, and pango would then use Chinese fonts only to render the HAN script. This works too.
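
The idea in the previous paragraph can be sketched as a toy model in Python. It is not Pango code: the table, the function and the font-group labels are hypothetical, and only the rule itself (a script listed for the language is rendered with that language's fonts, while everything else, including Common, falls through to the default fonts) comes from the description above.

```python
# Hypothetical stand-in for the per-language script table described above.
SCRIPTS_FOR_LANG = {
    "zh-cn": {"Han"},
    "zh-tw": {"Han"},
}


def font_group(lang: str, script: str) -> str:
    """Choose a font group for a run, treating Common as a script of its own."""
    if script in SCRIPTS_FOR_LANG.get(lang.lower(), set()):
        return "language-specific fonts"
    return "default fonts"


if __name__ == "__main__":
    for script in ("Han", "Latin", "Common"):
        print(script, "->", font_group("zh-CN", script))
```

With this rule, Latin and Common runs always land in the same group, which is the property the bug is asking for.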

That is the current situation. Please give us some suggestions. Thanks.

Comment 22 Jens Petersen 2009-10-23 05:57:04 UTC
I think this is a duplicate of bug 228804
(https://bugzilla.gnome.org/show_bug.cgi?id=481210).

Comment 23 Jens Petersen 2009-10-23 06:05:54 UTC
(In reply to comment #20)
> "In CJK locales, the Latin glyphs should be selected from the same font as the
> CJK glyphs, if the font supports that."

This is one possible solution, yes.

Though I think that would make English web pages render in,
say, a Chinese font, which is less attractive.

> If that's the problem, it's an old one, and one that we don't have a solution
> for just yet.  Requires this before it can be fixed:
> https://bugs.freedesktop.org/show_bug.cgi?id=17311  

I hope we can come up with a workaround in the meantime,
say via fontconfig, if necessary.

Comment 24 fujiwara 2009-10-27 02:25:55 UTC
Created attachment 366194 [details]
Patch for pango-script.c

My understanding is that the problem is this: ASCII letters use the latin fonts, but ASCII digits and signs (e.g. '@') use different fonts.

It is probably a good idea to use the same font for ASCII in all languages.
Or would fontconfig be the better place to do this?

Comment 25 Behdad Esfahbod 2009-10-29 00:59:53 UTC
I've said multiple times, that approach is not acceptable.

Comment 26 fujiwara 2009-10-29 01:34:50 UTC
(In reply to comment #25)
> I've said multiple times, that approach is not acceptable.  

I know it's not acceptable, but I was trying to clarify the problem above.
The problem is how we handle the common script.
It seems zh-cn users like the common ASCII script to be rendered with latin fonts in monospace/sans/serif.

I'm currently wondering whether a feature in fontconfig could be an approach.
I noticed that pango_engine_shape_real_covers() checks gunichar coverage.
Would something like <weak range>0x00-0x7f</weak> in fontconfig be an idea?
I'll have a look at fontconfig next week.

On the other hand, in monospace I'd like to use ja fonts for the common script, the latin script and the other scripts they cover.

Comment 27 Bug Zapper 2009-11-16 13:49:20 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 28 fujiwara 2009-12-02 10:43:39 UTC
Created attachment 375392 [details]
Patch for fontconfig comcharset

This is a feature to customize pango common script in fontconfig.

	<match target="font">
		<test name="family" compare="contains">
			<string>AR PL</string>
		</test>
		<edit name="compcharset" mode="assign">
			<compcharset>
				<complang>zh</complang>
				<compmode>weak</compmode>
				<string>0x00-0x7F</string>
			</compcharset>
		</edit>
	</match>

If the config file contains these lines, fc-match shows the composite char flags.
The idea is that when compmode is weak, the listed code points are weak in the specified font, and those characters are taken from other fonts when "monospace", "sans" or "sans-serif" is requested.
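
A toy model of the weak mode, written in Python, may make the intent clearer. It is not the patch itself: the data structures, the coverage sets and the function names are invented for illustration, and only the rule (a code point marked weak in a font is taken from another font that covers it, if there is one) follows the description above.

```python
def parse_range(spec: str) -> range:
    """Parse a range such as '0x00-0x7F' into an inclusive range of code points."""
    low, high = (int(part, 16) for part in spec.split("-"))
    return range(low, high + 1)


def pick_font(ch, fonts, weak):
    """Return the first font that covers ch and is not weak for it.

    fonts: ordered list of (name, coverage) pairs, highest priority first.
    weak:  maps a font name to the code points that are weak in that font.
    Falls back to the first covering font when every candidate is weak.
    """
    cp = ord(ch)
    covering = [name for name, coverage in fonts if cp in coverage]
    for name in covering:
        if cp not in weak.get(name, ()):
            return name
    return covering[0] if covering else None


if __name__ == "__main__":
    # Made-up coverage: a CJK font that also has ASCII, then a Latin font.
    ascii_points = set(range(0x00, 0x80))
    fonts = [
        ("AR PL UMing CN", ascii_points | set(range(0x4E00, 0xA000))),
        ("DejaVu Sans Mono", ascii_points),
    ]
    weak = {"AR PL UMing CN": parse_range("0x00-0x7F")}
    for ch in ("a", "@", "\u4e2d"):
        print(repr(ch), "->", pick_font(ch, fonts, weak))
```

In this model the CJK font keeps its ASCII glyphs as a fallback: they are used only when no other font in the list covers the character.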

Comment 29 fujiwara 2009-12-02 10:46:13 UTC
Created attachment 375394 [details]
Patch for pango compcharset

This patch is the pango counterpart to attachment #375392 [details].
If there are no specific comments, I'll file the patches as upstream bugs.

Comment 30 fujiwara 2009-12-09 08:35:28 UTC
Created attachment 377115 [details]
Patch for pango compcharset

Revised the patch.

The pango patch now also supports 'compmode' set to strong.

	<match target="font">
		<test name="family" compare="contains">
			<string>VL Gothic</string>
		</test>
		<edit name="compcharset" mode="assign">
			<compcharset>
				<complang>ja</complang>
				<compmode>strong</compmode>
				<string>all</string>
			</compcharset>
		</edit>
	</match>

Then pango can use ja fonts for latin, common, hiragana/katakana, greek, etc. in the ja locale.
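
The strong mode can be pictured with a similar toy model in Python. Again this is not the patch: the structures, the coverage sets and the function name are hypothetical, and the sketch only handles the "all" case shown above, where a font marked strong for the current language wins for every character it covers, regardless of its position in the priority list.

```python
def pick_font(ch, fonts, strong, lang):
    """Return the font for ch, letting fonts marked strong for lang win.

    fonts:  ordered list of (name, coverage) pairs, highest priority first.
    strong: maps a font name to the language it is strong for.
    """
    cp = ord(ch)
    covering = [name for name, coverage in fonts if cp in coverage]
    for name in covering:
        if strong.get(name) == lang:
            return name
    return covering[0] if covering else None


if __name__ == "__main__":
    # Made-up coverage: a Latin font first, then a Japanese font with ASCII.
    ascii_points = set(range(0x00, 0x80))
    fonts = [
        ("DejaVu Sans Mono", ascii_points),
        ("VL Gothic", ascii_points | set(range(0x3040, 0x3100))),
    ]
    strong = {"VL Gothic": "ja"}
    for lang in ("ja", "en"):
        print(lang, "->", pick_font("a", fonts, strong, lang))
```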

Comment 31 fujiwara 2009-12-09 10:01:52 UTC
For now I have filed the upstream bug on the pango side.
https://bugzilla.gnome.org/show_bug.cgi?id=604149

Comment 32 Jens Petersen 2010-03-03 09:14:36 UTC

*** This bug has been marked as a duplicate of bug 228804 ***