Bug 2013083 - 31-cantarell.conf causes inconsistent font choice for numerals and symbols when the system locale is zh-*
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: abattis-cantarell-fonts
Version: 35
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kalev Lember
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-12 04:15 UTC by vtq
Modified: 2022-01-20 14:53 UTC (History)

Fixed In Version: abattis-cantarell-fonts-0.301-5.fc36 abattis-cantarell-fonts-0.301-5.fc35
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-20 14:53:20 UTC
Type: Bug
Embargoed:


Attachments
Inconsistent font for numerals and symbols in Nautilus (35.09 KB, image/png)
2021-10-12 04:15 UTC, vtq
Inconsistent font for numerals and symbols in gnome-shell (297.74 KB, image/png)
2021-10-12 04:17 UTC, vtq
pango selecting fonts based on context (80.03 KB, image/png)
2021-10-12 04:18 UTC, vtq
Expected result (removing relevant font config) - Nautilus (34.49 KB, image/png)
2021-10-12 04:19 UTC, vtq
Expected result (removing relevant font config) - gnome-shell (313.05 KB, image/png)
2021-10-12 04:20 UTC, vtq
all locales for which "fc-match Cantarell" returns a different font (7.89 KB, text/plain)
2021-11-04 05:01 UTC, vtq


Links
System: GNOME Gitlab
ID: GNOME gnome-shell issues 4683
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2021-10-12 04:15:12 UTC

Description vtq 2021-10-12 04:15:12 UTC
Created attachment 1832054 [details]
Inconsistent font for numerals and symbols in Nautilus

Description of problem:

In Fedora Workstation, the GNOME desktop is using Cantarell as its main font. However, the following section in /etc/fonts/conf.d/31-cantarell.conf, which comes from Fedora's abattis-cantarell-fonts package, allows the font choice to be overridden:

  <match target="pattern">
    <test qual="any" name="family">
      <string>Cantarell</string>
    </test>
    <edit name="family" mode="assign" binding="weak">
      <string>Cantarell</string>
    </edit>
  </match>

$ env LANG=en_US.utf8 fc-match Cantarell
Cantarell-Regular.otf: "Cantarell" "Regular"
$ env LANG=zh_CN.utf8 fc-match Cantarell
NotoSansCJK-Regular.ttc: "Noto Sans CJK SC" "Regular"

GTK/Pango seems to select a font based on the language of the text, and that decision depends on context. When the text contains only numerals and symbols, Pango assumes the language from the locale, which seems reasonable in itself. Latin script is therefore still displayed with Cantarell, but "1234567890" is displayed with Noto Sans CJK while "1234567890 GNOME" is displayed with Cantarell (see screenshot):
$ env LANG=zh_CN.utf8 pango-view --markup -t '<span face="Cantarell" font_features="tnum" size="xx-large">1234567890</span>'
$ env LANG=zh_CN.utf8 pango-view --markup -t '<span face="Cantarell" font_features="tnum" size="xx-large">1234567890 GNOME</span>'
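
The context dependence described above can be sketched with a toy model in Python. This is not Pango's implementation: the function names (script_of, resolve_scripts, languages_used), the three-way Latin/Han/common classification, and the rule that Latin maps to "en" while everything else falls back to the locale language are all simplifying assumptions made for illustration.

```python
def script_of(ch):
    """Very rough script classification (assumption: ASCII letters and CJK Unified Ideographs only)."""
    if ch.isascii() and ch.isalpha():
        return "Latin"
    if "\u4e00" <= ch <= "\u9fff":
        return "Han"
    return None  # "common": digits, punctuation, spaces


def resolve_scripts(text):
    """Give each common character the script of its neighbours, if there are any."""
    scripts = [script_of(ch) for ch in text]
    last = None
    for i, s in enumerate(scripts):
        if s is None:
            scripts[i] = last  # inherit from the preceding character
        else:
            last = s
    # Leading common characters take the first real script in the text.
    first = next((s for s in scripts if s is not None), None)
    return [s if s is not None else first for s in scripts]


def languages_used(text, locale_lang):
    """Languages that font selection would be asked for, in order of first use."""
    langs = []
    for s in resolve_scripts(text):
        # Assumption: Latin is tagged "en"; anything else uses the locale language.
        lang = "en" if s == "Latin" else locale_lang
        if lang not in langs:
            langs.append(lang)
    return langs


print(languages_used("1234567890", "zh-cn"))
print(languages_used("1234567890 GNOME", "zh-cn"))
```

In this model the digits-only string has no script to inherit, so it is resolved with the locale language, while the same digits followed by "GNOME" inherit the Latin script. That matches the two pango-view results shown above.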

As a result, numerals and symbols in the GNOME UI are sometimes displayed with Cantarell and sometimes displayed with Noto Sans CJK (see screenshots). The inconsistency makes the UI appear broken. This is especially noticeable for the time shown in various places of the UI, because the time is displayed in Noto Sans CJK and the separator between hour and minute (a ratio symbol, which is a full-width glyph in Noto Sans CJK and many other CJK fonts) has extra blank space around it. (See https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/4683)


Version-Release number of selected component (if applicable):

0.301-3.fc35


How reproducible:

Always


Steps to Reproduce:

1. In GNOME desktop, choose "中文 臺灣", "中文 香港", or "汉语 中国" in "Settings" – "Region & Language" – "Language".
2. Log out & log in.


Actual results:

Numerals and symbols in the GNOME UI are sometimes displayed with Cantarell and sometimes displayed with Noto Sans CJK (see screenshots).


Expected results:

Numerals and symbols should be consistently displayed with the same font, ideally the same as the Latin alphabet. Since GNOME explicitly requests Cantarell as UI font, Cantarell should be used here. Noto Sans CJK would only be used as fallback for Chinese characters which are not covered by Cantarell.

Note that other common Latin fonts in the system are not overridden by Noto Sans CJK like Cantarell is:
$ env LANG=zh_CN.utf8 fc-match "DejaVu Sans"
DejaVuSans.ttf: "DejaVu Sans" "Regular"
$ env LANG=zh_CN.utf8 fc-match "Liberation Sans"
LiberationSans-Regular.ttf: "Liberation Sans" "Regular"
$ env LANG=zh_CN.utf8 fc-match "Nimbus Sans"
NimbusSans-Regular.otf: "Nimbus Sans" "Regular"

Generally, when Latin and Chinese fonts are mixed, it is common to use the selected Latin font for numbers and symbols and to fall back to the Chinese font only for glyphs it does not cover. This is seen in word processors (e.g. LibreOffice), Flatpak apps (because the Freedesktop runtime doesn't have a font config similar to 31-cantarell.conf), as well as other operating systems (e.g. macOS, Android).

By deleting the previously mentioned fontconfig section in 31-cantarell.conf, the expected results can be achieved (see screenshots).

$ env LANG=zh_CN.utf8 fc-match Cantarell
Cantarell-Regular.otf: "Cantarell" "Regular"


Additional info:

In my test without the relevant fontconfig section everything seems to be working very well and I've not noticed any adverse effect. But I'm not aware of why it was originally included in the package. 

An alternative solution is to use Noto Sans CJK for everything if the locale is zh-*. But this is less favorable because it's against upstream choice of Cantarell and needs a change in gsettings-desktop-schemas for these locales. The clock separator issue would additionally need workarounds in Nautilus and in GNOME translations to fix.

Comment 1 vtq 2021-10-12 04:17:17 UTC
Created attachment 1832055 [details]
Inconsistent font for numerals and symbols in gnome-shell

Comment 2 vtq 2021-10-12 04:18:31 UTC
Created attachment 1832056 [details]
pango selecting fonts based on context

Comment 3 vtq 2021-10-12 04:19:35 UTC
Created attachment 1832057 [details]
Expected result (removing relevant font config) - Nautilus

Comment 4 vtq 2021-10-12 04:20:01 UTC
Created attachment 1832058 [details]
Expected result (removing relevant font config) - gnome-shell

Comment 5 Akira TAGOH 2021-10-13 05:55:49 UTC
I'm not sure what the background of that config was, but (whether it is good or bad) I guess it is meant to avoid the situation where "Cantarell" is used for languages it doesn't cover, because the priority order fontconfig uses to estimate the score is roughly "family" (strong binding), "lang", and then "family" (weak binding).
"Cantarell" doesn't have zh-cn coverage. That's why the fc-match examples above behave that way with the config. You'll get the expected result if you run fc-match Cantarell:lang=en, regardless of the current locale.

In the Pango case, Pango uses the "en" lang for Latin characters but doesn't specify any lang for common characters. By design, there is no consistency in the font used for common characters if we use different fonts per language. That's why you see different behavior for numerals and symbols.

That config isn't Fedora standard. I don't mind dropping it, but we may first need to understand why it was added, to avoid regressing whatever it fixed. So we may need some discussion.
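
The priority order mentioned in this comment ("family" with strong binding, then "lang", then "family" with weak binding) can be illustrated with a small Python sketch. It is a toy model, not fontconfig's matcher: the Font class, the score and best_match functions, and the language coverage assigned to each font are hypothetical. It also assumes that the effect of the 31-cantarell.conf rule is to make the requested family weakly bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    family: str
    langs: frozenset  # assumed language coverage, for illustration only


def score(font, family, binding, lang):
    """Lower is better; tuples compare left to right, which models the priority order."""
    family_miss = 0 if font.family == family else 1
    lang_miss = 0 if lang in font.langs else 1
    strong = family_miss if binding == "strong" else 0
    weak = family_miss if binding == "weak" else 0
    return (strong, lang_miss, weak)


def best_match(fonts, family, binding, lang):
    return min(fonts, key=lambda f: score(f, family, binding, lang))


FONTS = [
    Font("Cantarell", frozenset({"en"})),
    Font("Noto Sans CJK SC", frozenset({"en", "zh-cn"})),
]

# Weak binding: the lang mismatch outranks the family match.
print(best_match(FONTS, "Cantarell", "weak", "zh-cn").family)
# Strong binding: the family match outranks the lang mismatch.
print(best_match(FONTS, "Cantarell", "strong", "zh-cn").family)
```

With the weak binding, a zh-cn request prefers the font that covers the language. With the strong binding, or with lang set to "en", Cantarell comes first, consistent with the fc-match Cantarell:lang=en remark above.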

Comment 6 Sebastian Keller 2021-10-13 10:55:37 UTC
This seems to be the history of the config:

https://gitlab.gnome.org/GNOME/cantarell-fonts/-/commit/d0a582c32d8456a2ee74f027b299c8b493bdecc9
https://gitlab.gnome.org/GNOME/cantarell-fonts/-/commit/5573252c3d789b55baab0829887aacdd5f10a795
https://src.fedoraproject.org/rpms/abattis-cantarell-fonts/c/0c9edd7285feec0c64522c7a02038c6fbed63f70?branch=rawhide

The commit message of the first commit makes sense for what got removed, but it doesn't fully explain what got added, at least not without the context of other fontconfig files that were never part of the cantarell-fonts repo. There also doesn't seem to be any related bug report on bugzilla.gnome.org that would further explain it. The commit that added it to the Fedora package doesn't provide a rationale either, other than that it got removed upstream.

Comment 7 vtq 2021-11-04 05:00:10 UTC
For languages that Cantarell has no coverage of at all, it will fall back to another font even without this config, just like the Noto Sans CJK case here. So perhaps the config was meant to prevent Cantarell from being used for languages it covers only partially, to avoid switching fonts within a word? Or maybe for certain languages where people would normally use the font for that language to display Latin script too?

Anyway, running the fc-match command against all the available locales from "localectl list-locales" shows a rather long list of locales that might be affected by this config. Is it possible to make an adjustment here only for zh- locales?
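
The locale sweep described here could be scripted roughly as follows. This is a sketch under assumptions: the helper names (parse_family, run_fc_match, affected_locales) are hypothetical, the parser expects the default fc-match output format shown earlier in this report, and clearing FC_LANG, LC_ALL and LC_CTYPE is a precaution in case those variables take precedence over LANG.

```python
import os
import re
import subprocess


def parse_family(fc_match_line):
    """Extract the first quoted field (the family) from a default fc-match output line."""
    m = re.search(r'"([^"]*)"', fc_match_line)
    return m.group(1) if m else None


def run_fc_match(locale, pattern="Cantarell"):
    """Run fc-match for one locale; requires fontconfig's fc-match on PATH."""
    env = {k: v for k, v in os.environ.items()
           if k not in ("FC_LANG", "LC_ALL", "LC_CTYPE")}
    env["LANG"] = locale
    result = subprocess.run(["fc-match", pattern], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def affected_locales(locales, pattern="Cantarell", run=run_fc_match):
    """Map each locale whose best match is not the requested family to the family chosen."""
    affected = {}
    for locale in locales:
        family = parse_family(run(locale, pattern))
        if family != pattern:
            affected[locale] = family
    return affected
```

The locale list can be taken from the output of "localectl list-locales", one locale per line, and passed to affected_locales. The run parameter exists so that the logic can be exercised with recorded fc-match output on a machine without fontconfig.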

Comment 8 vtq 2021-11-04 05:01:11 UTC
Created attachment 1839810 [details]
all locales for which "fc-match Cantarell" returns a different font

Comment 9 vtq 2022-01-05 03:14:51 UTC
I managed to put together some fontconfig changes like this:

--- 31-cantarell.conf.bak	2022-01-04 20:29:20.258428717 -0600
+++ 31-cantarell.conf	2022-01-04 20:29:31.553277713 -0600
@@ -26,6 +26,9 @@
     <test qual="any" name="family">
       <string>Cantarell</string>
     </test>
+    <test name="lang" compare="not_contains">
+      <string>zh</string>
+    </test>
     <edit name="family" mode="assign" binding="weak">
       <string>Cantarell</string>
     </edit>

This solves the issue for all my use cases. I've been using this for a while and haven't noticed any regression, and in principle it shouldn't affect other languages either.

before:

$ LANG=zh_CN.UTF-8 fc-match Cantarell -s | head -n2
NotoSansCJK-Regular.ttc: "Noto Sans CJK SC" "Regular"
Cantarell-Regular.otf: "Cantarell" "Regular"

$ LANG=zh_TW.UTF-8 fc-match Cantarell -s | head -n2
NotoSansCJK-Regular.ttc: "Noto Sans CJK TC" "Regular"
Cantarell-Regular.otf: "Cantarell" "Regular"

$ LANG=en_US.UTF-8 fc-match Cantarell -s | head -n2
Cantarell-Regular.otf: "Cantarell" "Regular"
DejaVuSans.ttf: "DejaVu Sans" "Regular"

$ LANG=ta_IN.UTF-8 fc-match Cantarell -s | head -n2
Lohit-Tamil.ttf: "Lohit Tamil" "Regular"
Cantarell-Regular.otf: "Cantarell" "Regular"

after:

$ LANG=zh_CN.UTF-8 fc-match Cantarell -s | head -n2
Cantarell-Regular.otf: "Cantarell" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans CJK SC" "Regular"

$ LANG=zh_TW.UTF-8 fc-match Cantarell -s | head -n2
Cantarell-Regular.otf: "Cantarell" "Regular"
NotoSansCJK-Regular.ttc: "Noto Sans CJK TC" "Regular"

$ LANG=en_US.UTF-8 fc-match Cantarell -s | head -n2
Cantarell-Regular.otf: "Cantarell" "Regular"
DejaVuSans.ttf: "DejaVu Sans" "Regular"

$ LANG=ta_IN.UTF-8 fc-match Cantarell -s | head -n2
Lohit-Tamil.ttf: "Lohit Tamil" "Regular"
Cantarell-Regular.otf: "Cantarell" "Regular"

Comment 10 Akira TAGOH 2022-01-13 05:06:25 UTC
I'm not sure what sort of information you need from me, but this isn't a language-specific issue, and that change would be a workaround rather than a solution.
This config was originally available upstream, but they decided to drop it, so I'm considering dropping it from Fedora as well. Let's see, and revisit if there are any regressions.

Comment 11 Fedora Update System 2022-01-13 08:33:32 UTC
FEDORA-2022-b5f96b3eea has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-b5f96b3eea

Comment 12 Fedora Update System 2022-01-13 08:34:38 UTC
FEDORA-2022-b5f96b3eea has been pushed to the Fedora 36 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 13 Akira TAGOH 2022-01-13 08:37:50 UTC
Please test. If it works fine, I'll push the updates for f35 too. Thanks.

Comment 14 vtq 2022-01-15 19:16:41 UTC
Thanks! I installed the updated package on F35. It works fine and does fix the issue.

Comment 15 Akira TAGOH 2022-01-17 09:03:28 UTC
Thank you for testing. I'll push the updates for f35 soon.

Comment 16 Fedora Update System 2022-01-17 11:03:07 UTC
FEDORA-2022-659f17943d has been submitted as an update to Fedora 35. https://bodhi.fedoraproject.org/updates/FEDORA-2022-659f17943d

Comment 17 Fedora Update System 2022-01-18 01:44:14 UTC
FEDORA-2022-659f17943d has been pushed to the Fedora 35 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2022-659f17943d`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-659f17943d

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 18 Fedora Update System 2022-01-20 14:53:20 UTC
FEDORA-2022-659f17943d has been pushed to the Fedora 35 stable repository.
If problem still persists, please make note of it in this bug report.

