Bug 2262410 - Fonts are looking wrong after 20240101 update
Summary: Fonts are looking wrong after 20240101 update
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: libass
Version: 39
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Neal Gompa
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-02-02 15:42 UTC by Łukasz Patron
Modified: 2025-09-10 16:37 UTC
CC List: 14 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-11-27 22:50:31 UTC
Type: ---
Embargoed:
petersen: mirror+


Attachments
screenshot for freetype code (6.89 KB, image/png)
2024-03-11 10:57 UTC, Akira TAGOH


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FC-1130 0 None None None 2024-03-01 06:51:36 UTC

Description Łukasz Patron 2024-02-02 15:42:08 UTC
Fonts render incorrectly in multiple places, e.g. the mpv OSD or the KDE file picker.

Reproducible: Always

Steps to Reproduce:
1. Open mpv and look at OSD fonts
Actual Results:  
Fonts look wrong

Expected Results:  
Fonts look correct

See https://github.com/mpv-player/mpv/issues/13396 for screenshots.

Comment 1 faustian 2024-02-29 18:52:14 UTC
Compared to the previous version, 56-google-noto-sans-arabic-vf.conf lacks the language match test.

diff /tmp/20230801/65-0-google-noto-sans-arabic-vf.conf /tmp/20240101/56-google-noto-sans-arabic-vf.conf
5,7d4
<     <test name="lang" compare="contains">
<       <string>ar</string>
<     </test>

Comment 2 Łukasz Patron 2024-02-29 18:56:50 UTC
(In reply to faustian from comment #1)
> Comparing to last version 56-google-noto-sans-arabic-vf.conf lacks language
> match test.
> 
> diff /tmp/20230801/65-0-google-noto-sans-arabic-vf.conf
> /tmp/20240101/56-google-noto-sans-arabic-vf.conf
> 5,7d4
> <     <test name="lang" compare="contains">
> <       <string>ar</string>
> <     </test>

I can confirm that adding the language check back to /usr/share/fontconfig/conf.avail/56-google-noto-sans-arabic-vf.conf fixes my issue.

Comment 3 Łukasz Patron 2024-02-29 19:07:41 UTC
(In reply to Łukasz Patron from comment #2)
> (In reply to faustian from comment #1)
> > Comparing to last version 56-google-noto-sans-arabic-vf.conf lacks language
> > match test.
> > 
> > diff /tmp/20230801/65-0-google-noto-sans-arabic-vf.conf
> > /tmp/20240101/56-google-noto-sans-arabic-vf.conf
> > 5,7d4
> > <     <test name="lang" compare="contains">
> > <       <string>ar</string>
> > <     </test>
> 
> Can confirm that adding language check to
> /usr/share/fontconfig/conf.avail/56-google-noto-sans-arabic-vf.conf fixes my
> issue.

Opened PR @ https://src.fedoraproject.org/rpms/google-noto-fonts/pull-request/7

Comment 4 Akira TAGOH 2024-03-05 02:18:18 UTC
What's your locale?

Comment 5 Łukasz Patron 2024-03-05 08:11:49 UTC
(In reply to Akira TAGOH from comment #4)
> What's your locale?

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Comment 6 Akira TAGOH 2024-03-05 08:53:09 UTC
That is weird. Noto Sans Arabic doesn't have "en" coverage. I can't reproduce this except when running with --osd-font="Noto Sans Arabic". Do you have any config of your own? How can I reproduce this?
I tried `LANG=en_US.UTF-8 mpv --player-operation-mode=pseudo-gui filename-mixing-12345.mp4` and it works as expected, but `LANG=en_US.UTF-8 mpv --player-operation-mode=pseudo-gui --osd-font="Noto Sans Arabic" filename-mixing-12345.mp4` does not.

FWIW Noto Sans Arabic doesn't have alphabet glyphs. apparently libass is falling back to reuder alphabets. though Noto Sans and Noto Sans Arabic has same outline for numeric characters. If mpv use Noto Sans as a fallback, the rendering should be like your screenshot. I suspect something may went wrong in libass.

Comment 7 Akira TAGOH 2024-03-05 08:55:49 UTC
> FWIW Noto Sans Arabic doesn't have alphabet glyphs. apparently libass is falling back to reuder alphabets. though Noto Sans and Noto Sans Arabic has same outline for numeric characters. If mpv use Noto Sans as a fallback, the rendering should be like your screenshot. I suspect something may went wrong in libass.

Doh.

I meant:

FWIW Noto Sans Arabic doesn't have alphabet glyphs. Apparently libass is falling back to RENDER alphabets, though Noto Sans and Noto Sans Arabic have the same outlines for numeric characters. If mpv uses Noto Sans as a fallback, the rendering should NOT be like your screenshot. I suspect something may have gone wrong in libass.

Comment 8 Łukasz Patron 2024-03-05 09:03:21 UTC
(In reply to Akira TAGOH from comment #6)
> That is weird. Noto Sans Arabic doesn't have "en" covarage. I can't
> reproduce this except running with --osd-font="Noto Sans Arabic".  Do you
> have any your own config? how can I reproduce this?
> I tried `LANG=en_US.UTF-8 mpv --player-operation-mode=pseudo-gui
> filename-mixing-12345.mp4` but it works as expected. but `LANG=en_US.UTF-8
> mpv --player-operation-mode=pseudo-gui --osd-font="Noto Sans Arabic"
> filename-mixing-12345.mp4` not.
> 
> FWIW Noto Sans Arabic doesn't have alphabet glyphs. apparently libass is
> falling back to reuder alphabets. though Noto Sans and Noto Sans Arabic has
> same outline for numeric characters. If mpv use Noto Sans as a fallback, the
> rendering should be like your screenshot. I suspect something may went wrong
> in libass.

Here are my steps to reproduce:

1. Boot up Fedora-Workstation-Live-x86_64-Rawhide-20240227.n.0.iso
2. Leave the locale settings as they are; just click Next a few times until you reach the live desktop
3. sudo dnf install mpv
4. Take screenshot
5. mpv ~/Pictures/Screenshots/*
6. Notice that fonts for numbers are wrong

Comment 9 Łukasz Patron 2024-03-05 09:03:39 UTC
(In reply to Łukasz Patron from comment #8)
> (In reply to Akira TAGOH from comment #6)
> > That is weird. Noto Sans Arabic doesn't have "en" covarage. I can't
> > reproduce this except running with --osd-font="Noto Sans Arabic".  Do you
> > have any your own config? how can I reproduce this?
> > I tried `LANG=en_US.UTF-8 mpv --player-operation-mode=pseudo-gui
> > filename-mixing-12345.mp4` but it works as expected. but `LANG=en_US.UTF-8
> > mpv --player-operation-mode=pseudo-gui --osd-font="Noto Sans Arabic"
> > filename-mixing-12345.mp4` not.
> > 
> > FWIW Noto Sans Arabic doesn't have alphabet glyphs. apparently libass is
> > falling back to reuder alphabets. though Noto Sans and Noto Sans Arabic has
> > same outline for numeric characters. If mpv use Noto Sans as a fallback, the
> > rendering should be like your screenshot. I suspect something may went wrong
> > in libass.
> 
> Here are my steps to reproduce:
> 
> 1. Boot up Fedora-Workstation-Live-x86_64-Rawhide-20240227.n.0.iso
> 2. Leave locale settings as is, just click next few times until you're using
> live desktop
> 3. sudo dnf install mpv
> 4. Take screenshot
> 5. mpv ~/Pictures/Screenshots/*
> 6. Notice that fonts for numbers are wrong

Forgot to post screenshot: https://i.imgur.com/e6htvMy.png

Comment 10 Akira TAGOH 2024-03-07 08:15:35 UTC
Well, I'd say this is actually an mpv issue. I see they intentionally ignore the lang object in the fontconfig cache and process fonts in their own way. Picking Noto Sans Arabic in an en_US locale doesn't make sense.
Plus, even when Noto Sans and Noto Sans Arabic are mixed for alphabetic and numeric characters, pango-view, e.g. `pango-view --markup --text '<span fallback="false"><span font="Noto Sans 20">ABC</span><span font="Noto Sans Arabic 20">01234</span></span>'`, and similar HTML in web browsers don't render like that.

Therefore it should be an mpv issue. Reassigning.

Comment 11 Łukasz Patron 2024-03-07 09:47:43 UTC
(In reply to Akira TAGOH from comment #10)
> Well, I'd say this is mpv issue actually. I see they intentionally ignore
> lang object in a fontconfig cache and process fonts in their own way.
> Picking up Noto Sans Arabic on en_US locale doesn't make sense.
> Plus, even if mixing up with Noto Sans and Noto Sans Arabic all together for
> alphabets and numeric characters, pango-view such as `pango-view --markup
> --text '<span fallback="false"><span font="Noto Sans 20">ABC</span><span
> font="Noto Sans Arabic 20">01234</span></span>'` and similar html on web
> browsers doesn't render like it.
> 
> Therefore it should be mpv issue. Reassigning.

Are they really ignoring the lang object? As far as I can tell, you removed it in this change https://src.fedoraproject.org/rpms/google-noto-fonts/c/609f6bc1323476289beda736ce3912b73b8900ad?branch=rawhide and after restoring it, mpv works just fine.

Comment 12 Akira TAGOH 2024-03-07 10:32:03 UTC
Sorry, I meant libass, not mpv. See https://github.com/libass/libass/blob/649a7c2e1fc6f4188ea1a89968560715800b883d/libass/ass_fontconfig.c#L209

> Are they really ignoring the lang object? As far as I can say, you removed it in this change https://src.fedoraproject.org/rpms/google-noto-fonts/c/609f6bc1323476289beda736ce3912b73b8900ad?branch=rawhide and after restoring it, mpv works just fine.

fontconfig itself does handle it. That's why it works after adding lang="ar" back: in that case, fontconfig drops the font because you aren't in an ar locale. But the problem is that Noto Sans Arabic isn't for ar only. This is why I removed the test there.

Anyway, apparently libass doesn't rely on fontconfig to get the best font for a request. It just gathers the font list and uses whatever it believes is the best font for the context, which it isn't in this case. So this should be fixed in libass.

Comment 13 Łukasz Patron 2024-03-07 11:09:23 UTC
(In reply to Akira TAGOH from comment #12)
> Sorry, correctly libass. see
> https://github.com/libass/libass/blob/
> 649a7c2e1fc6f4188ea1a89968560715800b883d/libass/ass_fontconfig.c#L209
> 
> > Are they really ignoring the lang object? As far as I can say, you removed it in this change https://src.fedoraproject.org/rpms/google-noto-fonts/c/609f6bc1323476289beda736ce3912b73b8900ad?branch=rawhide and after restoring it, mpv works just fine.
> 
> fontconfig does. that's why it works after adding lang="ar" back. In that
> case, fontconfig drops it because you aren't on ar locale. but the problem
> is that Noto Sans Arabic isn't for ar only. this was why I removed it there.
> 
> Anyway, apparently libass doesn't rely on fontconfig to get the best font
> against requests. they just gather font list instead and use something they
> believe it is a best font for context, which isn't in this case. then it
> should be fixed in libass.

It doesn't seem like libass is used for rendering the mpv UI. I recompiled libass.so with `FcPatternDel(pat, FC_LANG);` removed and ran `LD_LIBRARY_PATH=/tmp/tmp.TGPiZYtOoQ/libass/libass/.libs mpv ...`. I confirmed that my custom libass.so is being loaded, but the issue is still there.

Comment 14 Łukasz Patron 2024-03-07 11:19:49 UTC
(In reply to Łukasz Patron from comment #13)
> (In reply to Akira TAGOH from comment #12)
> > Sorry, correctly libass. see
> > https://github.com/libass/libass/blob/
> > 649a7c2e1fc6f4188ea1a89968560715800b883d/libass/ass_fontconfig.c#L209
> > 
> > > Are they really ignoring the lang object? As far as I can say, you removed it in this change https://src.fedoraproject.org/rpms/google-noto-fonts/c/609f6bc1323476289beda736ce3912b73b8900ad?branch=rawhide and after restoring it, mpv works just fine.
> > 
> > fontconfig does. that's why it works after adding lang="ar" back. In that
> > case, fontconfig drops it because you aren't on ar locale. but the problem
> > is that Noto Sans Arabic isn't for ar only. this was why I removed it there.
> > 
> > Anyway, apparently libass doesn't rely on fontconfig to get the best font
> > against requests. they just gather font list instead and use something they
> > believe it is a best font for context, which isn't in this case. then it
> > should be fixed in libass.
> 
> Doesn't seem like libass is used for rendering mpv UI, I recompiled
> libass.so with `FcPatternDel(pat, FC_LANG);` removed and ran
> `LD_LIBRARY_PATH=/tmp/tmp.TGPiZYtOoQ/libass/libass/.libs mpv ...`, confirmed
> that my custom libass.so is being loaded, but the issue is still there.

Never mind, it is used, but commenting out that line still makes no difference.

Comment 15 Akira TAGOH 2024-03-07 11:41:38 UTC
Well, this is not that easy to fix. It seems that they use FC_CHARSET to check the coverage and select a font (see https://github.com/libass/libass/blob/master/libass/ass_fontselect.c#L722 and https://github.com/libass/libass/blob/master/libass/ass_fontconfig.c#L55), though I don't know why the two fonts end up with different bounding boxes (or maybe different sizes are used? dunno).

Comment 16 Oleg Oshmyan 2024-03-09 13:45:34 UTC
libass uses Fontconfig's FcConfigSubstitute to convert mpv's "sans-serif" font name to a real font:
https://github.com/libass/libass/blob/649a7c2e1fc6f4188ea1a89968560715800b883d/libass/ass_fontconfig.c#L286

Fedora's config says Noto Sans Arabic is the default sans-serif font:

>      { alias="sans-serif", variable=true, family="Sans Arabic",
>         obsoletes={ "sans-arabic-ui-vf" },
>         default=true
>       },

Noto Sans Arabic does contain glyphs for European digits, so they are used.

What else could libass possibly do?

On the mpv tracker, OP also said he was seeing weird fonts in other applications besides mpv:
https://github.com/mpv-player/mpv/issues/13396#issuecomment-1924131260

Comment 17 Łukasz Patron 2024-03-09 13:59:10 UTC
(In reply to Oleg Oshmyan from comment #16)
> libass uses Fontconfig's FcConfigSubstitute to convert mpv's "sans-serif"
> font name to a real font:
> https://github.com/libass/libass/blob/
> 649a7c2e1fc6f4188ea1a89968560715800b883d/libass/ass_fontconfig.c#L286
> 
> Fedora's config says Noto Sans Arabic is the default sans-serif font:
> 
> >      { alias="sans-serif", variable=true, family="Sans Arabic",
> >         obsoletes={ "sans-arabic-ui-vf" },
> >         default=true
> >       },
> 
> Noto Sans Arabic does contain glyphs for European digits, so they are used.
> 
> What else could libass possibly do?
> 
> On the mpv tracker, OP also said he was seeing weird fonts in other
> applications besides mpv:
> https://github.com/mpv-player/mpv/issues/13396#issuecomment-1924131260

Yeah, the thing I noticed was that after Noto Sans update, KDE file picker rows got higher — https://imgur.com/a/6dBRYLg (before; after).

Comment 18 Oleg Oshmyan 2024-03-09 14:02:28 UTC
> but the problem is that Noto Sans Arabic isn't for ar only. this was why I removed it there.

FYI I believe Fontconfig allows multiple language tags to be specified per font. I suspect this is probably what should be done here: list all the system locales for which Sans Arabic should be the default.


I must also say I'm confused by the it's-Fontconfig-it's-libass back-and-forth earlier in this discussion. For a start, the lang element has been removed, so it makes no difference whether libass or Fontconfig handle it or ignore it. At the same time, if the element is restored, it's been shown that the font configuration does work as expected in libass, so this tells us that *either* (or both) Fontconfig or libass handle the tag but doesn't tell us which of the two does it. As a matter of fact, Fontconfig's pattern substitution is the one that handles the tag in this scenario, but Fontconfig doesn't remove the font from the complete font list seen by libass: it would be impossible to render Arabic subtitles if it did.

Comment 19 Akira TAGOH 2024-03-11 08:29:02 UTC
> Yeah, the thing I noticed was that after Noto Sans update, KDE file picker rows got higher — https://imgur.com/a/6dBRYLg (before; after).

Apparently that is a different issue.

Comment 20 Łukasz Patron 2024-03-11 08:55:07 UTC
(In reply to Akira TAGOH from comment #19)
> > Yeah, the thing I noticed was that after Noto Sans update, KDE file picker rows got higher — https://imgur.com/a/6dBRYLg (before; after).
> 
> Apparently it is different issue.

Well, it only happens when Noto Sans Arabic is missing lang=ar.

Comment 21 Akira TAGOH 2024-03-11 10:57:13 UTC
Created attachment 2021107
screenshot for freetype code

> FYI I believe Fontconfig allows multiple language tags to be specified per font. I suspect this is probably what should be done here: list all the system locales for which Sans Arabic should be the default.

Well, that might reduce the scope of the problem, but it doesn't completely fix this issue because it will still happen for those locales. This isn't a chicken-and-egg problem. The rendering issue is entirely out of scope for fontconfig.

> Fontconfig doesn't remove the font from the complete font list seen by libass

No, not exactly. What libass does is obtain the list of *all* fonts for sans-serif, no matter what their coverage is. As I said earlier, libass has its own function to filter out fonts. There is nothing else fontconfig can do with that approach. Plus, I don't see any problems when rendering a mix of Noto Sans and Noto Sans Arabic with Pango or web browsers, for example (though I don't know whether mpv really uses Noto Sans for alphabets and Noto Sans Arabic for digits). I suppose it isn't even a font issue.

I tried rendering both fonts together with simple FreeType code (the baseline isn't straight, but that may be a mistake in my code), and the rendering doesn't look like the original screenshot.

https://gist.github.com/tagoh/447f847f3f651695482508bb40980a33

So I still suspect it may be a libass issue.

Comment 22 Akira TAGOH 2024-03-11 11:17:24 UTC
(In reply to Łukasz Patron from comment #20)
> (In reply to Akira TAGOH from comment #19)
> > > Yeah, the thing I noticed was that after Noto Sans update, KDE file picker rows got higher — https://imgur.com/a/6dBRYLg (before; after).
> > 
> > Apparently it is different issue.
> 
> Well, it only happens when Noto Sans Arabic is missing lang=ar.

Again, Noto Sans Arabic doesn't have "en" coverage and in theory isn't supposed to be used for Latin characters, regardless of whether the lang="ar" line is there or not, *unless* applications do something special.

From fontconfig's point of view, the following results say it all:
$ fc-match sans-serif:lang=en
NotoSans-Regular.ttf: "Noto Sans" "Regular"
$ fc-match sans-serif:lang=ar
NotoSansArabic[wght].ttf: "Noto Sans Arabic" "Regular"

Also:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$ fc-match sans-serif:charset=0x30
NotoSans-Regular.ttf: "Noto Sans" "Regular"
$ fc-match sans-serif:charset=0x41
NotoSans-Regular.ttf: "Noto Sans" "Regular"

$ export LANG=ar_DZ.UTF-8
$ locale
LANG=ar_DZ.UTF-8
LC_CTYPE="ar_DZ.UTF-8"
LC_NUMERIC="ar_DZ.UTF-8"
LC_TIME="ar_DZ.UTF-8"
LC_COLLATE="ar_DZ.UTF-8"
LC_MONETARY="ar_DZ.UTF-8"
LC_MESSAGES="ar_DZ.UTF-8"
LC_PAPER="ar_DZ.UTF-8"
LC_NAME="ar_DZ.UTF-8"
LC_ADDRESS="ar_DZ.UTF-8"
LC_TELEPHONE="ar_DZ.UTF-8"
LC_MEASUREMENT="ar_DZ.UTF-8"
LC_IDENTIFICATION="ar_DZ.UTF-8"
LC_ALL=
$ fc-match sans-serif:charset=0x30
NotoSansArabic[wght].ttf: "Noto Sans Arabic" "Regular"
$ fc-match sans-serif:charset=0x41
Vazirmatn[wght].ttf: "Vazirmatn" "Regular"

Comment 23 Akira TAGOH 2024-03-11 12:06:35 UTC
Hmm, this issue doesn't seem to happen with google-noto-sans-arabic-fonts in place of google-noto-sans-arabic-vf-fonts.

So libass doesn't support variable fonts well.

Comment 24 Oleg Oshmyan 2024-03-11 19:46:46 UTC
> > Fontconfig doesn't remove the font from the complete font list seen by libass
>
> No, not exactly. What libass does is to obtain *all* the font list for sans-serif no matter what coverage they are. As I said earlier, libass has an own function to filter out fonts.

The list returned by Fontconfig is ordered by priority. libass uses the first (highest-priority) font that satisfies the request. The choice of priority lies entirely with Fontconfig.

Then, libass looks up this font in the full list of fonts installed on the system, which includes Arabic fonts regardless of locale, because it does want to be able to render Arabic files on any system.

> Again, Noto Sans Arabic doesn't have "en" coverage and isn't supposed to be used for Latin characters in theory, regardless of whether there are lang="ar" line or not, *except* applications do the special things.

But Latin *isn't the problem*. It's digits. Noto Sans Arabic contains digit glyphs. When it is the system's default font, applications have no reason not to use those glyphs.

Application says: draw this number in sans-serif
Fontconfig says: the top sans-serif font is Noto Sans Arabic
libass says: OK, here's your number set in Noto Sans Arabic

> libass doesn't support variable fonts well then.

libass doesn't support variable fonts at all, except in a few edge cases where Fontconfig makes them look like dedicated physical fonts.
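
A minimal sketch of the selection flow described in this comment, written in Python purely for illustration (libass itself is C and is not called here). The coverage sets and the two priority orders are hypothetical stand-ins, chosen only to match what the thread states: Noto Sans Arabic has European digits but no Latin letters. `pick_font` and `fonts_for` are made-up helper names.

```python
def pick_font(codepoint, ordered_families, coverage):
    """Return the first family, in priority order, whose charset has the code point."""
    for family in ordered_families:
        if codepoint in coverage.get(family, set()):
            return family
    return None


# Hypothetical coverage: digits plus the Arabic block vs. basic Latin.
coverage = {
    "Noto Sans Arabic": set(range(0x30, 0x3A)) | set(range(0x600, 0x700)),
    "Noto Sans": set(range(0x20, 0x7F)),
}

# Assumed priority orders: Arabic first when the lang test is missing,
# Noto Sans first when the test keeps Arabic out of non-ar locales.
order_without_lang_test = ["Noto Sans Arabic", "Noto Sans"]
order_with_lang_test = ["Noto Sans", "Noto Sans Arabic"]


def fonts_for(text, order):
    """Pick a font per character, the way a per-glyph fallback would."""
    return [pick_font(ord(ch), order, coverage) for ch in text]


mixed = fonts_for("A1", order_without_lang_test)
uniform = fonts_for("A1", order_with_lang_test)
print(mixed)    # letters fall through to Noto Sans, digits stay in Arabic
print(uniform)  # everything comes from Noto Sans
```

Under these assumptions the digit is taken from the first font in the list simply because that font has a glyph for it, which is the behavior the three "says" lines above describe.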

Comment 25 Oleg Oshmyan 2024-03-11 23:54:17 UTC
What libass could, in fact, do better here is it could better match the size of the primary font (Noto Sans Arabic) used for the numerals and the fallback font used for the Latin, so that the apparent font size at least wouldn't jerk around like that (on this pair of fonts at least... there's never any guarantee about size consistency when using two different fonts). But I don't see how it could be expected to not choose Arabic in the first place.

The thing with the file picker likely has the same cause: Noto Sans Arabic has the same units-per-EM size and the same glyph sizes in units as Noto Sans, but it declares its ascender and descender heights to be much larger than Noto Sans does. Possibly to accommodate more complex Arabic glyphs; I don't know. But this directly affects the line height, because, well, that's exactly what those values control.

The reason the glyphs in the file picker look the same (just a larger line height) whereas the glyphs in mpv look different (smaller) is that they determine font size differently: the file picker sets a fixed glyph size and uses whatever line height it results in, whereas libass does the opposite: it sets a fixed line height and uses whatever glyphs it results in (this is by design, because libass is an ASS subtitle renderer and this is how ASS defines font size).
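
The two sizing strategies can be compared with a little arithmetic. This is a sketch under stated assumptions: line height is modelled as ascender plus descender, and the metric values below are hypothetical round numbers, not the real metrics of Noto Sans or Noto Sans Arabic. The function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontMetrics:
    units_per_em: int
    ascender: int   # font units above the baseline
    descender: int  # font units below the baseline, as a positive number

    @property
    def line_units(self) -> int:
        # Assumed model: line height = ascender + descender, no line gap.
        return self.ascender + self.descender


def line_height_for_em(m: FontMetrics, em_px: float) -> float:
    """Fixed glyph size (file-picker style): the line height follows."""
    return em_px * m.line_units / m.units_per_em


def em_for_line_height(m: FontMetrics, line_px: float) -> float:
    """Fixed line height (ASS style): the glyph size follows."""
    return line_px * m.units_per_em / m.line_units


# Hypothetical metrics: same em size, but the second font declares
# much larger ascender/descender values.
latin = FontMetrics(units_per_em=1000, ascender=1000, descender=300)
arabic = FontMetrics(units_per_em=1000, ascender=1400, descender=600)

picker_latin = line_height_for_em(latin, 20)    # rows for 20 px glyphs
picker_arabic = line_height_for_em(arabic, 20)  # taller rows, same glyphs
ass_latin = em_for_line_height(latin, 26)       # glyph size for a 26 px line
ass_arabic = em_for_line_height(arabic, 26)     # smaller glyphs, same line
print(picker_latin, picker_arabic, ass_latin, ass_arabic)
```

With these made-up numbers the fixed-glyph-size approach yields taller rows for the second font, while the fixed-line-height approach yields smaller glyphs, matching the two symptoms described above.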

Comment 26 Akira TAGOH 2024-03-12 04:01:54 UTC
(In reply to Oleg Oshmyan from comment #24)
> The list returned by Fontconfig is ordered by priority. libass uses the
> first (highest-priority) font that satisfies the request. The choice of
> priority lies entirely with Fontconfig.

No, that isn't true either. It sounds like there is a gap in understanding here.
It isn't actually "ordered by priority". The FcPattern built by FcConfigSubstitute() is just the result of going through all the enabled config. It is nothing more than that.
It doesn't even guarantee that the fonts are available on the system.
fontconfig builds the best font or font list with FcFontMatch/FcFontSort against that pattern, and that refines the result. This is what libass is missing here.

> But Latin *isn't the problem*. It's digits. Noto Sans Arabic contains digit
> glyphs. When it is the system's default font, applications have no reason
> not to use those glyphs.
> 
> Application says: draw this number in sans-serif
> Fontconfig says: the top sans-serif font is Noto Sans Arabic
> libass says: OK, here's your number set in Noto Sans Arabic

As I demonstrated above, fontconfig chose Noto Sans without lang="ar". This is a libass issue: it uses the fontconfig API improperly.

(In reply to Oleg Oshmyan from comment #25)
> What libass could, in fact, do better here is it could better match the size
> of the primary font (Noto Sans Arabic) used for the numerals and the
> fallback font used for the Latin, so that the apparent font size at least
> wouldn't jerk around like that (on this pair of fonts at least... there's
> never any guarantee about size consistency when using two different fonts).
> But I don't see how it could be expected to not choose Arabic in the first
> place.

How about simply looking at "lang" for a font? The "lang" object indicates which languages a font's character coverage can represent. "en" loosely covers the Latin characters. If you know which characters represent a language, you don't need to check them against FC_CHARSET one by one.

Comment 27 Oleg Oshmyan 2024-03-12 11:12:38 UTC
Why do you keep mentioning English? libass is looking for a font to display numbers in. There's no English involved.

> How about simply taking a look at "lang" for a font?

It's empty. You've removed it.

> "lang" object indicates what character coverage is needed to represent for. "en" almost covers Latin characters loosely. If you know what characters represent a language, you don't need to check it with FC_CHARSET one by one.

libass still needs to look up every glyph. Language coverage has some correlation with glyph coverage but one does not guarantee the other. Font designers can't draw every single glyph; Unicode keeps getting new code points; languages themselves are loosely defined (are digits "English"?); fonts can be subset; etc.

Comment 28 Oleg Oshmyan 2024-03-12 11:32:47 UTC
Ah, you must mean the other kind of lang object, the one Fontconfig extracts from each font. I'm actually unsure how it does this; isn't it still built from the fonts' built-in code page or Unicode coverage bits? Anyway, I don't see how that helps.

> Language coverage has some correlation with glyph coverage but one does not guarantee the other.

And more than anything, fonts' language properties are useful to determine what shape the glyphs have rather than what glyphs are available. For example, some CJK code points look different in different variants of Chinese and Japanese, some Cyrillic code points look different in Bulgarian, Serbian and Russian, and even some Latin code points look different in Europe and elsewhere. Correspondingly, if one knows the text being rendered is in a particular language, one should prefer a font that is designed for that language.

But this doesn't mean that one should stop checking whether a glyph actually exists in a font, and when no particular language is involved, libass has no particular reason to look for an English font when another font fits the bill.

> It isn't actually "ordered by priority". FcPattern built by FcConfigSubstitute() is just a result going through all config enabled. There are nothing more than that.
> It doesn't guarantee that it is available on the system even.
> fontconfig tries to build a best font/font list with FcFontMatch/FcFontSort against it. the result will be polished. This is what libass is missing here.

Side note: it would help if Fontconfig documentation actually explained this.

So basically we just need to add a FcFontSort call immediately after the FcConfigSubstitute?

How *does* Fontconfig decide to prefer Noto Sans upon FcFontSort in this case? Does it try to find a font that covers the system locale's language?

Comment 29 Oleg Oshmyan 2024-03-12 11:36:58 UTC
Seeing the __libass_delimiter magic though, I suspect this isn't quite that easy.

Comment 30 Akira TAGOH 2024-03-15 08:52:24 UTC
(In reply to Oleg Oshmyan from comment #28)
> Ah, you must mean the other kind of lang object, the one Fontconfig extracts
> from each font. I'm actually unsure how it does this; isn't it still built
> from the fonts' built-in code page or Unicode coverage bits? Anyway, I don't
> see how that helps.

There is one in the SFNT table, right. But no, it isn't. As I said, the lang object in fontconfig is loosely estimated from the real charset coverage.

> And more than anything, fonts' language properties are useful to determine
> what shape the glyphs have rather than what glyphs are available. For
> example, some CJK code points look different in different variants of
> Chinese and Japanese, some Cyrillic code points look different in Bulgarian,
> Serbian and Russian, and even some Latin code points look different in
> Europe and elsewhere. Correspondingly, if one knows the text being rendered
> is in a particular language, one should prefer a font that is designed for
> that language.

Well, that is a different thing. They are assigned to the same code point but get different shapes via GSUB. fontconfig doesn't get involved in rendering at all. It is out of scope.

> But this doesn't mean that one should stop checking whether a glyph actually
> exists in a font, and when no particular language is involved, libass has no
> particular reason to look for an English font when another font fits the
> bill.

I'm not sure what is correct, but what if:

a. font A is missing only a glyph X in a Unicode block M
b. font B has full coverage for M
c. font A somehow comes first

In the current implementation, glyph X will be rendered with font B and the others with font A. I think this is what is actually happening in libass.

However, given that X is a key code point for representing lang L, this can be simplified because:

a. font A is missing glyph X, so it doesn't satisfy lang L.
b. font B has full coverage for M, so it satisfies lang L.
c. font B will be picked if lang L is requested.
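
The font A / font B scenario can be sketched as follows. This is an illustrative Python model, not fontconfig or libass code: the block M, the missing glyph, and the helper names (`per_glyph_fallback`, `satisfies_lang`, `pick_by_lang`) are all hypothetical, and "satisfying lang L" is simplified to "covers every key character of L".

```python
def per_glyph_fallback(text, order, coverage):
    """Choose a font per character: the first font in order that has it."""
    return [
        (ch, next((f for f in order if ch in coverage[f]), None))
        for ch in text
    ]


def satisfies_lang(family, key_chars, coverage):
    """A font satisfies a language if it covers all of its key characters."""
    return key_chars <= coverage[family]


def pick_by_lang(order, key_chars, coverage):
    """Choose one font for the whole run: the first that satisfies the language."""
    return next((f for f in order if satisfies_lang(f, key_chars, coverage)), None)


# Hypothetical block M with five characters; font A lacks only "c" (glyph X).
block_m = set("abcde")
coverage = {"A": block_m - {"c"}, "B": set(block_m)}
order = ["A", "B"]  # font A somehow comes first

mixed = per_glyph_fallback("abc", order, coverage)
whole = pick_by_lang(order, block_m, coverage)
print(mixed)  # a and b from A, c from B: two fonts in one run
print(whole)  # B alone covers the language
```

The per-glyph approach mixes two fonts within one run, while the language-level check selects a single font for all of it, which is the simplification proposed above.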

> > It isn't actually "ordered by priority". FcPattern built by FcConfigSubstitute() is just a result going through all config enabled. There are nothing more than that.
> > It doesn't guarantee that it is available on the system even.
> > fontconfig tries to build a best font/font list with FcFontMatch/FcFontSort against it. the result will be polished. This is what libass is missing here.
> 
> Side note: it would help if Fontconfig documentation actually explained this.

Sure.

> So basically we just need to add a FcFontSort call immediately after the
> FcConfigSubstitute?

with FcDefaultSubstitute() too.

> How *does* Fontconfig decide to prefer Noto Sans upon FcFontSort in this
> case? Does it try to find a font that covers the system locale's language?

That depends on how one builds the request FcPattern. The system locale is one hint about which characters the user may want to see. If an application has multilingual support and knows which language the text is in, it can of course set that instead of the system locale. You can check with "fc-match -s :lang=XX" for a lang code XX; this calls FcFontSort. For example:

$ fc-match -s :lang=en | head -3
NotoSans[wght].ttf: "Noto Sans" "Regular"
NotoSans-Italic[wght].ttf: "Noto Sans" "Italic"
NotoSansArabic[wght].ttf: "Noto Sans Arabic" "Regular"
$ fc-match -s :lang=ar | head -3
NotoSansArabic[wght].ttf: "Noto Sans Arabic" "Regular"
Vazirmatn[wght].ttf: "Vazirmatn" "Regular"
NotoSans[wght].ttf: "Noto Sans" "Regular"
$ fc-match -s :lang=ja | head -3
NotoSansCJK-VF.ttc: "Noto Sans CJK JP" "Regular"
NotoSans[wght].ttf: "Noto Sans" "Regular"
NotoSans-Italic[wght].ttf: "Noto Sans" "Italic"
$ fc-match -s :lang=hi | head -3
NotoSansDevanagari[wght].ttf: "Noto Sans Devanagari" "Regular"
NotoSans[wght].ttf: "Noto Sans" "Regular"
NotoSans-Italic[wght].ttf: "Noto Sans" "Italic"


(In reply to Oleg Oshmyan from comment #29)
> Seeing the __libass_delimiter magic though, I suspect this isn't quite that
> easy.

I don't know. What is it for?

Comment 31 Oleg Oshmyan 2024-03-15 13:49:46 UTC
> they are assigned to the same codepoint but have different shape with GSUB. fontconfig doesn't get involved with the rendering thing at all. it is out of the scope.

No, I mean fonts that are designed to support a single language. A font _can_ provide good support for everything at once via GSUB, but such global support isn't all that common and more often you find that you have e.g. a Japanese font, a mainland Chinese font and a Hong Kong Chinese font with near-identical glyph coverage but different shapes. My impression has been that this is (at least part of) why fonts.conf allows specifying what languages a font should be preferred for.

Indeed, Noto Sans itself is a counterexample. I don't understand why it isn't all a single font, but it isn't: instead, we have a myriad of language-specific fonts with distinct names.

> However, given that X is key codepoint to represent lang L, it can be simplified because:
>
> a. font A is missing a glyph X so it doesn't satisfy lang L.
> b. font B has full coverage for M so it satisfies lang L.
> c. font B will be picked up if requesting lang L.

This might be useful when a lang L is requested, but again, the case here doesn't request any particular language at all. All we know is that we need "the `sans-serif` font". We could explicitly fetch the current system locale in libass and ask Fontconfig to look for this language, but I guess we kinda expected that Fontconfig already does this for substitutions/aliases on its own.

> I don't know what is it for?

Honestly, neither do I. We'll need to investigate. But that's why it scares me: given that someone put it in place to begin with, it seems likely that it does _something_.

Comment 32 Akira TAGOH 2024-03-18 03:58:57 UTC
(In reply to Oleg Oshmyan from comment #31)
> No, I mean fonts that are designed to support a single language. A font
> _can_ provide good support for everything at once via GSUB, but such global
> support isn't all that common and more often you find that you have e.g. a
> Japanese font, a mainland Chinese font and a Hong Kong Chinese font with
> near-identical glyph coverage but different shapes. My impression has been
> that this is (at least part of) why fonts.conf allows specifying what
> languages a font should be preferred for.

Well, that is because Unicode unified similar shapes. We can't fully determine a language from character coverage, particularly for an all-in-one font.

> Indeed, Noto Sans itself is a counterexample. I don't understand why it
> isn't all a single font, but it isn't: instead, we have a myriad of
> language-specific fonts with distinct names.

It isn't that simple. Such an all-in-one font has another problem. As you can see in the file picker in Plasma, character width and height may be too wide or too tall, because they are aligned to the widest or tallest glyph, either for fixed-width (monospace) fonts or simply for better appearance. Also, such fonts won't be recognized as monospace because they contain glyphs of too many different sizes.

> 
> > However, given that X is key codepoint to represent lang L, it can be simplified because:
> >
> > a. font A is missing a glyph X so it doesn't satisfy lang L.
> > b. font B has full coverage for M so it satisfies lang L.
> > c. font B will be picked up if requesting lang L.
> 
> This might be useful when a lang L is requested, but again, the case here
> doesn't request any particular language at all. All we know is that we need
> "the `sans-serif` font".

Yes, that's right. So what's wrong so far? fontconfig returns all sans-serif fonts *as* requested. There is nothing wrong there. The other tasks are up to libass.

>                          We could explicitly fetch the current system locale
> in libass and ask Fontconfig to look for this language, but I guess we kinda
> expected that Fontconfig already does this for substitutions/aliases on its
> own.

No. As I noted, the current usage of the results from FcConfigSubstitute() is completely wrong. Otherwise fontconfig wouldn't need to provide APIs such as FcFontMatch/FcFontSort at all.

Comment 33 Oleg Oshmyan 2024-03-18 15:54:40 UTC
> we can't guess a language from a character coverage completely, particularly if it is all-in-one font.

Yes; that's why glyph-coverage-based language lists aren't as useful as human-set fonts.conf language lists.

Comment 34 Akira TAGOH 2024-03-19 07:22:39 UTC
(In reply to Oleg Oshmyan from comment #33)
> > we can't guess a language from a character coverage completely, particularly if it is all-in-one font.
> 
> Yes; that's why glyph-coverage-based language lists aren't as useful as
> human-set fonts.conf language lists.

Well, I don't mean that.
Actually, they are more useful than checking coverage of a single character, as libass does.

We can't determine how well a font covers a language from one character, but with a set of characters the guess is somewhat better than with a single one. "lang" guarantees minimal character coverage for the language. Perhaps the name "lang" causes some confusion, though.
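
The difference between a single-character check and a set-based "lang" check can be sketched with a toy model. This is not how fontconfig implements it: the exemplar sets below are hypothetical and heavily abbreviated, and the names supported_langs and pick_for_lang are invented for this illustration.

```python
def supported_langs(font_chars, exemplars):
    """Languages whose whole (toy) exemplar set is present in the font."""
    return {lang for lang, required in exemplars.items() if required <= font_chars}


def pick_for_lang(lang, fonts, exemplars):
    """First font, in the given order, whose coverage satisfies lang."""
    for name, chars in fonts:
        if lang in supported_langs(chars, exemplars):
            return name
    return None


# Hypothetical, abbreviated exemplar sets (assumption, not real orthography data).
EXEMPLARS = {
    "en": set("abcxyz"),
    "ar": set("\u0627\u0628\u062a"),  # alef, beh, teh
}

FONT_A = set("abcxyz")                    # Latin only
FONT_B = set("abcxyz\u0627\u0628\u062a")  # Latin plus the Arabic letters above
FONT_C = set("abcxyz\u0627")              # Latin plus a single Arabic letter

FONTS = [("A", FONT_A), ("C", FONT_C), ("B", FONT_B)]
```

In this model font C passes a single-character test for alef ("\u0627" in FONT_C) but does not satisfy "ar", so pick_for_lang("ar", FONTS, EXEMPLARS) skips it and selects font B.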

Comment 35 Aoife Moloney 2024-11-13 11:57:36 UTC
This message is a reminder that Fedora Linux 39 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 39 on 2024-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '39'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 39 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 36 Aoife Moloney 2024-11-27 22:50:31 UTC
Fedora Linux 39 entered end-of-life (EOL) status on 2024-11-26.

Fedora Linux 39 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.

