Bug 1009650 - Some Serbian glyphs seem to be Latin glyphs
Summary: Some Serbian glyphs seem to be Latin glyphs
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: liberation-fonts
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: vishal vijayraghavan
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-09-18 19:25 UTC by Alessandro Ceschini
Modified: 2019-04-23 06:07 UTC
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-04-23 06:07:09 UTC
Type: Bug
Embargoed:


Attachments
Buggy glyphs colored red (93.04 KB, application/pdf)
2014-04-14 10:03 UTC, Alessandro Ceschini
New Attachment with Buggy Glyphs (122.26 KB, application/pdf)
2014-04-23 10:28 UTC, Alessandro Ceschini
pdf to test copying serbian glyphs (119.55 KB, application/pdf)
2014-04-30 09:27 UTC, Pravin Satpute
Newest Attachment with Buggy Glyphs (119.48 KB, application/pdf)
2014-04-30 12:42 UTC, Alessandro Ceschini
TeX Source File (2.11 KB, text/x-tex)
2014-05-02 12:30 UTC, Alessandro Ceschini
Latest test (122.07 KB, application/pdf)
2014-05-12 09:16 UTC, Alessandro Ceschini
Embedded subset liberationsans-bolditalic font with invalid charmap to unicode (2.22 KB, image/png)
2019-04-09 07:23 UTC, vishalvvr
Valid charmap to unicode endpoint in upstream liberationsans-bolditalic-font file (1.59 KB, image/png)
2019-04-09 07:27 UTC, vishalvvr

Description Alessandro Ceschini 2013-09-18 19:25:14 UTC
Hello,

I have a problem with Italic DE and GE: when I copy them from a pdf produced by XeLaTeX, they are pasted respectively as Latin G and I WITH MACRON. I'm puzzled, why does this happen?

By the way, other Serbian glyphs aren't copied at all, but they told me this is XeLaTeX's fault, not font-dependent.

Comment 1 Alessandro Ceschini 2013-09-21 14:51:16 UTC
Actually, Georg Duffner on the XeTeX mailing list says that the glyphs are copied incorrectly, or not copied at all, because of bad glyph naming; see here: http://tug.org/pipermail/xetex/2013-September/024745.html

Comment 2 Georg Duffner 2013-09-21 16:24:23 UTC
There are actually two bugs present in LiberationSerif Italic.

The first concerns Cyrillic ge (U+0433) and de (U+0434), which in the Serbian locl feature are simply substituted with "imacron" (U+012B) and "g" (U+0067). Thus, when copy/pasting from a pdf, these two codepoints will be copied instead of the expected U+0433 and U+0434. To fix this you need to add two unencoded glyphs, insert a reference to imacron and g respectively, and name them uni0433.srb and uni0434.srb (or some other suffix).

The second bug concerns the Serbian be, pe and te. They are named S_BE, S_PE and S_TE. Applications that rely on AGL-conformant glyph names to identify the original codepoints read these names as ligatures of a letter S and a letter BE (or PE or TE). As no glyphs named BE, PE or TE exist in the font, no mapping can be found and copying will fail. To fix this, the glyphs should be renamed to uni0431.srb, uni043F.srb and uni0442.srb (or any other suffix). This might concern other glyphs too (dottediacute should be renamed as well; something like iacute.dotted should work).
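
To illustrate the naming rule described above, here is a minimal Python sketch of how a consumer might derive text from a glyph name. It is a simplification and not the full AGL algorithm: it only understands `uniXXXX` names plus a caller-supplied table of known names (`known_names` is a hypothetical parameter), and it treats any unknown ligature component as a failure, which mirrors the behaviour described in this comment. Real implementations may handle unknown components differently.

```python
import re

def glyph_name_to_text(name, known_names=None):
    """Derive the text for a glyph name, AGL-style; return None if unmappable."""
    known_names = known_names or {}
    # Everything after the first period is a suffix and is ignored.
    base = name.split(".", 1)[0]
    chars = []
    # Underscores separate the components of a ligature name.
    for component in base.split("_"):
        match = re.fullmatch(r"uni((?:[0-9A-F]{4})+)", component)
        if match:
            digits = match.group(1)
            chars.extend(chr(int(digits[i:i + 4], 16))
                         for i in range(0, len(digits), 4))
        elif component in known_names:
            chars.append(known_names[component])
        else:
            # Simplifying assumption: one unknown component fails the lookup.
            return None
    return "".join(chars)

# Hypothetical table of names that exist in the font.
names = {"S": "S", "g": "g", "imacron": "\u012b"}
print(glyph_name_to_text("S_BE", names))         # None: no glyph named BE
print(glyph_name_to_text("uni0431.srb", names))  # Cyrillic be, U+0431
```

With names like uni0431.srb the suffix is dropped and the codepoint is recovered from the name itself, which is why the choice of suffix does not matter.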

Comment 3 Pravin Satpute 2013-09-23 03:45:16 UTC
Agreed, I did not consider this point while adding Serbian support in Liberation. Now, while working on the lohit2 project, I have started taking care of it: http://sourceforge.net/adobe/aglfn/discussion/discussion/thread/f212153e/

During development I try my best to reuse existing glyphs available in the font. But it looks like this will not work here: I need to create separate glyphs for "imacron" (U+012B) and "g" (U+0067) and name them as suggested by Georg Duffner.

I will do this by next weekend.

Comment 4 Pravin Satpute 2013-12-02 08:43:17 UTC
This issue has been fixed in upstream master, will be available with next upstream release.

Comment 5 Alessandro Ceschini 2014-04-05 18:49:10 UTC
(In reply to Pravin Satpute from comment #4)
> This issue has been fixed in upstream master, will be available with next
> upstream release.

Hello Pravin, where can I find this release so that I can test it?

Regards

Comment 6 Pravin Satpute 2014-04-14 06:56:03 UTC
This is the git repo: https://git.fedorahosted.org/git/liberation-fonts.git

Master is Liberation 2.
remotes/origin/liberation-fonts-1_07_3 is for the old Liberation version.

Comment 7 Alessandro Ceschini 2014-04-14 10:03:22 UTC
Created attachment 886080 [details]
Buggy glyphs colored red

Hello Pravin,

I downloaded Liberation 2.0 and tested it. Unfortunately there are still many bugs. The buggy glyphs are colored red.

1. BE consistently looks like a GREEK DELTA
2. Alternative glyph for ITALIC SHA is not implemented
3. Liberation SANS and MONO have slanted glyphs in the Latin range, so italic glyphs should not apply to them.

Moreover, the localized glyphs are impossible to copy/paste from a pdf, only ITALIC GE and DE are pasted but even then they are incorrectly recognized as respectively Latin I WITH MACRON and G. I don't know whether this problem affects only me, please try it yourself.

Regards

Comment 8 Pravin Satpute 2014-04-14 10:20:15 UTC
Please test the ttf files available @ http://pravins.fedorapeople.org/liberation/

Comment 9 Alessandro Ceschini 2014-04-14 10:31:32 UTC
(In reply to Pravin Satpute from comment #8)
> Please test the ttf files available @
> http://pravins.fedorapeople.org/liberation/

I downloaded these ttf files but I saw no apparent change in the situation described above.

Comment 10 Pravin Satpute 2014-04-22 11:43:35 UTC
It really surprises me that the fixed issues have popped up again. No problem.

Last week I released liberation-1.07.4. Let's work on this release and fix all pending bugs.

https://fedorahosted.org/releases/l/i/liberation-fonts/liberation-fonts-ttf-1.07.4.tar.gz

https://fedorahosted.org/releases/l/i/liberation-fonts/liberation-fonts-1.07.4.tar.gz

I tested by creating a PDF locally, and I can now copy Serbian characters properly. Please check and update this bug.

Comment 11 Alessandro Ceschini 2014-04-22 15:06:49 UTC
Hello I've just checked and:

1. BE consistently looks good now
2. Alternative glyph for ITALIC SHA is now implemented
3. I've still got to question the point of having specific Cyrillic Italic glyphs for Liberation SANS, SANS NARROW and MONO, since they have slanted glyphs in the Latin range.

Glyphs that still don't get pasted are highlighted in red.

Regards

Comment 12 Pravin Satpute 2014-04-23 06:01:47 UTC
Glad to see BE and Italic SHA working properly now!

1. I think you missed attachment here.
2. Can you elaborate 3rd point?

Comment 13 Alessandro Ceschini 2014-04-23 10:02:20 UTC
Hello Pravin,

1. I don't understand which attachment you're referring to: I downloaded the 1.07.4 version; isn't this the correct one?

2. Cyrillic Italic glyphs, whether Russian or Serbian-style, are normally implemented only when the font also has a different set of Latin Italic glyphs, that is, mostly in Serif fonts. The bottom line is that it's not consistent to have slanted glyphs in the Latin range and italic glyphs in the Cyrillic range.

Also, don't forget to give me feedback about the glyphs I highlighted in red, which I couldn't get to copy/paste.

Regards

Comment 14 Pravin Satpute 2014-04-23 10:23:08 UTC
(In reply to Alessandro Ceschini from comment #13)
> Hello Pravin,
> 
> 1. I don't understand which attachment you're referring to: I downloaded the
> 1.07.4 version; isn't this the correct one?
The attachment for the glyphs highlighted in red. Now I understand that you had already attached it in an earlier comment.

Can you try to create the PDF again with the latest version? It would be very helpful to get the new PDF as an attachment.

Comment 15 Alessandro Ceschini 2014-04-23 10:28:05 UTC
Created attachment 888846 [details]
New Attachment with Buggy Glyphs

I'm sorry, it's entirely my fault. I'd indeed created it but then I must have forgotten to upload it.

If you are able to edit this thread, the best thing to do would be to attach this to https://bugzilla.redhat.com/show_bug.cgi?id=1009650#c11. It belongs there.

Regards

Comment 16 Pravin Satpute 2014-04-23 10:35:56 UTC
No problem. I will check and update this by tomorrow.

Comment 17 Pravin Satpute 2014-04-30 09:27:04 UTC
Created attachment 891091 [details]
pdf to test copying serbian glyphs

I tested this. There was an issue with Liberation Sans Italic, which is fixed now, and I can copy all shapes properly.

https://git.fedorahosted.org/cgit/liberation-fonts.git/commit/?h=liberation-fonts-1_07_3

Source for this pdf http://pravins.fedorapeople.org/liberation-testing.odt

Comment 18 Pravin Satpute 2014-04-30 09:28:37 UTC
Regarding the other query:
So do you mean we need Serbian shapes only for Liberation Serif?
Sans, Mono and Narrow do not require them?

Comment 19 Alessandro Ceschini 2014-04-30 12:42:44 UTC
Created attachment 891172 [details]
Newest Attachment with Buggy Glyphs

Hello Pravin,

To me, it looks like things have got worse with this newest release :(

I've also spotted that in Liberation Sans Narrow the be may be a little too bold compared to the other letters, but this is just a hunch of mine.

Concerning italic shapes in Sans, Sans Narrow and Mono, yes, since you chose not to implement italic shapes in the Latin Range, I don't see why you should do so in the Cyrillic one.

Regards

Comment 20 Alessandro Ceschini 2014-05-01 22:27:41 UTC
Another remark: the Serbian SHA glyph shouldn't be selected by default, it should rather be accessible as a character variant.

Comment 21 Pravin Satpute 2014-05-02 06:21:30 UTC
Hi Alessandro,

  How about the PDF which I attached in #17?

  It is working fine at my end. I also checked the fix required in the fonts, so everything is named properly now.

  PDF output also depends on the client you choose for creating it.

Comment 22 Alessandro Ceschini 2014-05-02 12:30:33 UTC
Created attachment 891790 [details]
TeX Source File

Hello Pravin,

Yes I can copy/paste all of the glyphs contained in your pdf, but mine is processed through XeLaTeX, which patently has some problems with Liberation Fonts. Hopefully these can be solved, I attached the .tex source file in case you want to test it on your own.

P.S.
Don't forget about https://bugzilla.redhat.com/show_bug.cgi?id=1009650#c20

Regards

Comment 23 Alessandro Ceschini 2014-05-02 12:32:39 UTC
Ah, I forgot: this issue likely boils down to glyph names, because XeLaTeX is known to fail if glyphs aren't properly named.

Comment 24 Pravin Satpute 2014-05-06 10:54:47 UTC
Yeah, I remember we discussed this a long time back.

1. Presently we follow the AGL naming. Serbian glyph names are exactly the same as the Cyrillic letter names, with .alt1 appended at the end.

2. Presently we have added Serbian shapes in all Liberation variants; as you said, they are only needed for Serif. Can you report a bug for removing the Serbian characters from Sans, Mono and Narrow? Later on it will simplify things, since we will only need to deal with Liberation Serif.

3. I will create a trial font for "Serbian SHA" with the cv01 feature tag and provide it to you for testing.

Comment 25 Alessandro Ceschini 2014-05-06 13:13:58 UTC
Hello Pravin

1. It's strange then: I cannot see why some letters work and others don't. By the way, it should be .locl, not .alt1, because these are localized glyphs, not alternative ones.

2. OK, I shall do so.

3. OK, let me know when you're ready so I can test it.

Comment 26 Georg Duffner 2014-05-07 05:52:29 UTC
@Alessandro it doesn’t matter if they are named .locl, .alt1 or .elephant as long as what’s before the . is consistently named. The problem must be somewhere else.

Comment 27 Georg Duffner 2014-05-08 05:50:56 UTC
The reason why some glyphs still can’t be copied, or are copied as some strange glyphs, is that they are encoded in the PUA instead of having no encoding. I think the only relevant one is the Liberation Serif Bold Italic Serbian ghe. The other serif fonts don’t show problems. And the problems in the remaining fonts would be solved once the lookup there is removed, as Alessandro suggests.
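
As a rough way to spot this symptom in pasted text, the following Python sketch lists characters that fall in a Private Use Area. It only inspects text that has already been copied out of a PDF; the sample string is invented for illustration, and the function name `find_private_use` is hypothetical.

```python
import unicodedata

def find_private_use(text):
    """Return (index, code point) pairs for Private Use Area characters."""
    # General category "Co" covers U+E000..U+F8FF and planes 15 and 16.
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if unicodedata.category(ch) == "Co"]

# Invented sample: Cyrillic ge, one PUA character, Cyrillic de.
pasted = "\u0433\ue000\u0434"
print([(i, hex(cp)) for i, cp in find_private_use(pasted)])  # [(1, '0xe000')]
```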

By the way, also the fi and fl ligatures should be copied to an unencoded slot which would be used by the liga feature as the encoded ligatures are legacy encodings and shouldn’t be used by lookups.

Comment 28 Pravin Satpute 2014-05-08 09:37:08 UTC
Hi Georg, 

  Thanks for the analysis and for providing a solution. I will implement it and do a release soon.

Comment 29 Pravin Satpute 2014-05-09 11:49:06 UTC
Now ghe also has no Unicode value assigned. I also removed the Serbian localized characters from Sans, Mono and Narrow.

Fonts for testing are available @ http://pravins.fedorapeople.org/export/

Let me know if there are any more issues.

The fi and fl ligatures have assigned values in Unicode. See http://www.fileformat.info/info/unicode/char/fb01/index.htm

So it should be fine.

SHA still remains. Do you have any font for reference for SHA?

Comment 30 Georg Duffner 2014-05-09 21:42:41 UTC
Of course there are the correct shapes at U+FB01 and U+FB02. However, these codepoints are legacy encodings that shouldn’t be used in lookups like liga. In fact, it’s recommended not to substitute encoded glyphs with other encoded ones.

The problem in the actual case of fi and fl is that if you use these codepoints, users might run into problems in documents that contain them instead of single letters or unencoded substitution glyphs, e.g. when searching the document: try to search this page for the word “firefly”, you won’t find any occurrence. Also, screen readers used by visually impaired people might stumble upon such glyphs.
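
The search problem can be reproduced with a short Python sketch. It assumes the extracted text contains the encoded ligatures U+FB01 and U+FB02, and shows that a plain substring search misses the word unless the text is first folded with compatibility normalization (NFKC). The helper name `searchable` is hypothetical.

```python
import unicodedata

def searchable(text):
    """Fold compatibility characters such as U+FB01 into plain letters."""
    return unicodedata.normalize("NFKC", text)

# "firefly" as it would be extracted if the encoded ligatures were used.
extracted = "\ufb01re\ufb02y"

print("firefly" in extracted)              # False
print("firefly" in searchable(extracted))  # True
```

Not every application normalizes text before searching, so relying on this step is an assumption about the consumer, not a guarantee.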

Comment 31 Pravin Satpute 2014-05-12 08:15:03 UTC
Yes, I do understand that. I hope Unicode normalization handles it correctly.

From the Liberation fonts perspective, we don't have any lookup for these ligatures, possibly for compatibility with Arial.

Comment 32 Alessandro Ceschini 2014-05-12 09:16:08 UTC
Created attachment 894628 [details]
Latest test

Hello Pravin

Here's the new test file. A few remarks:

1. Serif is 100% all right now.
2. Sans, Sans Narrow and Mono all lack the localized glyph for BE: this is not an italic glyph so there's no reason to remove it.
3. You should remove italic glyphs from the Cyrillic range of Sans and Sans Narrow altogether. At present you just deleted the Serbian localized glyphs.

Regards

Comment 33 Alessandro Ceschini 2014-05-12 09:27:48 UTC
(In reply to Pravin Satpute from comment #29)
> Now ghe also has no Unicode value assigned. I also removed the Serbian
> localized characters from Sans, Mono and Narrow.
> 
> Fonts for testing are available @ http://pravins.fedorapeople.org/export/
> 
> Let me know if there are any more issues.
> 
> The fi and fl ligatures have assigned values in Unicode. See
> http://www.fileformat.info/info/unicode/char/fb01/index.htm
> 
> So it should be fine.
> 
> SHA still remains. Do you have any font for reference for SHA?

FreeSerif implements it as a character variant, and this is the only free font I'm aware of with full-fledged support for Serbian. Then there are commercial fonts, but I don't use them so I cannot give you any information about them.

Comment 35 Fedora End Of Life 2015-01-09 22:35:44 UTC
This message is a notice that Fedora 19 is now at end of life. Fedora 
has stopped maintaining and issuing updates for Fedora 19. It is 
Fedora's policy to close all bug reports from releases that are no 
longer maintained. Approximately 4 (four) weeks from now this bug will
be closed as EOL if it remains open with a Fedora 'version' of '19'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 19 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 36 Fedora End Of Life 2015-02-18 11:13:59 UTC
Fedora 19 changed to end-of-life (EOL) status on 2015-01-06. Fedora 19 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 37 Pravin Satpute 2015-02-18 11:16:08 UTC
I have a Liberation release pending with improvements. Hoping to do it soon.

Comment 38 Jaroslav Reznik 2015-02-19 14:01:09 UTC
(In reply to Pravin Satpute from comment #37)
> I have a Liberation release pending with improvements. Hoping to do it soon.

Could you please change version to the release you'd like to release in? Fedora 19 is now EOL, so no updates are possible. Thanks.

Comment 39 Fedora End Of Life 2016-07-19 10:24:39 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 40 Pravin Satpute 2016-07-19 11:20:26 UTC
Part of this is already fixed upstream. It needs to be applied downstream with a release.

Comment 41 Krasnaya Ploshchad’ 2017-06-23 03:57:40 UTC
A sample of Serbian glyphs is available here:
https://en.wikipedia.org/wiki/File:Special_Cyrillics.png

Comment 42 Fedora End Of Life 2017-07-25 18:35:11 UTC
This message is a reminder that Fedora 24 is nearing its end of life.
Approximately 2 (two) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 24. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '24'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 24 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 43 Fedora End Of Life 2017-08-08 11:43:21 UTC
Fedora 24 changed to end-of-life (EOL) status on 2017-08-08. Fedora 24 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 44 Pravin Satpute 2017-08-09 07:25:20 UTC
This is still not fixed. I am moving it to Rawhide now.

Comment 45 Jan Kurik 2017-08-15 09:01:02 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle.
Changing version to '27'.

Comment 46 sachin 2018-07-19 12:21:41 UTC
Reproducible in F28

Comment 47 vishalvvr 2019-04-09 07:23:40 UTC
Created attachment 1553795 [details]
Embedded subset liberationsans-bolditalic font with invalid charmap to unicode

Comment 48 vishalvvr 2019-04-09 07:27:12 UTC
Created attachment 1553796 [details]
Valid charmap to unicode endpoint in upstream liberationsans-bolditalic-font file

Comment 49 vishalvvr 2019-04-09 07:42:14 UTC
Hello,

First of all, sorry for such a delayed reply.
This does not seem to be a liberation-fonts issue; it is a PDF font-subsetting issue.
Let me explain: for document portability, the PDF standard allows embedding a subset of a font[1],
which means that instead of the entire font file, only the glyphs/characters used in the document are embedded.
This reduces the PDF file size and also means the font does not need to be installed on the system to view the document (for portability).

When embedding a font subset, whichever tool you use parses the entire font file and dumps the needed glyphs into a temporary font file in no particular order,
i.e. with an improper character-to-Unicode mapping, as shown in the image attachment "Embedded subset liberationsans-bolditalic font with invalid charmap to unicode",
whereas a valid character map should look like the image attachment "Valid charmap to unicode endpoint in upstream liberationsans-bolditalic-font file".

Because the tool you use to create/convert the PDF embeds a subset of the Liberation font, the characters are mapped to invalid Unicode points, so when you copy the text you get different characters.

You can try it yourself:
1. Open your test attachment file using FontForge.
2. Choose any font from the list shown.
3. Check whether all characters are mapped to valid Unicode points.
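
The same kind of check can be sketched in Python for a PDF's ToUnicode CMap, which is the table many viewers consult when text is copied. This is a minimal sketch under stated assumptions: it only reads `beginbfchar` ... `endbfchar` sections (not `bfrange`), the CMap text is assumed to be already extracted and decompressed, and the sample CMap, glyph codes and expected values are invented for illustration. The function names are hypothetical.

```python
import re

def parse_bfchar(cmap_text):
    """Return {glyph code: text} from the bfchar sections of a ToUnicode CMap."""
    mapping = {}
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        pairs = re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", section)
        for src, dst in pairs:
            # Destination strings are hex-encoded UTF-16BE.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping

def mismatches(mapping, expected):
    """Return {glyph code: (actual, wanted)} for entries that differ."""
    return {code: (mapping.get(code), want)
            for code, want in expected.items()
            if mapping.get(code) != want}

# Invented sample: glyph 1 should be Cyrillic ge but maps to imacron.
sample = """
2 beginbfchar
<0001> <012B>
<0002> <0434>
endbfchar
"""
expected = {1: "\u0433", 2: "\u0434"}
print(mismatches(parse_bfchar(sample), expected))
```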


[1] https://publicatorcommunity.zmags.com/hc/en-us/articles/115002482043-Difference-between-full-embedded-fonts-and-embedded-subset

Comment 50 vishalvvr 2019-04-23 06:07:09 UTC
Closing this bug as not a bug. If you still feel this needs to be fixed, please feel free to reopen it.