Bug 825081

Summary: [kn_IN] Lohit Kannada font does not properly handle vowel signs in consonant clusters
Product: [Fedora] Fedora
Component: lohit-kannada-fonts
Version: 23
Hardware: Unspecified
OS: Unspecified
Status: ASSIGNED
Severity: unspecified
Priority: unspecified
Reporter: Shriramana Sharma <samjnaa>
Assignee: Pravin Satpute <psatpute>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: fonts-bugs, i18n-bugs, psatpute
Keywords: i18n
Doc Type: Bug Fix
Type: Bug
Attachments (Description / Flags):
ODT and PDF for test-case / none
Updated ODT and PDF for test-case / none
Lohit Kannada on W8 / none
Lohit from master on Fedora 18 / none
Script to produce mostly valid test examples / none
ODT and PDF of random 2-consonant clusters / none

Description Shriramana Sharma 2012-05-25 00:26:39 EDT
Created attachment 586752 [details]
ODT and PDF for test-case

Description of problem:

This seems to be a resurfacing of bug #223971 but since I saw no way to reopen that bug (sorry if I'm wrong) I'm reporting this again.

Version-Release number of selected component (if applicable):

2.5.1

How reproducible:

In a word processor, select Lohit Kannada font and input Kannada Unicode sequences having consonant clusters of the format CCV. I have attached a sample ODT.
  
Actual results:

Except in a few cases of "popular" consonant clusters like K.SSA ಕ್ಷ and J.NYA ಜ್ಞ, the vowel signs are not attached properly.

When the same text is rendered with other fonts (like Tunga of Microsoft even loaded into Linux's LibreOffice) the sequences are shown properly without overlaps or malformed glyphs.

Expected results:

The Kannada language uses lots of Sanskrit-based words and hence has many consonant clusters with two and even three consonants. Also, when English words such as ಎಕ್ಸ್ಪ್ಲೋರ್ (explore) are transliterated into Kannada script for sign-boards and the like, even more consonant clusters will occur. In all these cases Lohit Kannada should be able to gracefully handle such consonant clusters and not output overlapping or malformed glyphs.

Additional info:

I have attached an ODT and PDF file demonstrating the problem.
Comment 1 A S Alam 2012-05-25 01:44:05 EDT
Test case from the attachment (for easy copy/paste):
---
Tunga:
ಕ್ಷಾ ಕ್ಷಿ ಕ್ಷೀ ಕ್ಷು ಕ್ಷೂ ಕ್ಷೃ ಕ್ಷೄ ಕ್ಷೆ ಕ್ಷೇ ಕ್ಷೈ ಕ್ಷೊ ಕ್ಷೋ ಕ್ಷೌ ಕ್ಷ್
ಚ್ಚಾ ಚ್ಚಿ ಚ್ಚೀ ಚ್ಚೃ ಚ್ಚೄ ಚ್ಚು ಚ್ಚೂ ಚ್ಚೆ ಚ್ಚೇ ಚ್ಚೈ ಚ್ಚೊ ಚ್ಚೋ ಚ್ಚ್
ಡ್ಡಾ ಡ್ಡಿ ಡ್ಡೀ ಡ್ಡು ಡ್ಡೂ ಡ್ಡೃ ಡ್ಡೆ ಡ್ಡೇ ಡ್ಡೈ ಡ್ಡೊ ಡ್ಡೋ ಡ್ಡೌ ಡ್ಡ್
---

Can we get Unicode points for such characters?
Comment 2 Shriramana Sharma 2012-05-25 02:17:03 EDT
UniView (http://rishida.net/scripts/uniview/) or BabelMap (http://www.babelstone.co.uk/software/babelmap.html) is your friend. 

(I have also made a small correction to the sample text.)

ಕ್ಷಾ ಕ್ಷಿ ಕ್ಷೀ ಕ್ಷು ಕ್ಷೂ ಕ್ಷೃ ಕ್ಷೄ ಕ್ಷೆ ಕ್ಷೇ ಕ್ಷೈ ಕ್ಷೊ ಕ್ಷೋ ಕ್ಷೌ ಕ್ಷ್
ಚ್ಚಾ ಚ್ಚಿ ಚ್ಚೀ ಚ್ಚು ಚ್ಚೂ ಚ್ಚೃ ಚ್ಚೄ ಚ್ಚೆ ಚ್ಚೇ ಚ್ಚೈ ಚ್ಚೊ ಚ್ಚೋ ಚ್ಚೌ ಚ್ಚ್
ಡ್ಬಾ ಡ್ಬಿ ಡ್ಬೀ ಡ್ಬು ಡ್ಬೂ ಡ್ಬೃ ಡ್ಬೄ ಡ್ಬೆ ಡ್ಬೇ ಡ್ಬೈ ಡ್ಬೊ ಡ್ಬೋ ಡ್ಬೌ ಡ್ಬ್

\u0C95\u0CCD\u0CB7\u0CBE \u0C95\u0CCD\u0CB7\u0CBF \u0C95\u0CCD\u0CB7\u0CC0 \u0C95\u0CCD\u0CB7\u0CC1 \u0C95\u0CCD\u0CB7\u0CC2 \u0C95\u0CCD\u0CB7\u0CC3 \u0C95\u0CCD\u0CB7\u0CC4 \u0C95\u0CCD\u0CB7\u0CC6 \u0C95\u0CCD\u0CB7\u0CC7 \u0C95\u0CCD\u0CB7\u0CC8 \u0C95\u0CCD\u0CB7\u0CCA \u0C95\u0CCD\u0CB7\u0CCB \u0C95\u0CCD\u0CB7\u0CCC \u0C95\u0CCD\u0CB7\u0CCD

\u0C9A\u0CCD\u0C9A\u0CBE \u0C9A\u0CCD\u0C9A\u0CBF \u0C9A\u0CCD\u0C9A\u0CC0 \u0C9A\u0CCD\u0C9A\u0CC1 \u0C9A\u0CCD\u0C9A\u0CC2 \u0C9A\u0CCD\u0C9A\u0CC3 \u0C9A\u0CCD\u0C9A\u0CC4 \u0C9A\u0CCD\u0C9A\u0CC6 \u0C9A\u0CCD\u0C9A\u0CC7 \u0C9A\u0CCD\u0C9A\u0CC8 \u0C9A\u0CCD\u0C9A\u0CCA \u0C9A\u0CCD\u0C9A\u0CCB \u0C9A\u0CCD\u0C9A\u0CCC \u0C9A\u0CCD\u0C9A\u0CCD

\u0CA1\u0CCD\u0CAC\u0CBE \u0CA1\u0CCD\u0CAC\u0CBF \u0CA1\u0CCD\u0CAC\u0CC0 \u0CA1\u0CCD\u0CAC\u0CC1 \u0CA1\u0CCD\u0CAC\u0CC2 \u0CA1\u0CCD\u0CAC\u0CC3 \u0CA1\u0CCD\u0CAC\u0CC4 \u0CA1\u0CCD\u0CAC\u0CC6 \u0CA1\u0CCD\u0CAC\u0CC7 \u0CA1\u0CCD\u0CAC\u0CC8 \u0CA1\u0CCD\u0CAC\u0CCA \u0CA1\u0CCD\u0CAC\u0CCB \u0CA1\u0CCD\u0CAC\u0CCC \u0CA1\u0CCD\u0CAC\u0CCD

BTW what will you do with the codepoints?
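
For reference, test rows like the ones above can be rebuilt from code points. The following is only a minimal Python sketch, not part of any attachment; the helper name cluster_row is made up for illustration. It assumes the row layout used above: a two-consonant cluster followed by each of the 13 vowel signs and finally the virama.

```python
# Vowel signs in the order used by the test rows above, with the virama last.
SIGNS = [0x0CBE, 0x0CBF, 0x0CC0, 0x0CC1, 0x0CC2, 0x0CC3, 0x0CC4,
         0x0CC6, 0x0CC7, 0x0CC8, 0x0CCA, 0x0CCB, 0x0CCC, 0x0CCD]
VIRAMA = 0x0CCD


def cluster_row(c1, c2):
    """Return the test strings for the cluster C1 + VIRAMA + C2 (hypothetical helper)."""
    base = chr(c1) + chr(VIRAMA) + chr(c2)
    return [base + chr(sign) for sign in SIGNS]


# KA (U+0C95) + SSA (U+0CB7), i.e. the K.SSA row.
print(" ".join(cluster_row(0x0C95, 0x0CB7)))
# CA (U+0C9A) + CA, i.e. the second row.
print(" ".join(cluster_row(0x0C9A, 0x0C9A)))
```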
Comment 3 Pravin Satpute 2012-05-29 07:21:24 EDT
http://pravins.fedorapeople.org/Lohit-Kannada.ttf
I just tried ಕ್ಷೃ and it works fine in GNOME with Pango. Please test it and let me know if it works on Windows,
so that I can fix the other combinations in the same way.

Thanks.
Comment 4 Shriramana Sharma 2012-05-29 13:06:23 EDT
(In reply to comment #3)
> I just tried ಕ್ಷೃ 

I *told* you it works for ಕ್ಷೃ because for ಕ್ಷೃ ಜ್ಞ and two other consonant clusters you have included separate glyphs with all vowel sign combinations in the font. The point was for combinations *other* than these which can very well occur when writing especially Sanskrit-based words in Kannada script.

But even with ಕ್ಷೃ the rendering is not consistent in Linux. See for ಕ್ಷು ಕ್ಷೂ the SSA-vattu is pushed to the right, whereas it is right below the KA in other vowels due to the existence of pre-composed glyphs with vowel sign. And for ಕ್ಷೌ it is also pushed to the right despite the existence of the precomposed-glyph.

(Maybe your rendering in Pango/Fedora is different from what I am seeing in LibreOffice/Kubuntu -- please post a PDF.)

> it works fine in GNOME with Pango. Please test it and
> let me know if it works on Windows,

LOL I'm not only using it on Windows but also on my Kubuntu Linux installation. 

On LibreOffice 3.5 on Kubuntu 12.04, the sequences still do NOT render properly. The PDF I had given you was produced by LibreOffice on Linux only and not from Windows.

> so that I can fix the other combinations in the same way.

Let me just list the required rendering so you can take care of how it is to be achieved:

Basically you have four consonant clusters with precomposed glyphs, i.e. K.SSA, J.NYA, D.DA and L.LA. Of these, except for vowel signs U, UU and vocalic R, RR, L, LL, combinations with all other vowel signs are precomposed.

1) In the case of these precomposed glyphs:

1a) If VS = vowel sign of U/UU or one of the vocalic vowels,

    C1 + VIR + C2 + VS --> C1_WITH_C2_VATTU VS

1b) If VS is another vowel sign or the virama

    C1 + VIR + C2 + VS --> C1_WITH_C2_VATTU_WITH_VS

2) In the case of the consonant clusters *without* precomposed glyphs

2a) If VS = vowel sign of U/UU or one of the vocalic vowels,

    C1 + VIR + C2 + VS --> C1 VS C2_VATTU

2b) If VS is another vowel sign or the virama

    C1 + VIR + C2 + VS --> C1_WITH_VS C2_VATTU
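
The four rules above can be sketched as a small lookup. This is only an illustrative Python sketch: the glyph names it returns follow the notation of the rules and are not the actual glyph names in the font, the function name glyph_sequence is made up, and the cluster labels are copied from the list of precomposed clusters given above.

```python
# Clusters that have precomposed glyphs, labelled as in the text above.
PRECOMPOSED = {("K", "SSA"), ("J", "NYA"), ("D", "DA"), ("L", "LA")}

# Vowel signs that stay separate glyphs (cases 1a and 2a).
SEPARATE_SIGNS = {"U", "UU", "VOCALIC_R", "VOCALIC_RR", "VOCALIC_L", "VOCALIC_LL"}


def glyph_sequence(c1, c2, vs):
    """Map C1 + VIR + C2 + VS to a list of symbolic glyph names.

    vs is a vowel-sign label such as "AI", or "VIRAMA" for a final virama.
    """
    sign_is_separate = vs in SEPARATE_SIGNS
    if (c1, c2) in PRECOMPOSED:
        conjunct = f"{c1}_WITH_{c2}_VATTU"
        if sign_is_separate:
            return [conjunct, vs]                  # rule 1a
        return [f"{conjunct}_WITH_{vs}"]           # rule 1b
    if sign_is_separate:
        return [c1, vs, f"{c2}_VATTU"]             # rule 2a
    return [f"{c1}_WITH_{vs}", f"{c2}_VATTU"]      # rule 2b


print(glyph_sequence("K", "SSA", "UU"))
print(glyph_sequence("K", "SSA", "AI"))
print(glyph_sequence("C", "CA", "VOCALIC_R"))
print(glyph_sequence("C", "CA", "VIRAMA"))
```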
Comment 5 Pravin Satpute 2012-05-30 01:52:57 EDT
(In reply to comment #2)
> 
> BTW what will you do with the codepoints?

Sometimes we get bugs where the test string is only available in a PDF or image, and it takes some time to decode the basic composition.

So it helps someone like me, who does not know the language, to select characters in FontForge by Unicode value for testing.

Thanks for the link to Richard Ishida's tool; it is very useful. I use Python to decode text to Unicode.
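
Decoding text to code points in Python can be done with the standard unicodedata module. The following is only a minimal sketch of that idea, not the exact script used here; the function name describe is made up.

```python
import unicodedata


def describe(text):
    """Yield (code point, character name) for every character in text."""
    for ch in text:
        yield "U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")


# K.SSA with vowel sign vocalic R, written with escapes to keep the input unambiguous.
for code, name in describe("\u0C95\u0CCD\u0CB7\u0CC3"):
    print(code, name)
```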
Comment 6 Pravin Satpute 2012-12-28 04:35:21 EST
I have fixed the first row and committed it upstream.
ಕ್ಷಾ ಕ್ಷಿ ಕ್ಷೀ ಕ್ಷು ಕ್ಷೂ ಕ್ಷೃ ಕ್ಷೄ ಕ್ಷೆ ಕ್ಷೇ ಕ್ಷೈ ಕ್ಷೊ ಕ್ಷೋ ಕ್ಷೌ ಕ್ಷ್

Tunga renders the test cases below the same way as Lohit. What is the ideal rendering: should the below-base part sit exactly below the base consonant?

2. ಚ್ಚಾ ಚ್ಚಿ ಚ್ಚೀ ಚ್ಚೃ ಚ್ಚೄ ಚ್ಚು ಚ್ಚೂ ಚ್ಚೆ ಚ್ಚೇ ಚ್ಚೈ ಚ್ಚೊ ಚ್ಚೋ ಚ್ಚ್
3. ಡ್ಡಾ ಡ್ಡಿ ಡ್ಡೀ ಡ್ಡು ಡ್ಡೂ ಡ್ಡೃ ಡ್ಡೆ ಡ್ಡೇ ಡ್ಡೈ ಡ್ಡೊ ಡ್ಡೋ ಡ್ಡೌ ಡ್ಡ್
---
Comment 7 Shriramana Sharma 2013-01-06 05:41:49 EST
Created attachment 673323 [details]
Updated ODT and PDF for test-case

Hello Pravin. Sorry I couldn't reply earlier. 

I have now revisited the problem with the latest[*] Lohit-Kannada using LibreOffice 3.6.3 on Windows XP and 3.5.4 on Kubuntu Precise, but the problem remains the same. [* = At the present instant it says "up-to-date" -- I'm sorry I'm not sure how to ask git for the commit id.]

I hope the updated ODT and PDFs which I am attaching now should make the problem in question clearer as I have marked the problematic clusters from Lohit Kannada in red as against the correct Tunga rendering in green:

1) In the case of consonant cluster + vowel sign vocalic R/RR / vowel sign AI the descender of the vowel sign is overlapping with the below-base conjunct consonant.

2) In case of consonant cluster + virama, the virama is not properly joined to the above base consonant. 

3) In Linux, the problem does not occur only in the case of K.SSAI and K.SS.VIRAMA (blue in the Linux PDF), but in the case of the vowel signs of the vocalic vowels the problem remains.
Comment 8 Shriramana Sharma 2013-01-06 06:16:46 EST
Sorry for extra post:

As clarification to point 3 in my previous comment:

In Linux, 

1) In the case of the four clusters with separate conjunct glyphs in the font i.e. K.SSA, J.NYA, D.DA and L.LA, AI and Virama are attaching properly (without overlap) but vocalic R/RR are still overlapping with the below-base consonant.

2) In the case of all other clusters, AI and virama are also not attaching properly.

On Windows XP (sorry I am not able to test on Windows 7):

None of the clusters have AI and virama or vocalic R/RR attaching properly.
Comment 9 Pravin Satpute 2013-01-07 12:33:50 EST
Thanks Shriramana for more details.

I have committed some more fixes in git. The latest Lohit TTF is available at http://pravins.fedorapeople.org/Lohit-Kannada.ttf , please test and let me know if there are more combinations.

From the sample file, the remaining ones are ಚ್ಚ್  & ಡ್ಡ್ ; they are a bit tricky, but I will fix them soon.
Comment 10 Pravin Satpute 2013-01-08 02:55:51 EST
Created attachment 674575 [details]
Lohit Kannada on W8
Comment 11 Pravin Satpute 2013-01-08 03:26:14 EST
Created attachment 674589 [details]
Lohit from master on Fedora 18
Comment 12 Pravin Satpute 2013-01-08 03:27:22 EST
 ಚ್ಚ್  & ಡ್ಡ್  work fine in Windows but not with HarfBuzz. I have reported a bug at https://bugs.freedesktop.org/show_bug.cgi?id=59118 ; let's see if we can get the same rendering with HarfBuzz as well.
Comment 13 Pravin Satpute 2013-01-09 03:01:54 EST
Great, Behdad fixed this issue in HarfBuzz. Now ಚ್ಚ್  & ಡ್ಡ್   work well with other Kannada fonts as well.

So as far as I can see, no more combinations in the provided list need fixing.
Comment 14 Shriramana Sharma 2013-01-09 12:36:57 EST
Created attachment 675746 [details]
Script to produce mostly valid test examples

Hello Pravin and Behdad and thanks for the quick work. I now am attaching a small Python3 script to produce any required number of test cases which are phonologically valid (and to a great extent can really occur) for Indic languages, esp. Sanskrit. You can use it to produce more test cases if you want.
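
The attached script itself is not reproduced in this report. As a rough idea only, a much simpler generator of random two-consonant test clusters could look like the Python sketch below. It is an assumption-laden stand-in, not the attachment: it picks consonants purely at random with no phonological filtering, and the name random_clusters and its seed parameter are made up.

```python
import random

# Kannada consonants: U+0C95..U+0CA8, U+0CAA..U+0CB3, U+0CB5..U+0CB9, plus U+0CDE.
CONSONANTS = [
    chr(cp)
    for block in (range(0x0C95, 0x0CA9), range(0x0CAA, 0x0CB4), range(0x0CB5, 0x0CBA))
    for cp in block
] + [chr(0x0CDE)]

VIRAMA = chr(0x0CCD)

# Dependent vowel signs, with the virama as a possible final sign.
SIGNS = [chr(cp) for cp in (0x0CBE, 0x0CBF, 0x0CC0, 0x0CC1, 0x0CC2, 0x0CC3, 0x0CC4,
                            0x0CC6, 0x0CC7, 0x0CC8, 0x0CCA, 0x0CCB, 0x0CCC, 0x0CCD)]


def random_clusters(count, seed=None):
    """Return count random strings of the form C1 + VIRAMA + C2 + sign."""
    rng = random.Random(seed)
    return [
        rng.choice(CONSONANTS) + VIRAMA + rng.choice(CONSONANTS) + rng.choice(SIGNS)
        for _ in range(count)
    ]


# Tab-separated, so the output can be pasted into a document for testing.
print("\t".join(random_clusters(20)))
```

Passing a fixed seed gives the same list on every run, which makes it easier to refer to a specific failing cluster.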

@Pravin: Even now the vowel sign for vocalic R/RR is slightly overlapping in case of K.SSA. Can you do a contextual kerning of the vowel sign when the below-base consonant will collide with it?
Comment 15 Pravin Satpute 2013-01-10 03:07:30 EST
(In reply to comment #14)
> Created attachment 675746 [details]
> Script to produce mostly valid test examples
> 
> Hello Pravin and Behdad and thanks for the quick work. I now am attaching a
> small Python3 script to produce any required number of test cases which are
> phonologically valid (and to a great extent can really occur) for Indic
> languages, esp. Sanskrit. You can use it to produce more test cases if you
> want.

ಶ್ರು	ಕ್ತೄ	ಠ್ಳೊ	ಚ್ಥೇ	ಟ್ವೀ	ಪ್ರೃ	ಷ್ಳೀ	ಸ್ರಾ	ನ್ಠಾ	ಬ್ೞ್	ಱ್ೞೌ	ನ್ಖ್	ಯ್ಱಾ	ಗ್ಧೊ	ಣ್ಕೄ	ಫ್ಲು	ಟ್ಚೊ	ಭ್ಱೀ	ದ್ಜಾ	ಹ್ಶೌ	ದ್ಝೀ	ಯ್ಳೃ	ಗ್ದೄ	ೞ್ಲಿ	ಪ್ಚೄ	ಘ್ರಿ	ಮ್ನಾ	ಙ್ಖೈ	ಲ್ರೌ	ನ್ತಾ	ತ್ಕೀ	ಟ್ಕೈ	ಳ್ಲೊ	ರ್ಯೇ	ಮ್ದೃ	ೞ್ಳ್	ಷ್ಯೃ	ಸ್ಸೇ	ಜ್ಬೄ	ಡ್ಝೃ	ತ್ಛೇ	ಯ್ಯೂ	ಶ್ಳೈ	ಡ್ಗೌ	ಗ್ಝೃ	ಸ್ಶಾ	ಬ್ಭೆ	ಢ್ಳೂ	ಸ್ಸೇ	ಜ್ಭ್	ಣ್ಯೊ	ಗ್ಬೋ	ಜ್ಝೈ	ಸ್ಷೆ	ದ್ಜೋ	ಞ್ಗೇ	ಪ್ಖೌ	ಱ್ಳೆ	ಶ್ಲೋ	ವ್ೞೄ	ಚ್ಟೃ	ಗ್ಜೇ	ಟ್ಛಾ	ಷ್ರೃ	ಷ್ಳೌ	ಙ್ಥಾ	ಲ್ವೈ	ಚ್ವೀ	ಲ್ಱೃ	ಷ್ಶೈ	ಶ್ಯೂ	ಱ್ಳಾ	ಹ್ೞು	ಹ್ಸೀ	ಷ್ಳೇ	ಷ್ಸೊ	ಮ್ಡಾ	ಣ್ಚೆ	ಷ್ಷೇ	ಕ್ತೇ	ರ್ರೊ	ಸ್ಸೈ	ಜ್ೞೆ	ಫ್ಱಾ	ಜ್ಬೀ	ಡ್ಘೊ	ದ್ರೇ	ಛ್ರೇ	ಙ್ಱ್	ಳ್ಱ್	ತ್ತೂ	ಹ್ಶೄ	ಡ್ಗಾ	ಹ್ಹೆ	ತ್ತೇ	ಚ್ಪ್	ಚ್ತೇ	ಞ್ಣೂ	ಪ್ತು	ಟ್ಛೌ	

These are the combinations generated by your script. They work well with Lohit Kannada; let me know if you find any problem. Test with http://pravins.fedorapeople.org/Lohit-Kannada.ttf

> 
> @Pravin: Even now the vowel sign for vocalic R/RR is slightly overlapping in
> case of K.SSA. Can you do a contextual kerning of the vowel sign when the
> below-base consonant will collide with it?

Fixed it.
Comment 16 Shriramana Sharma 2013-01-16 11:40:53 EST
Created attachment 679704 [details]
ODT and PDF of random 2-consonant clusters

Hi Pravin -- every time you run the script it will produce new examples. And right now I have tested on my Windows XP LibO 3.6.4 and Kubuntu Precise LibO 3.5.4 and still many clusters are not rendering correctly. I have colour-coded them in the attached doc. Is it that in HB trunk they are rendering correctly?
Comment 17 Pravin Satpute 2013-01-16 12:51:49 EST
Yes, I am testing with the latest HarfBuzz release, not trunk. I will check again with the new test cases you provided.
Comment 18 Fedora End Of Life 2013-01-16 19:13:24 EST
This message is a reminder that Fedora 16 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 16. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '16'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 16's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 16 is end of life. If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora, you are encouraged to click on 
"Clone This Bug" and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 19 Fedora End Of Life 2013-12-21 10:02:48 EST
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
Comment 20 Jan Kurik 2015-07-15 11:08:26 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 23 development cycle.
Changing version to '23'.

(As we did not run this process for some time, it could affect also pre-Fedora 23 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 23 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora23