Bug 476427

Summary: [te_IN] - Consonant+Virama+Consonant+Virama+space renders the second virama as a separate glyph in lohit-telugu font
Product: [Fedora] Fedora
Reporter: Arjuna Rao Chavala <arjunaraoc>
Component: lohit-fonts
Assignee: Rahul Bhalerao <b.rahul.pm>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 10
CC: aalam, b.rahul.pm, bugzillas+padREMOVETHISdu, fonts-bugs, i18n-bugs, petersen
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-03-05 14:32:13 UTC
Attachments (description, flags):
The screen shot of the problematic telugu script (flags: none)
Debug statements added to my local version of pango (flags: none)
Debug messages produced by my local version of pango (flags: none)
untested patch for Lohit-Telugu.sfd (flags: none)

Description Arjuna Rao Chavala 2008-12-14 15:21:44 UTC
Consonant+Virama+[Consonant+Virama]+space renders the second virama as a separate glyph in the lohit-telugu font. With the Pothana2000 font, the rendering is correct.
More details are in this GNOME bug report:
http://bugzilla.gnome.org/show_bug.cgi?id=516947

Comment 1 A S Alam 2008-12-15 06:00:45 UTC
Changing Product to Fedora

Comment 2 Ravi Chandra 2008-12-30 13:05:40 UTC
Created attachment 327959 [details]
The screen shot of the problematic telugu script

See the attached JPEG image to see when the problem occurs. It seems to appear when the user types the consonant+consonant+^ key combination.

Comment 3 Padmanabhan V. K. 2009-01-06 22:39:37 UTC
Bug 318071 deals with the same problem, but for Lohit Kannada. That bug was closed with the comment that a fix "is not possible in the current opentype framework."

However the Pothana2000 font from http://www.kavya-nandanam.com/dload.htm is able to render these combinations correctly.

I hope Lohit Telugu and Lohit Kannada could be fixed to handle these combinations just like Pothana2000.

BTW, the GNOME bug above says the problem isn't seen in Kannada, which isn't true. The problem is just less noticeable than in Telugu, since in Kannada the virama connects to the right of the first consonant anyway, whereas in Telugu it connects to the top.

Comment 4 Padmanabhan V. K. 2009-01-25 18:59:04 UTC
I think I have a fix for this problem, along with some justification for it. The fix is not compatible with the latest .sfd format in use at http://fedorahosted.org/lohit/browser/trunk/Lohit-Telugu.sfd, so I hope someone can apply the corresponding change to the current font.

I am stuck with an old Mandriva distro with pango-1.10.0-3mdk and fontforge-1.0-0.20050809.1mdk. I used the Lohit Telugu font extracted from lohit-fonts-2.3.1-1.fc10.src.rpm and the Pothana2000 font from http://www.kavya-nandanam.com/dload.htm. To compare the fonts, I created .sfd files using the above fontforge (which cannot create or open the latest .sfd format).

I extracted the sources for pango-1.10.0-3mdk, added a few debug messages, and compared the messages produced while rendering "U+0c2f U+0c4d U+0c30 U+0c4d" with each of the two fonts. (I pasted the combination into the font selection dialog for gvim while Pothana2000 was selected and then selected Lohit Telugu.)

The debug messages show a "haln" substitution that is performed only for Pothana2000 and not for Lohit Telugu. The entire sequence of substitutions for Pothana2000 is:
ya + ra + halant + halant (Pango-reordered)
--blwf-> ya + below-base-ra + halant
--blws-> ya + wide-below-base-ra + halant
--haln-> ya-halant + wide-below-base-ra.

Pothana2000 differs from Lohit Telugu in two ways. First, the debug messages show the below-base forms with "gproperties 0x8" for Pothana2000 and with "gproperties 0x2" for Lohit Telugu, i.e. with the opentype glyph class "mark" as opposed to "base glyph."

From Pothana2000.sfd:
ChainSub: coverage 0 1 'blws' 0 0 0 1
 1 1 0
  Coverage: 25 RaOttuWide1 RaOttuMiddle1
  BCoverage: 114 Jha Ya ...
 1
  SeqLookup: 0 'L005'
EndFPST
...
StartChar: RaOttuWide1
...
GlyphClass: 4
...
EndChar
...
StartChar: RaOttuMiddle1
...
GlyphClass: 4
...
Substitution: 0 65534 'L005' RaOttuWide1
...
Ligature: 0 1 'blwf' Ra Halanth
EndChar

From Lohit-Telugu.sfd:
ChainSub: coverage 0 0 'blws' 0 0 0 1
 1 1 0
  Coverage: 16 U0C30_U0C4D.blwf
  BCoverage: 5 U0C2F
 1
  SeqLookup: 0 'L321'
EndFPST
...
StartChar: U0C30_U0C4D.blwf
...
GlyphClass: 2
...
Substitution: 0 65534 'L321' glyph495
...
Ligature: 0 0 'blwf' U0C30 U0C4D
EndChar
...
StartChar: glyph495
...
GlyphClass: 2
...
EndChar
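
The glyph class difference above can be checked mechanically. The following is a minimal sketch, assuming the old SFD layout shown in these excerpts (one "StartChar:" ... "EndChar" block per glyph, with an optional "GlyphClass:" line). The function names and the sample text are illustrative only and are not part of FontForge or Pango.

```python
# GlyphClass values as they appear in the SFD excerpts of this report.
SFD_GLYPH_CLASS = {2: "base glyph", 3: "ligature", 4: "mark"}

# Abbreviated sample modelled on the Lohit-Telugu.sfd excerpt above.
LOHIT_EXCERPT = """\
StartChar: U0C30_U0C4D.blwf
GlyphClass: 2
Ligature: 0 0 'blwf' U0C30 U0C4D
EndChar
StartChar: glyph495
GlyphClass: 2
EndChar
"""


def glyph_classes(sfd_text):
    """Map each glyph name to its GlyphClass value (None if no such line)."""
    classes = {}
    current = None
    for raw in sfd_text.splitlines():
        line = raw.strip()
        if line.startswith("StartChar:"):
            current = line.split(":", 1)[1].strip()
            classes[current] = None
        elif line == "EndChar":
            current = None
        elif line.startswith("GlyphClass:") and current is not None:
            classes[current] = int(line.split(":", 1)[1])
    return classes


def not_marks(sfd_text, names):
    """Return the glyphs from `names` that are not flagged as marks (class 4)."""
    classes = glyph_classes(sfd_text)
    return [name for name in names if classes.get(name) != 4]


if __name__ == "__main__":
    below_base = ["U0C30_U0C4D.blwf", "glyph495"]
    for name in not_marks(LOHIT_EXCERPT, below_base):
        value = glyph_classes(LOHIT_EXCERPT)[name]
        print(name, "is", SFD_GLYPH_CLASS.get(value, "unclassified"))
```

The caller supplies the list of below-base glyph names; the sketch does not try to discover them from the lookups.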

Second, the ligation rules for halanth in Pothana2000 say that, while matching, all marks of types other than the halanth's own type (as defined by "MarkAttachClasses") should be ignored.

From Pothana2000.sfd:
MarkAttachClasses: 11
...
"MarkClass-10" 7 Halanth
...
StartChar: YaHalanth
...
GlyphClass: 3
...
Ligature: 2560 1 'haln' Ya Halanth
...
EndChar

(Note: 2560 = 0xa00, i.e. all marks of types other than type 10 should be ignored. Type 10 happens to be called MarkClass-10 and includes just the halanth.)

From Lohit-Telugu.sfd:
StartChar: U0C2F_U0C4D.haln
...
GlyphClass: 2
...
Ligature: 0 0 'haln' U0C2F U0C4D
...
EndChar
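
The first number on the "Ligature:" lines above is an OpenType lookup flag. As a rough illustration of how the values 2560 and 0 break down, here is a small decoder based on the standard LookupFlag bit layout (low bits are boolean switches, the high byte is the mark attachment type). The function name is made up for this sketch.

```python
def decode_lookup_flag(flag):
    """Split an OpenType LookupFlag value into its documented fields."""
    return {
        "right_to_left": bool(flag & 0x0001),
        "ignore_base_glyphs": bool(flag & 0x0002),
        "ignore_ligatures": bool(flag & 0x0004),
        "ignore_marks": bool(flag & 0x0008),
        # A non-zero value means: skip every mark whose attachment
        # class differs from this number.
        "mark_attachment_type": (flag & 0xFF00) >> 8,
    }


if __name__ == "__main__":
    for value in (0, 2560):
        print(value, hex(value), decode_lookup_flag(value))
```

For 2560 this reports mark attachment type 10 with the plain "ignore marks" bit cleared, which matches the Pothana2000 rule; for 0 nothing is skipped, which matches the Lohit Telugu rule.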

Comment 5 Padmanabhan V. K. 2009-01-25 19:31:42 UTC
Created attachment 329950 [details]
Debug statements added to my local version of pango

I have attached the diff for pango-1.10.0-3mdk to better understand the debug messages that show the processing of the two fonts.

Comment 6 Padmanabhan V. K. 2009-01-25 19:40:47 UTC
Created attachment 329951 [details]
Debug messages produced by my local version of pango

Actually we don't need various "MarkAttachClasses" to fix this bug. We can just make the ligation rules for "haln" say all types of marks should be ignored (using "8" instead of "2560" above) while applying the rule, and ensure that all below base forms are flagged as "mark." This is sufficient for ya + wide-below-base-ra + halant --haln-> ya-halant + wide-below-base-ra.
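
To illustrate why both conditions are needed, here is a toy model of ligature matching with the "ignore marks" bit. It is only a sketch and not Pango's or any shaper's real code; the function name is hypothetical. It also assumes the halant glyph (U0C4D) itself is not classed as a mark, since a glyph that is ignored cannot be matched either.

```python
BASE, LIGATURE, MARK = 2, 3, 4  # SFD GlyphClass values used in this report


def apply_ligature(glyphs, classes, components, ligature, lookup_flag):
    """Replace the first match of `components` with `ligature`.

    When the ignore-marks bit (0x8) is set, glyphs classed as MARK are
    skipped during matching and are kept after the ligature.
    """
    ignore_marks = bool(lookup_flag & 0x0008)

    def skipped(glyph):
        return ignore_marks and classes.get(glyph) == MARK

    for start, glyph in enumerate(glyphs):
        if skipped(glyph) or glyph != components[0]:
            continue
        matched = [start]
        pos = start + 1
        while len(matched) < len(components) and pos < len(glyphs):
            if skipped(glyphs[pos]):
                pos += 1
                continue
            if glyphs[pos] != components[len(matched)]:
                break
            matched.append(pos)
            pos += 1
        if len(matched) == len(components):
            rest = [g for i, g in enumerate(glyphs)
                    if i > start and i not in matched]
            return glyphs[:start] + [ligature] + rest
    return list(glyphs)


if __name__ == "__main__":
    # ya + wide-below-base-ra + halant, using Lohit Telugu glyph names.
    run = ["U0C2F", "glyph495", "U0C4D"]
    rule = (["U0C2F", "U0C4D"], "U0C2F_U0C4D.haln")
    as_base = {"U0C2F": BASE, "glyph495": BASE, "U0C4D": BASE}
    as_mark = {"U0C2F": BASE, "glyph495": MARK, "U0C4D": BASE}
    print(apply_ligature(run, as_base, *rule, lookup_flag=0))
    print(apply_ligature(run, as_base, *rule, lookup_flag=8))
    print(apply_ligature(run, as_mark, *rule, lookup_flag=8))
```

In this model only the last call forms the ya-halant ligature: the flag alone does nothing while the below-base form is still a base glyph, and the glyph class alone does nothing while the flag is 0.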

I made the following change to Lohit-Telugu.sfd and created a new lohit-te.ttf from that. The modified font renders "U+0c2f U+0c4d U+0c30 U+0c4d" correctly:

--- /tmp/orig/Lohit-Telugu.sfd        2009-01-24 18:10:48.000000000 +0530
+++ /tmp/Lohit-Telugu.sfd       2009-01-26 00:08:06.000000000 +0530
@@ -33780,7 +33780,7 @@
  760 170 760 170 760 243 c 1,99,-1
  760 243 l 1,13,14
 EndSplineSet
-Ligature: 0 0 'haln' U0C2F U0C4D
+Ligature: 8 0 'haln' U0C2F U0C4D
 EndChar
 StartChar: U0C30_U0C4D.haln
 Encoding: 420 -1 420
@@ -36649,7 +36649,7 @@
 StartChar: U0C30_U0C4D.blwf
 Encoding: 464 -1 464
 Width: 0
-GlyphClass: 2
+GlyphClass: 4
 Flags: W
 AnchorPoint: "Anchor-0" -324 -56 mark 0
 Fore
@@ -39468,7 +39468,7 @@
 StartChar: glyph495
 Encoding: 499 -1 499
 Width: 0
-GlyphClass: 2
+GlyphClass: 4
 Flags: W
 AnchorPoint: "Anchor-0" -492 -51 mark 0
 Fore

The attachment contains the debug messages my pango modification produces for Pothana2000, Lohit Telugu, and my modified Lohit Telugu.
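
For anyone who wants to repeat the hand edit above on more glyphs, the two kinds of change can be sketched as a small text filter. This assumes the old SFD layout seen in the diff, where the lookup flag is the first number on each "Ligature:" line; the newer format keeps lookup flags elsewhere, so this would not apply to it. The function name and the sample are illustrative only.

```python
import re

# Matches e.g. "Ligature: 0 0 'haln' U0C2F U0C4D" and captures everything
# after the first number (the lookup flag).
HALN_LIGATURE = re.compile(r"^Ligature: \d+ (\d+ 'haln' .*)$")


def patch_sfd(sfd_text, mark_glyphs, haln_glyphs):
    """Set GlyphClass 4 on `mark_glyphs` and flag 8 on 'haln' rules of `haln_glyphs`."""
    patched = []
    current = None
    for line in sfd_text.splitlines():
        if line.startswith("StartChar:"):
            current = line.split(":", 1)[1].strip()
        elif line == "EndChar":
            current = None
        elif line.startswith("GlyphClass:") and current in mark_glyphs:
            line = "GlyphClass: 4"
        elif current in haln_glyphs:
            match = HALN_LIGATURE.match(line)
            if match:
                line = "Ligature: 8 " + match.group(1)
        patched.append(line)
    return "\n".join(patched) + "\n"


SAMPLE = """\
StartChar: U0C2F_U0C4D.haln
GlyphClass: 2
Ligature: 0 0 'haln' U0C2F U0C4D
EndChar
StartChar: U0C30_U0C4D.blwf
GlyphClass: 2
Ligature: 0 0 'blwf' U0C30 U0C4D
EndChar
StartChar: glyph495
GlyphClass: 2
EndChar
"""

if __name__ == "__main__":
    print(patch_sfd(SAMPLE,
                    mark_glyphs={"U0C30_U0C4D.blwf", "glyph495"},
                    haln_glyphs={"U0C2F_U0C4D.haln"}), end="")
```

Glyphs that only have a "GlyphClass:" line to begin with are handled; a glyph without that line is left untouched rather than having one invented for it.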

Comment 7 Padmanabhan V. K. 2009-01-26 22:18:02 UTC
Created attachment 330034 [details]
untested patch for Lohit-Telugu.sfd

I hand edited http://fedorahosted.org/lohit/browser/trunk/Lohit-Telugu.sfd for my suggested changes and a few more changes and created the attached patch.

The changes made are:
1. Every "GlyphClass:" line is fixed to reflect whether the glyph is a "mark" (4) or a "ligature" (3). [For the specific issue at hand, changing "GlyphClass:" for all below-base forms ("*.blwf" & the glyphs obtained by substituting those, "glyph471," "U0C24_U0C4D.blwf_U0C30_U0C4D.blwf.blws" & "glyph473") is enough.]
2. Lookup flags of "Above Base Substitutions in Telugu lookup 1" and "Halant Forms in Telugu lookup 317" are changed from 0 to 8 to ignore marks. [For the specific issue at hand, the second change is enough.]
3. What seem to be typos in "Ligature2:" lines are fixed -- "U0C38_U0C4D.haln U0C1F U0C30_U0C4D.blwf" -> "U0C38_U0C4D.haln U0C24 U0C30_U0C4D.blwf" and "U0C37_U0C4D.haln U0C24 U0C30_U0C4D.blwf" -> "U0C37_U0C4D.haln U0C1F U0C30_U0C4D.blwf." [These are of course unrelated to this bug. The modifications are to "Ligature2:" lines for the glyphs "U0C38_U0C24_U0C4D.blwf_U0C30_U0C4D.blwf.blws" and "U0C37_U0C1F_U0C4D.blwf_U0C30_U0C4D.blwf.blws."]
4. What seem to be erroneous "akhn" rules for consonant sequences without halanths are fixed -- "U0C38 U0C24 U0C30 U0C40" -> "U0C38_U0C4D.haln U0C24_U0C40.abvs U0C30_U0C4D.blwf" and "U0C37 U0C1F U0C30 U0C40" -> "U0C37_U0C4D.haln U0C1F_U0C40.abvs U0C30_U0C4D.blwf." [These are also unrelated to this bug. The modifications are to "Ligature2:" lines for the glyphs "U0C38_U0C24_U0C30_U0C40.akhn" and "U0C37_U0C1F_U0C30_U0C40.akhn." The new sequences I have put in are inspired by the "akhn" rules fixed in step 3 above.]

The extra changes in 1 and 2 give better results with rendering engines that do not position the base consonant and the matra immediately next to each other. E.g., before the change in http://svn.gnome.org/viewvc/pango?limit_changes=100&view=revision&revision=2107, Pothana2000 rendered consonant + "ai-matra (0x0C48)" correctly as it ignored the intervening "ai length mark (0x0C56)" when combining the consonant with the "e-matra (0x0C46)." These changes will allow Lohit Telugu to also handle this case with renderers other than pango.

Comment 8 Rahul Bhalerao 2009-03-05 14:32:13 UTC
Fixed in lohit-fonts-2.3.8. (lohit-telugu-fonts-2.3.8-1)
Padmanabhan, I tried using your patch. Unfortunately it did not work. Basically, the solution involved two simple changes:
1. Changing the lookup properties for 'ignore marks' at one place.
2. Changing the glyph type for all the 'blwf' forms of consonants to 'Mark' from 'BaseGlyph'.

Thanks for the work you did on this bug. Your analysis has helped a lot in fixing it.