Red Hat Bugzilla – Bug 476427
[te_IN] - Consonant+Virama+Consonant+Virama+space renders the second virama as a separate glyph in lohit-telugu font
Last modified: 2009-03-05 09:32:13 EST
Consonant+Virama+[Consonant+Virama]+space renders the second virama as a separate glyph in lohit-telugu font. With Pothana2000 font, the rendering is correct.
More details at this bug report.
Changing Product to Fedora
Created attachment 327959 [details]
The screen shot of the problematic telugu script
See the attached JPEG image to understand the problem. It seems to appear when the user types the consonant+consonant+^ key combination.
Bug 318071 deals with the same problem, but for Lohit Kannada. That bug was closed with the comment that a fix "is not possible in the current opentype framework."
However the Pothana2000 font from http://www.kavya-nandanam.com/dload.htm is able to render these combinations correctly.
I hope Lohit Telugu and Lohit Kannada could be fixed to handle these combinations just like Pothana2000.
BTW the gnome bug above says the problem isn't seen in Kannada, which isn't true. The problem is just less noticeable there than in Telugu (since in Kannada the virama connects to the right of the first consonant anyway, unlike in Telugu where it connects to the top).
I think I have a fix to this problem (but not compatible with the latest .sfd format in use at http://fedorahosted.org/lohit/browser/trunk/Lohit-Telugu.sfd) and some justification for it.
I hope someone can fix the font correspondingly.
I am stuck with an old Mandriva distro with pango-1.10.0-3mdk and fontforge-1.0-0.20050809.1mdk. I used the Lohit Telugu font extracted from lohit-fonts-2.3.1-1.fc10.src.rpm and the Pothana2000 font from http://www.kavya-nandanam.com/dload.htm. To compare the fonts, I created .sfd files using the above fontforge (which cannot create or open the latest .sfd format).
I extracted the sources for pango-1.10.0-3mdk and added a few debug messages and observed the difference between the messages produced while rendering "U+0c2f U+0c4d U+0c30 U+0c4d" for the 2 fonts (pasting the combination into the font selection dialog for gvim while Pothana2000 is selected and then selecting Lohit Telugu).
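For anyone reproducing this, the test sequence can be built from its code points rather than typed. The following is a minimal Python sketch; the helper names `build_test_string` and `describe` are illustrative only and are not part of pango or fontforge.

```python
import unicodedata

# Code points quoted in the comment above: ya, virama, ra, virama.
CODE_POINTS = [0x0C2F, 0x0C4D, 0x0C30, 0x0C4D]


def build_test_string(code_points):
    """Join the code points into one string that can be pasted into a text widget."""
    return "".join(chr(cp) for cp in code_points)


def describe(text):
    """Return (U+XXXX, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


if __name__ == "__main__":
    # Print only ASCII descriptions so the script works on any terminal encoding.
    for code, name in describe(build_test_string(CODE_POINTS)):
        print(code, name)
```

Writing the string returned by `build_test_string` to a UTF-8 file gives a sample that can be opened in any editor with either font selected.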
The debug messages show a "haln" substitution that is performed only for Pothana2000 and not Lohit Telugu. The entire sequence of substitutions for Pothana2000 is:
ya + ra + halant + halant (Pango-reordered) --blwf-> ya + below-base-ra + halant --blws-> ya + wide-below-base-ra + halant --haln-> ya-halant + below-base-ra.
Pothana2000 differs from Lohit Telugu in two ways. First, the messages show Pothana2000's below-base forms with "gproperties 0x8" rather than "gproperties 0x2", i.e. with the OpenType glyph class "mark" rather than "base glyph". Judging by the glyph names, the first excerpt below is from Pothana2000.sfd and the second from Lohit Telugu.sfd:
ChainSub: coverage 0 1 'blws' 0 0 0 1
1 1 0
Coverage: 25 RaOttuWide1 RaOttuMiddle1
BCoverage: 114 Jha Ya ...
SeqLookup: 0 'L005'
Substitution: 0 65534 'L005' RaOttuWide1
Ligature: 0 1 'blwf' Ra Halanth
ChainSub: coverage 0 0 'blws' 0 0 0 1
1 1 0
Coverage: 16 U0C30_U0C4D.blwf
BCoverage: 5 U0C2F
SeqLookup: 0 'L321'
Substitution: 0 65534 'L321' glyph495
Ligature: 0 0 'blwf' U0C30 U0C4D
Second, Pothana2000's ligature rules for the halanth say that marks of all other types, as defined by the "MarkAttachClasses", should be ignored. From Pothana2000.sfd:
"MarkClass-10" 7 Halanth
Ligature: 2560 1 'haln' Ya Halanth
(Note 2560 = 0xa00, i.e. all marks of types other than type 10 -- which happens to be called MarkClass-10 and includes just the halanth -- should be ignored)
From Lohit Telugu.sfd:
Ligature: 0 0 'haln' U0C2F U0C4D
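The first number on these "Ligature:" lines can be read as an OpenType lookup flag. The sketch below decodes such a value using the bit assignments from the OpenType specification; the function name `decode_lookup_flag` is illustrative, and treating the first field of an old-format .sfd "Ligature:" line as the lookup flag is taken from the note above rather than from fontforge documentation.

```python
# OpenType LookupFlag bit assignments.
RIGHT_TO_LEFT = 0x0001
IGNORE_BASE_GLYPHS = 0x0002
IGNORE_LIGATURES = 0x0004
IGNORE_MARKS = 0x0008
MARK_ATTACHMENT_TYPE_MASK = 0xFF00  # high byte: mark attachment class to keep


def decode_lookup_flag(flag):
    """Split a numeric lookup flag into its named components."""
    return {
        "right_to_left": bool(flag & RIGHT_TO_LEFT),
        "ignore_base_glyphs": bool(flag & IGNORE_BASE_GLYPHS),
        "ignore_ligatures": bool(flag & IGNORE_LIGATURES),
        "ignore_marks": bool(flag & IGNORE_MARKS),
        # A non-zero value means: skip marks whose attachment class differs.
        "mark_attachment_type": (flag & MARK_ATTACHMENT_TYPE_MASK) >> 8,
    }


if __name__ == "__main__":
    for value in (0, 8, 2560):
        print(value, hex(value), decode_lookup_flag(value))
```

Decoding 2560 gives mark attachment type 10 with no other bit set, which matches the "MarkClass-10" reading, while 0 sets nothing at all.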
Created attachment 329950 [details]
Debug statements added to my local version of pango
I have attached the diff for pango-1.10.0-3mdk to better understand the debug messages that show the processing of the two fonts.
Created attachment 329951 [details]
Debug messages produced by my local version of pango
Actually we don't need various "MarkAttachClasses" to fix this bug. We can just make the ligation rules for "haln" say all types of marks should be ignored (using "8" instead of "2560" above) while applying the rule, and ensure that all below base forms are flagged as "mark." This is sufficient for ya + wide-below-base-ra + halant --haln-> ya-halant + wide-below-base-ra.
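To illustrate why both changes are needed together, here is a toy Python model of a single ligature rule applied with and without the "ignore marks" flag. It is a sketch only, not pango's or any shaper's actual code; the glyph names and the `apply_ligature` helper are hypothetical. It also assumes the halant glyph itself is not classed as a mark, since a flag of 8 would otherwise skip the halant as well.

```python
from dataclasses import dataclass

BASE, LIGATURE, MARK = "base", "ligature", "mark"
IGNORE_MARKS = 0x0008


@dataclass
class Glyph:
    name: str
    glyph_class: str


def apply_ligature(glyphs, components, result, lookup_flag):
    """Apply one ligature rule at the first place it matches.

    Marks skipped because of IGNORE_MARKS are kept and placed after the ligature.
    """
    skip_marks = bool(lookup_flag & IGNORE_MARKS)
    for start, first in enumerate(glyphs):
        if skip_marks and first.glyph_class == MARK:
            continue  # a skipped glyph cannot start a match
        if first.name != components[0]:
            continue
        matched, skipped, pos = 1, [], start + 1
        while matched < len(components) and pos < len(glyphs):
            glyph = glyphs[pos]
            if skip_marks and glyph.glyph_class == MARK:
                skipped.append(glyph)  # invisible to the rule, kept in output
            elif glyph.name == components[matched]:
                matched += 1
            else:
                break  # an unskipped glyph blocks the ligature
            pos += 1
        if matched == len(components):
            return glyphs[:start] + [Glyph(result, LIGATURE)] + skipped + glyphs[pos:]
    return list(glyphs)


def names(glyphs):
    return [g.name for g in glyphs]


if __name__ == "__main__":
    rule = (["ya", "halant"], "ya_halant")
    as_base = [Glyph("ya", BASE), Glyph("ra_blwf", BASE), Glyph("halant", BASE)]
    as_mark = [Glyph("ya", BASE), Glyph("ra_blwf", MARK), Glyph("halant", BASE)]
    print(names(apply_ligature(as_base, *rule, lookup_flag=0)))  # rule blocked
    print(names(apply_ligature(as_base, *rule, lookup_flag=8)))  # still blocked
    print(names(apply_ligature(as_mark, *rule, lookup_flag=8)))  # ligature forms
```

In this model the ligature forms only when the below-base form is a mark and the flag is 8; changing either one alone leaves the sequence untouched.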
I made the following change to Lohit-Telugu.sfd and created a new lohit-te.ttf from that. The modified font renders "U+0c2f U+0c4d U+0c30 U+0c4d" correctly:
--- /tmp/orig/Lohit-Telugu.sfd 2009-01-24 18:10:48.000000000 +0530
+++ /tmp/Lohit-Telugu.sfd 2009-01-26 00:08:06.000000000 +0530
@@ -33780,7 +33780,7 @@
760 170 760 170 760 243 c 1,99,-1
760 243 l 1,13,14
-Ligature: 0 0 'haln' U0C2F U0C4D
+Ligature: 8 0 'haln' U0C2F U0C4D
Encoding: 420 -1 420
@@ -36649,7 +36649,7 @@
Encoding: 464 -1 464
AnchorPoint: "Anchor-0" -324 -56 mark 0
@@ -39468,7 +39468,7 @@
Encoding: 499 -1 499
AnchorPoint: "Anchor-0" -492 -51 mark 0
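The flag change in the first hunk could also be scripted instead of hand-edited. The following Python sketch assumes the old .sfd layout shown in the diff, where a 'haln' rule is a single line of the form "Ligature: <flag> <n> 'haln' <components>"; it does not apply to the newer .sfd format. The function name `set_haln_ignore_marks` is hypothetical, and unlike the diff it touches every 'haln' line rather than a single glyph.

```python
import re

IGNORE_MARKS = 0x0008

# Matches old-format lines such as: Ligature: 0 0 'haln' U0C2F U0C4D
HALN_LINE = re.compile(r"^Ligature: (\d+) (\d+ 'haln' .*)$", re.MULTILINE)


def set_haln_ignore_marks(sfd_text):
    """Set the ignore-marks bit on every 'haln' ligature line, leaving other lines alone."""
    def fix(match):
        flag = int(match.group(1)) | IGNORE_MARKS
        return f"Ligature: {flag} {match.group(2)}"

    return HALN_LINE.sub(fix, sfd_text)


if __name__ == "__main__":
    sample = (
        "Ligature: 0 0 'haln' U0C2F U0C4D\n"
        "Encoding: 420 -1 420\n"
        "Ligature: 0 0 'blwf' U0C30 U0C4D\n"
    )
    print(set_haln_ignore_marks(sample), end="")
```

Because the bit is OR-ed in, running the function twice gives the same result as running it once.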
The attachment contains the debug messages my pango modification produces for Pothana2000, Lohit Telugu, and my modified Lohit Telugu.
Created attachment 330034 [details]
untested patch for Lohit-Telugu.sfd
I hand-edited http://fedorahosted.org/lohit/browser/trunk/Lohit-Telugu.sfd to apply my suggested changes plus a few more, and created the attached patch.
The changes made are:
1. Every "GlyphClass:" line is fixed to reflect whether the glyph is a "mark" (4) or a "ligature" (3). [For the specific issue at hand, changing "GlyphClass:" for all below-base forms ("*.blwf" and the glyphs obtained by substituting those: "glyph471," "U0C24_U0C4D.blwf_U0C30_U0C4D.blwf.blws" and "glyph473") is enough.]
2. The lookup flags of "Above Base Substitutions in Telugu lookup 1" and "Halant Forms in Telugu lookup 317" are changed from 0 to 8 to ignore marks. [For the specific issue at hand, the second change is enough.]
3. What seem to be typos in "Ligature2:" lines are fixed -- "U0C38_U0C4D.haln U0C1F U0C30_U0C4D.blwf" -> "U0C38_U0C4D.haln U0C24 U0C30_U0C4D.blwf" and "U0C37_U0C4D.haln U0C24 U0C30_U0C4D.blwf" -> "U0C37_U0C4D.haln U0C1F U0C30_U0C4D.blwf." [These are of course unrelated to this bug. The modifications are to "Ligature2:" lines for the glyphs "U0C38_U0C24_U0C4D.blwf_U0C30_U0C4D.blwf.blws" and "U0C37_U0C1F_U0C4D.blwf_U0C30_U0C4D.blwf.blws."]
4. What seem to be erroneous "akhn" rules for consonant sequences without halanths are replaced -- "U0C38 U0C24 U0C30 U0C40" -> "U0C38_U0C4D.haln U0C24_U0C40.abvs U0C30_U0C4D.blwf" and "U0C37 U0C1F U0C30 U0C40" -> "U0C37_U0C4D.haln U0C1F_U0C40.abvs U0C30_U0C4D.blwf." [These are also unrelated to this bug. The modifications are to "Ligature2:" lines for the glyphs "U0C38_U0C24_U0C30_U0C40.akhn" and "U0C37_U0C1F_U0C30_U0C40.akhn." The new sequences I have used are modeled on the "akhn" rules fixed in step 3 above.]
The extra changes in 1 and 2 improve results with rendering engines that do not position the base consonant and the matra immediately next to each other. E.g. before the change in http://svn.gnome.org/viewvc/pango?limit_changes=100&view=revision&revision=2107, Pothana2000 rendered consonant + "ai-matra (0x0C48)" correctly because it ignored the intervening "ai length mark (0x0C56)" when combining the consonant with the "e-matra (0x0C46)." These changes will allow Lohit Telugu to handle this case too with renderers other than pango.
Fixed in lohit-fonts-2.3.8. (lohit-telugu-fonts-2.3.8-1)
Padmanabhan, I tried using your patch. Unfortunately it did not work. Basically, the solution involved two simple changes:
1. Changing the lookup properties for 'ignore marks' at one place
2. Changing the glyph type for all the 'blwf' forms of consonants to 'Mark' from 'BaseGlyph'.
Thanks for the work you did on this bug. Your analysis has helped a lot in fixing it.