Bug 829143 - [ta_IN] Fix Rendering of Letter RA,RI,RII per GoTN standards
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: lohit-tamil-fonts
Version: rawhide
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Pravin Satpute
QA Contact: Fedora Extras Quality Assurance
Keywords: i18n
Reported: 2012-06-06 01:10 EDT by Srikanth
Modified: 2012-08-09 19:21 EDT (History)
Doc Type: Bug Fix
Last Closed: 2012-08-09 19:21:47 EDT
Type: Bug


Attachments
ZIP of ODT containing relevant text, PDFs showing current rendering (41.28 KB, application/zip)
2012-06-13 10:41 EDT, Shriramana Sharma
ZIP of ODT containing relevant text, PDFs showing current rendering (41.55 KB, application/zip)
2012-06-14 22:59 EDT, Shriramana Sharma

Description Srikanth 2012-06-06 01:10:18 EDT
Description of problem:

Letter RA (ர, U+0BB0) should not use the shortened form that is visually similar to AA (◌ா, U+0BBE) in ர், ரி, ரீ, per GoTN procurement standards.

Version-Release number of selected component (if applicable):
2.50

How reproducible:
Always

Steps to Reproduce:
1. View http://translatewiki.net/wiki/User:Logicwiki/Sandbox?setlang=ta 
2. See the row for ர் (0BB0 0BCD) in the Unicode table to see how Lohit-Tamil renders it (Lohit-Tamil is the default font when the page loads with webfonts).
  
Actual results:
ர்,ரி,ரீ without the / in the bottom part of the glyph.

Expected results:
ர்,ரி,ரீ with the / in the bottom part of the glyph.
http://www.tn.gov.in/gosdb/gorders/it/it_e_29_2010.pdf See last page of pdf for expected rendering.

Additional info:

http://www.tn.gov.in/gosdb/gorders/it/it_e_29_2010.pdf
Appendix A – Valid Unicode Tamil Character Sequences Page 4 of 9 

Tamil Vowel sign AA (◌ா, U+0BBE) and Tamil Letter RA (ர, U+0BB0) shall be treated as distinct from each other. As recommended by the Department of Tamil Development, Government of Tamil Nadu, the letters ர், ரி, ரீ shall not be rendered in the shortened forms built on the ◌ா shape. (See letter No. E1/14702/99, dated 20.7.2000 from Department of Tamil Development enclosed as part of this appendix)
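
For reference, the three sequences discussed in this report can be inspected with Python's standard unicodedata module. This is only an illustrative sketch: the dictionary and the helper function are made-up names for the example and are not part of any font tooling. It shows that RA and the vowel sign AA are separate characters at the encoding level, so the question here is purely about glyph shapes.

```python
import unicodedata

# The three sequences from this report, written as explicit code points.
# The dictionary keys are labels chosen for this example only.
SEQUENCES = {
    "RA + VIRAMA": "\u0bb0\u0bcd",
    "RA + VOWEL SIGN I": "\u0bb0\u0bbf",
    "RA + VOWEL SIGN II": "\u0bb0\u0bc0",
}


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, seq in SEQUENCES.items():
    print(label, seq, describe(seq))

# RA (U+0BB0) and VOWEL SIGN AA (U+0BBE) are different characters,
# whatever shape a font chooses to draw for them.
print(describe("\u0bb0"), describe("\u0bbe"))
```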
Comment 1 Pravin Satpute 2012-06-06 08:50:50 EDT
Understood the problem. Thanks for http://translatewiki.net/wiki/User:Logicwiki/Sandbox?setlang=ta and the PDF from the Govt.; they made the difference clear.

Will fix this soon.
Comment 2 Shriramana Sharma 2012-06-06 11:54:55 EDT
I can't say I agree with this bug report. The Unicode Standard's chapter on Tamil (http://www.unicode.org/versions/Unicode6.1.0/ch09.pdf p 310 of book) already documents the fact that in accepted orthographic styles the ர shape changes to ா when it joins with ் ி and ீ. It also notes that various governmental bodies recommend use of unmodified ர with these characters for educational purposes. 

Obviously these governmental bodies are trying to simplify the learning of the script for children. 

Maybe the TN Govt suggests use of unmodified ர. But Sri Lanka Govt suggests use of modified ர as per TUS 6.1. TN Govt is not the only deciding body about Tamil script because Tamil is also used in Sri Lanka and Malaysia. Therefore it is not true that this is a "bug" or that it should be fixed. Fedora is an international software (even though offices and Indic-related projects are located in Pune) and need not follow the mandates of a single government especially when other governments mandate otherwise. 

You can even see Tamil fonts released freely (but I think not under free licence) by TN Govt:

1) http://www.tn.nic.in/tamilsw/otf.zip from http://www.tn.nic.in/tamilsw/otf.htm

2) http://www.ildc.in/Tamil/GIST/Modular/Modular.zip from http://www.ildc.in/Tamil/GIST/htm/modular-otfonts.htm

As far as I can see they all show this behaviour of "modified" shape of ர.

Already in other parts of the world there are special fonts for literacy purposes i.e. specially to help teaching. For example see http://scripts.sil.org/andika and http://scripts.sil.org/SILEntityFonts. But these are clearly behaviours intended for specific usage contexts and not for general use.

And when you read the TN Govt order which says that ர் ரி ரீ should be rendered only with the / below and not without it, please also read the paragraph above that:

"""10. Font developers working on Tamil fonts to be procured by the Government of Tamil Nadu shall be required to follow orthographic conventions and standards and make the following distinctions:"""

Lohit Tamil is of the community, by the community and for the community. It is not designed specially for procurement by the TN Govt for usage in TN Govt offices (although it may be freely used there under the OFL). So Lohit Tamil need not comply with these rules. I already noted above that Lohit Tamil is an internationally usable font and not subject to rules of a single regional Govt. (Other scripts are mostly localized within India. Not Tamil. Bengali is the only other exception, but such issues don't seem to exist there.)

Anyhow, it should be noted that in fact historically "ர" is the modified shape and it was originally identical to ா as seen in inscriptions and manuscripts because it is descended from Brahmi RA which looks like | which developed a horizontal head-stroke and the left-side support stroke to become today's RA. So it actually looked like ா. Later for disambiguation purpose they settled on adding a / stroke at bottom for RA. (Compare எ ஏ which also were both written as எ originally.)

I personally would like to see Lohit Tamil retain its current behaviour. I learnt to write Tamil this way and would like to continue doing so. As noted, Unicode also gives only this as standard behaviour. If TN Govt would like to modify Lohit Tamil for its internal usage purpose, it may do so under the OFL.

Thank you.

P.S.: @Srikanth: Please don't take any of this personally. It is not directed against you.
Comment 3 Srikanth 2012-06-06 12:54:51 EDT
Thanks Shriramana for your inputs. There is nothing to take personally here :)

While I agree with you that Lohit is a font for the community and that a particular Govt procurement standard should not be taken as the baseline, I think we can still look at those recommendations on their merits.

In my view, the intent of wanting ர், ரி, ரீ to have a / at the bottom is to disambiguate and improve readability, so why not adopt it for that reason?

PS: The bug was tagged with ta_IN (not by me), so should we only be considering India?
Comment 4 Pravin Satpute 2012-06-06 13:49:20 EDT
That is why I cc'ed this bug to lohit-devel, to get comments from active contributors. Thanks, Shriramana, for the detailed comment.

Yes, Lohit is the default font in distributions worldwide, so we have to follow international standards, i.e. Unicode.

At the same time, where possible we do support locale-specific conventions that are defined by a local government rather than by Unicode.

In this case I do not see any way to make changes specific to the ta_IN locale, as OpenType only supports a language code, i.e. "TAM", not a language_country code such as TAM_IN. http://www.microsoft.com/typography/developers/opentype/languagetags.aspx

We should wait for some consensus between the Tamil Govt.-defined standard and the Unicode standard.

@Srikanth, by the way, how does Wikipedia handle this: by language or by language_country? I think it is by language, so this problem is applicable to ta.wikipedia.org as well :)
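
To illustrate the limitation described above: an OpenType language system tag is a four-character code (padded with a space) with no country component, so every Tamil locale ends up selecting the same set of rules in the font. The following is a minimal sketch under that assumption; the lookup table and the function name are hypothetical, and only the "TAM" tag itself comes from this discussion.

```python
# Hypothetical lookup table for illustration only. OpenType tags are four
# characters long, so "TAM" is stored padded with a trailing space.
ISO_TO_OPENTYPE_LANG = {"ta": "TAM "}


def opentype_language_tag(locale):
    """Map a locale such as 'ta_IN' to an OpenType language system tag.

    The country part is simply discarded, because the tag has no slot for it.
    """
    language = locale.replace("-", "_").split("_")[0].lower()
    return ISO_TO_OPENTYPE_LANG[language]


# India, Sri Lanka, Malaysia and Singapore all collapse to the same tag,
# so the font cannot pick a different RA shape per country.
for loc in ("ta_IN", "ta_LK", "ta_MY", "ta_SG"):
    print(loc, "->", repr(opentype_language_tag(loc)))
```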
Comment 5 Srikanth 2012-06-06 14:20:57 EDT
(In reply to comment #4)
> That is why I cc'ed this bug to lohit-devel, to get comments from active
> contributors. Thanks, Shriramana, for the detailed comment.
> 
> Yes, Lohit is the default font in distributions worldwide, so we have to
> follow international standards, i.e. Unicode.
> 
> At the same time, where possible we do support locale-specific conventions
> that are defined by a local government rather than by Unicode.
> 
> In this case I do not see any way to make changes specific to the ta_IN
> locale, as OpenType only supports a language code, i.e. "TAM", not a
> language_country code such as TAM_IN.
> http://www.microsoft.com/typography/developers/opentype/languagetags.aspx
> 
> We should wait for some consensus between the Tamil Govt.-defined standard
> and the Unicode standard.

Unicode leaves it to typographical preference, so both are right. The Tamil Nadu Govt standard restricts it to only one form. It is a question of preference, and even as per Unicode the preference differs only in Sri Lanka, whereas Tamil Nadu, Singapore and Malaysia are in line with each other on this. So it is not that we need to wait for consensus between Unicode and the Tamil Govt; rather, Shriramana and I need to reach consensus :)

> @Srikanth, by the way, how does Wikipedia handle this: by language or by
> language_country? I think it is by language, so this problem is applicable
> to ta.wikipedia.org as well :)

The Wikipedia community is small and hence there have not been divisions, but elsewhere (like WordPress http://ta-lk.wordpress.org/ ) there have been specific language_country efforts. :)
Comment 6 Shriramana Sharma 2012-06-06 18:02:11 EDT
Hey Srikanth, thanks for taking this in a sporting manner! You are right that probably in the end it is only needed for you and I to have consensus as far as this bug is concerned, but IMO I think it would be useful to take some "census" :-) as to what the community actually thinks rather than what a few Govt bureaucrats think!? As I already pointed out, even fonts released by the TN Govt are having the actual traditional style glyphs. So they themselves are not behaving consistently with their own order.

But as you rightly said, whoever is following it or not is irrelevant, and the question is whether it would be good for Lohit to support this new orthography in the interests of readability for the community.

I wonder what is a good way to reach out to the actual Tamil users among the OSS community... Please give some time on this and don't hurriedly close it either way. I'll discuss with some other friends whether we can reach out to at least certain representative (?) sections of the Tamil computing community.

P.S.: A solution would be to make Lohit Tamil take "new" shapes of ர (i.e. as per TN Govt requirements) whereas to leave Lohit Tamil Classical with "old" shapes. Pravin, what do you think of that?
Comment 7 Shriramana Sharma 2012-06-11 04:08:26 EDT
I have been thinking over it and discussed with some of my friends. They say that not only GoTN but they also prefer to have uniform ர shape even with vowel signs ி ீ and virama ்.

I hence agree that in Lohit Tamil the glyphs for ரி ரீ ர் should have the same consonant shape as regular ர. However, Lohit Tamil Classical should keep the existing glyphs, because just like the ணை ணா etc behaviour, the current ரி ரீ ர் glyphs are also the "classical" forms.

If this is agreeable to all, then one may proceed to modify the Lohit Tamil font as requested by Srikanth.
Comment 8 Srikanth 2012-06-11 07:10:10 EDT
It is fine to keep the current glyphs on the Classical font. Please proceed with the changes on Lohit-Tamil only.
Comment 9 Pravin Satpute 2012-06-13 06:52:37 EDT
Yes, looks fair enough. One can choose between Lohit Tamil and Lohit Tamil Classical according to one's needs.

As you both agree and I do not see any opposition from anyone, I will make these changes.

Can you test http://pravins.fedorapeople.org/Lohit-Tamil.ttf and give your comments? I have also enhanced the font by removing the consonant+halant ligatures and adding GPOS rules instead.
Comment 10 Shriramana Sharma 2012-06-13 10:41:15 EDT
Created attachment 591529 [details]
ZIP of ODT containing relevant text, PDFs showing current rendering

I have now tested the new font. The bottom / is OK, but there are some positioning issues. 

I notice that you have removed the entire set of vowelless consonants and the precomposed glyph for ரீ from the glyphs (but you have not removed ரி). I presume you are trying to get the same effect using GPOS. However, the results are not entirely satisfactory.

The weird thing is that the positioning is different on Linux (Kubuntu 12.04 LibreOffice 3.5.3) and Windows XP (LO 3.5.0). Please see the attached files which show the rendering on Linux and Windows. You can see that the rendering of Lohit Tamil 2.5.1 is virtually the same on Linux and Windows [except for the fact that Win XP doesn't recognize SHA :-(]. But for the testing TTF version it is quite different.

On Linux:

The ீ for ரீ is somewhat more to the left. The attachment point of the ீ should be vertically aligned with the right side vertical stroke of the ர.  

The pulli-s (Tamil virama) on the consonants are mostly OK, in the sense that they are identical to the rendering of the published Lohit Tamil 2.5.1 (I might have some suggestions for that in the future too) but for ர் the pulli is too far to the left. 

On Windows:

The misalignment of ீ for ரீ is higher in this case, and as for the pulli-s, they have totally gone haywire. You can see it in the PDF.

Comments:

While mostly I work on Linux, for some purposes I still am not able to leave Windows XP. Apparently your GPOS rules are recognized correctly by the latest Linux but not by the old Win XP. I cannot afford to upgrade to latest Windows which might (or might not) have fixed the GPOS problem. I would like to keep using Lohit Tamil on Win XP also. With the haywire condition of the pulli-s in the testing version, it has become unusable.

There is also no point in removing the single ரீ when the whole series of CONSONANT + VS-II is having individual precomposed glyphs. Actually the Tamil font is very light (and it is Lohit Kannada which you should be trying to lighten up as I have reported as bug #825115) so there is not much gain in removing the single glyph (but only GPOS headache).

Removing the pulli series and using GPOS has had the positive effect of making sure that pulli for all consonants is at the same height. But as mentioned it is causing problems in Windows. If you are particular about lightening the font you can use composite glyphs (which do not contain outlines but only references to existing outlines).

Recommendations:

Restore the precomposed glyph for ரீ with proper attachment of ீ but with / in the bottom as requested.

If you wish to lighten the font and not give a full pulli series, please use composite glyphs. Otherwise, just retain the old series and just ensure that ர் gets its / below. Only then the usability under Win XP could be maintained (please!).
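
The composite-glyph suggestion can be sketched roughly as follows. This is a toy model, not the actual TrueType glyf table format: a composite glyph stores only references to existing outlines plus offsets, so it adds very little data while still giving a fixed, precomposed shape that does not depend on the platform's GPOS support. All class names, glyph names and coordinates below are invented for illustration and do not come from Lohit Tamil.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleGlyph:
    name: str
    contours: tuple  # each contour is a tuple of (x, y) points


@dataclass(frozen=True)
class Component:
    base: str  # name of the referenced glyph
    dx: int    # horizontal offset applied to the referenced outline
    dy: int    # vertical offset applied to the referenced outline


@dataclass(frozen=True)
class CompositeGlyph:
    name: str
    components: tuple  # tuple of Component references, no outlines of its own


def flatten(glyph, glyphs):
    """Resolve a glyph into a list of contours in final coordinates."""
    if isinstance(glyph, SimpleGlyph):
        return [list(contour) for contour in glyph.contours]
    resolved = []
    for comp in glyph.components:
        for contour in flatten(glyphs[comp.base], glyphs):
            resolved.append([(x + comp.dx, y + comp.dy) for x, y in contour])
    return resolved


# Made-up box outlines standing in for the real letter and pulli shapes.
glyphs = {
    "ra": SimpleGlyph("ra", (((0, 0), (500, 0), (500, 600), (0, 600)),)),
    "pulli": SimpleGlyph("pulli", (((0, 0), (80, 0), (80, 80), (0, 80)),)),
}
# The precomposed form reuses both outlines and fixes the pulli position.
glyphs["ra_pulli"] = CompositeGlyph(
    "ra_pulli", (Component("ra", 0, 0), Component("pulli", 210, 650))
)

print(flatten(glyphs["ra_pulli"], glyphs))
```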
Comment 11 Pravin Satpute 2012-06-14 07:50:54 EDT
Thanks for the detailed testing. There was a minor problem, which is why it was not working on Windows.

Updated http://pravins.fedorapeople.org/Lohit-Tamil.ttf; this should work on Windows as well.

We have removed the virama ligatures with consonants since we do not require them. With positioning we have complete freedom to place the VIRAMA wherever we want. Let me know if you find any difference with the enhancements; I tried my best to keep the position the same.

I did not remove ரி since I did not find a matching "VOWEL SIGN I" in the font, so let it be. But yes, if we can design a U+0BBF (SIGN I) that matches some consonants, I would like to update it.

Yes, we should remove the ரீ series as well, but I think we cannot remove all ligatures in that series. Let me know which ones we can remove without affecting the overall output and I will update it, maybe in another bug.

If you are happy with these enhancements I will update the Classical fonts as well. Of course not for the RA, RI and RII shapes :)

Yes, the Kannada one is pending; we will work on it soon.
Comment 12 Shriramana Sharma 2012-06-14 22:59:05 EDT
Created attachment 591974 [details]
ZIP of ODT containing relevant text, PDFs showing current rendering

(In reply to comment #11)
> Updated http://pravins.fedorapeople.org/Lohit-Tamil.ttf this should work on
> Windows as well.

Now it is working somewhat but not totally. See below.

> We have removed the virama ligatures with consonants since we do not require
> them. With positioning we have complete freedom to place the VIRAMA wherever
> we want. 

Well, the point was that the GPOS implementations on Windows and Linux seem to be different, so unlike the precomposed glyphs (which render identically on both platforms), the GPOS positioning of the pulli on the consonants differs between the platforms.

> Let me know if you find any difference with the enhancements; I tried my
> best to keep the position the same.

On Kubuntu, the positioning is now practically identical to that of the released Lohit Tamil 2.5.1 font with precomposed pulli consonants, so that's fine. I presume other Linux distros will behave the same way.

The thing is, there is still a difference between Windows and Kubuntu. On Windows the placement is still awkward for some of the consonants. I have now attached the rendering of the same ODT as before on Kubuntu 12.04 LO 3.5.3 and Windows XP LO 3.5.4 (I upgraded the Windows LO just now). You can compare them with the control PDFs generated by 2.5.1 on both platforms. The position of the pulli on த ந வ ழ ள ற ஷ ஹ is still too far to the right.

Now the GPOS positioning is obviously different on different platforms (even though it is not supposed to be) and I realize that you may not be able to perform testing on Win XP yourself. Supporting proper rendering on Win XP (which even by Microsoft standards is outdated and has many other rendering problems) may not be high on your priorities. My request is only that in the interests of helping people use open fonts even on non-open platforms, you go back to the precomposed glyphs but to cut down on the font size you can use composite glyphs which I have described before. 

Given that precomposed glyphs as simple glyphs (with copied outlines) are already there, using precomposed glyphs as composite glyphs (with referenced outlines) should not be a problem. If you think all this is too much of an effort and expense of time and resources, I will not object if you give up trying to fix the rendering on Win XP, and will just maintain my own local fork for usage on Windows. (I already do so anyway.)

> I did not remove ரி since I did not find a matching "VOWEL SIGN I" in the
> font, so let it be. But yes, if we can design a U+0BBF (SIGN I) that matches
> some consonants, I would like to update it.

No no, the point was not asking you to remove ரி, the point was to ask you to restore ரீ.

> Yes, we should remove the ரீ series as well, but I think we cannot remove all
> ligatures in that series. 

You can neither remove -ி series nor -ீ series. The cursive connection for each consonant is simply different. The point is to have a complete set of precomposed cursively connected set of consonants, so you should restore ரீ.

> Let me know which ones we can remove without affecting the overall output
> and I will update it, maybe in another bug.

I think it is not worth the effort trying to optimize this. Lohit Tamil is one of the lightest Lohit fonts. We should rather be trying to optimize the heavier fonts.

> If you are happy with these enhancements I will update the Classical fonts
> as well. Of course not for the RA, RI and RII shapes :)

As I said, the rendering on Linux seems to be okay but not on Windows. Please let me know what you decide regarding using composite glyphs which would help to maintain same appearance and usability on Windows.
Comment 13 Pravin Satpute 2012-06-15 08:32:23 EDT
Backward compatibility is a must. If it is not giving the expected result on Windows, we will not do this at this time.

For the time being we will only go with the changes required for fixing this bug.
Comment 14 Shriramana Sharma 2012-06-17 11:43:59 EDT
Thanks Pravin. I also discovered just now that your proposed font also causes vowelless consonants to have more advance width than just basic consonants which is inappropriate as in Tamil the pulli is always placed above, so there should be no change in advance width.
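
The expectation about advance width can be written down as a small check. The following is a simplified model of anchor-based mark positioning, not a real GPOS implementation: the pulli is shifted so that its anchor lands on the consonant's top anchor, and because the mark has zero advance, the consonant with pulli occupies the same horizontal space as the bare consonant. All numbers, glyph names and function names here are hypothetical.

```python
def place_mark(base_anchor, mark_anchor):
    """Offset that moves the mark so its anchor coincides with the base anchor."""
    return (base_anchor[0] - mark_anchor[0], base_anchor[1] - mark_anchor[1])


def run_advance(glyph_names, advances):
    """Total horizontal advance of a glyph run."""
    return sum(advances[name] for name in glyph_names)


# Hypothetical font units; a mark placed above the base has zero advance.
ADVANCES = {"ra": 520, "pulli": 0}
BASE_TOP_ANCHOR = {"ra": (250, 640)}
MARK_ANCHOR = {"pulli": (40, 0)}

offset = place_mark(BASE_TOP_ANCHOR["ra"], MARK_ANCHOR["pulli"])
print("pulli offset on ra:", offset)

# The vowelless consonant must be exactly as wide as the plain consonant.
assert run_advance(["ra", "pulli"], ADVANCES) == run_advance(["ra"], ADVANCES)
```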

As you say, it is advisable to keep backward compatibility and so please fix what is needed for this bug only. We will revisit the pulli consonants matter later.

Thank you.
Comment 15 Pravin Satpute 2012-06-18 08:45:47 EDT
Updated Lohit Tamil with the changes: http://pravins.fedorapeople.org/Lohit-Tamil.ttf. I will commit it to trunk tomorrow.
Comment 16 Fedora Update System 2012-06-19 05:55:26 EDT
lohit-tamil-fonts-2.5.1-2.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/lohit-tamil-fonts-2.5.1-2.fc17
Comment 17 Fedora Update System 2012-07-19 05:06:13 EDT
Package lohit-tamil-fonts-2.5.1-2.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing lohit-tamil-fonts-2.5.1-2.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-10813/lohit-tamil-fonts-2.5.1-2.fc17
then log in and leave karma (feedback).
Comment 18 Fedora Update System 2012-08-09 19:21:47 EDT
lohit-tamil-fonts-2.5.1-2.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
