Bug 1427550

Summary: [ta_IN] Adding additional Glyphs in Tamil
Product: [Fedora] Fedora
Reporter: Seshadri N <nsesha92>
Component: lohit-tamil-fonts
Assignee: vishal vijayraghavan <vvijayra>
Status: ASSIGNED ---
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rawhide
CC: fonts-bugs, i18n-bugs, kkrothap, nsesha92, prigupta, psatpute, samjnaa, vishalvijayraghavan
Target Milestone: ---
Keywords: FutureFeature, i18n
Target Release: ---
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                           Flags
Svarita Deerga Svarita Anudatta eg    none
sfd file                              none
Testing result jpg file               none
Proposed Lohit Vedic Tamil sfd        none

Description Seshadri N 2017-02-28 15:04:33 UTC
Created attachment 1258398 [details]
Svarita  Deerga Svarita  Anudatta eg

Description of problem:

This is not a bug, but enhancement request.

The Tamil script is also used for writing text in other languages, such as Sanskrit, Hindi, Malayalam, Telugu and Kannada, in the form of transliteration. However, the Tamil script and its Unicode range U+0B80-U+0BFF do not contain many characters that are available for these other languages, mainly those in the Devanagari Extended block (U+A8E0 series) and the Vedic Extensions block (U+1CD0 series).

Some such requests that I have come across in my FB and other groups:

1. How can I include signs such as lines above the letters, as shown in the picture? This is required to show the swaram, i.e. the high or low pitch with which the letter/word is pronounced, as in Sanskrit. Lines of this kind are used in Sanskrit also.

2. I need to type musical notation in Tamil using the MS Word processor. To indicate the octave, we put a dot above or below the letter. How can I add it?

--- My answer for above point 2: dot above = anuswara, dot below = nukta, Lohit Tamil does not contain nukta, but includes anuswara (U+0B82). ---

While it's a lot easier to juggle around on Linux-based systems by switching between different fonts, it's very difficult in MS-based applications.

Hence, I am requesting that 9 such frequently used Glyphs be 'added' to the existing Lohit Tamil font. The existing Glyphs from Lohit-Devanagari.ttf can be 'leveraged' for this, no need to reinvent the wheel. 

Replace:

U+0310 COMBINING CANDRABINDU  >>> replace with DEVANAGARI SIGN CANDRABINDU / anunasika U+0901, which is in the Devanagari range, to be consistent with all other scripts.

Add:

Signs: 

Visarga U+0903 ( as separate and distinct, in addition to Tamil Visarga U+0B83 )
AVAGRAHA U+093D
CANDRABINDU VIRAMA U+A8F3

Combining signs/marks with Anchoring:

NUKTA U+093C
VEDIC TONE UDATTA / SVARITA U+0951
VEDIC TONE ANUDATTA U+0952
VEDIC TONE PRENKHA U+1CD2
VEDIC TONE DOUBLE/Deerga SVARITA U+1CDA
VEDIC TONE CANDRA U+1CF4
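
The informal labels in the list above do not always match the formal Unicode character names. The following is a minimal sketch, using only Python's standard `unicodedata` module, that prints the formal name and general category for each requested code point. The helper names `describe` and `is_nonspacing_mark` are made up for this illustration, and the results depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# Code points copied from the request above (replace + add lists).
REQUESTED = [
    0x0310, 0x0901,          # candrabindu: current and proposed
    0x0903, 0x093D, 0xA8F3,  # signs without anchoring
    0x093C, 0x0951, 0x0952,  # combining signs/marks
    0x1CD2, 0x1CDA, 0x1CF4,
]


def describe(cp):
    """Return (U+XXXX label, general category, formal name) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.category(ch),
            unicodedata.name(ch, "<unnamed>"))


def is_nonspacing_mark(cp):
    """True for category Mn, i.e. marks that are positioned relative to a base."""
    return unicodedata.category(chr(cp)) == "Mn"


for cp in REQUESTED:
    label, category, name = describe(cp)
    print(f"{label}  {category}  {name}")
```

Characters reported as `Mn` are the ones that need mark positioning in the font; the others (`Mc`, `Lo`) take up their own horizontal space.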


How reproducible:

Not a bug, but enhancement request.


Comment 1 Seshadri N 2017-03-07 07:15:07 UTC
Hi Pravin,

After a few mail and FB exchanges with the person who needed musical notation to indicate the octave (we put a dot above or below the letter), and after experimenting a little with FontForge (I am learning), I have come to the conclusion that it's better to use the following two characters from the Combining Diacritical Marks block:

U+0307 COMBINING DOT ABOVE
U+0323 COMBINING DOT BELOW

Hence please include the above two also, in addition to the 10 already requested.

From the little I have learned so far in FontForge, the following things need to be done to take care of my request (in layman's language, as I am still not conversant with FF):

1. Move CANDRABINDU Glyph from U+0310 to U+0901 with the associated Anchor-0

2. Create a new GPOS Lookup 'blwm' Below Base Mark Lookup 1 and associated Subtable for Anchor-1 and "linkup" all the Consonant, Vowel and Combo Glyphs ( 12+18+5+2)

3. "Place" Anchor-0 or Anchor-1 "at the right place" to all the above 10+2 Mark Glyphs.
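
Which marks go with the above-base anchor (Anchor-0) and which with the below-base anchor (Anchor-1) can be cross-checked against the canonical combining class in the Unicode Character Database, where 230 means "above" and 220 means "below". This is only a rough sketch using Python's standard `unicodedata` module; the function name `anchor_side` is hypothetical, and the combining class is a hint rather than a font-design rule.

```python
import unicodedata

CCC_ABOVE = 230  # canonical combining class "Above"
CCC_BELOW = 220  # canonical combining class "Below"


def anchor_side(cp):
    """Suggest 'above', 'below' or 'other' from the canonical combining class."""
    ccc = unicodedata.combining(chr(cp))
    if ccc == CCC_ABOVE:
        return "above"
    if ccc == CCC_BELOW:
        return "below"
    return "other"  # e.g. nukta (class 7) or Indic signs with class 0


for cp in (0x0307, 0x0323, 0x0951, 0x0952, 0x093C, 0x0901):
    print(f"U+{cp:04X} -> {anchor_side(cp)}")
```

Note that U+0901 comes back as "other" because its combining class is 0, even though it is drawn above the base letter, so such characters still have to be assigned to an anchor by hand.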

If the above is OK, I can modify the current Lohit Tamil V 2.91.0 and send you the .sfd file, which you can fine-tune. Let me know if this will work.

Warm regards
Seshadri

Comment 2 Seshadri N 2017-03-07 10:35:17 UTC
Missed one important step:

1a. Copy 9 Lohit Devanagari and 2 Arial Glyphs and Paste into Lohit Tamil font at the respective places.

Comment 3 Seshadri N 2017-03-25 19:38:06 UTC
Hi Pravin,

Attaching my "trial and learn generated" sfd file with the following changes to 2.91.0

Added Anchor-1 Below Marker and the following glyphs; the status of each (OK or Error) is noted:

No Anchor

U+0903 DEVANAGARI VISARGA >>> Error ???
U+093D DEVANAGARI AVAGRAHA >>> OK
U+A8F3 VEDIC TONE CANDRABINDU VIRAMA >>> OK

Anchor-0 with Top Marker

U+0305 COMBINING OVERLINE >>> Not tested
U+0307 COMBINING DOT ABOVE >>> Not tested
U+0901 VEDIC TONE CANDRABINDU / ANUNASIKA (moved from U+0310) >>> Error
U+0951 VEDIC TONE UDATTA / SVARITA >>> OK
U+1CD2 VEDIC TONE PRENKHA >>> Error
U+1CDA VEDIC TONE DOUBLE/DEERGA SVARITA >>> Error
U+1CF4 VEDIC TONE CANDRA >>> Error

Anchor-1 with Below Marker

U+0323 COMBINING DOT BELOW >>> Not tested
U+0332 COMBINING LOW LINE >>> Not tested
U+0952 VEDIC TONE ANUDATTA >>> OK


கँ
கஂ
கः
ऽ
ꣳ
ம॑
ப॒
ப᳒
ம᳚
த᳴

Hope you can fix the issues or let me know how to fix these errors. I did get some error messages while generating the font and don't know how to fix them.

Thanks
Seshadri

Comment 4 Seshadri N 2017-03-25 19:42:42 UTC
Created attachment 1266409 [details]
sfd file

Sfd file

Comment 5 Seshadri N 2017-03-25 19:44:12 UTC
Created attachment 1266410 [details]
Testing result jpg file

Testing output jpg file

Comment 6 Pravin Satpute 2017-03-27 05:34:10 UTC
Nice, you moved ahead and tried to add these characters to Lohit Tamil.

I have one query: how frequently do these newly added characters get used? Are they for all users of the Tamil language or only for a specific set?

After adding Vedic accents to Lohit Devanagari, I felt that rather than making Lohit Devanagari heavy with the addition of these characters, it would be good to have a different version of Lohit Devanagari, i.e. Lohit Vedic etc., where as developers we get more flexibility in adding new characters and making changes.

I think the same way here: rather than adding these new characters to Lohit Tamil, add them to a new variant of the Lohit font. Did you see Lohit Tamil Classical? In the same way, there could be Lohit Tamil Vedic etc.?

Comment 7 Seshadri N 2017-03-27 15:27:28 UTC
Hi Pravin,

Thanks for taking this up, I am delighted. 

Yes, moved ahead, trying to learn new things, challenging and fun. 

On the sfd that I sent, few things that need to be corrected per my current 'learning'. 

1. The errors are Missing Points at Extrema and Self Intersecting.

I have not invested time to learn how to fix them yet; I will try in the coming days. You should be able to fix them in no time.

2. I also realised that kSha (kataml_viramataml_ssataml) is not showing up in the GPOS blwm subtable, and Anchor-1 is also NOT showing up at the leftmost side (comparing against GPOS abvm).

3. Guess I also need to add another GSUB ( or GPOS or both) table for vowel sign aa, i, ii, u, uu, ai, o, oo, au combinations to place Anchor-0 and Anchor-1 at the 'right' place ??

4. Same for kSha

5. I have placed Anchor-0 and Anchor-1 more by guess work/positioning by looking at my laptop screen than scientifically/using the right step. 

6. And 'Transformed' the size of newly added 12 Glyphs again by guess work. 

To the best of my current understanding, the above need to be corrected/fixed.
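
As an alternative to placing anchors by eye (point 5 above), base-glyph anchors can be derived from each glyph's bounding box. The sketch below is one simple heuristic and not the method used by the Lohit project: it centres both anchors horizontally and offsets them by a fixed gap above and below the outline. The `BBox` type, the `base_anchors` function, the default gap of 50 units and the sample bounding box are all hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Glyph bounding box in font units."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def base_anchors(bbox, gap=50):
    """Return anchor positions for a base glyph.

    Anchor-0 (above marks) sits `gap` units over the top of the outline,
    Anchor-1 (below marks) sits `gap` units under the bottom, both centred
    horizontally on the bounding box.
    """
    centre_x = (bbox.xmin + bbox.xmax) / 2
    return {
        "Anchor-0": (centre_x, bbox.ymax + gap),
        "Anchor-1": (centre_x, bbox.ymin - gap),
    }


# Hypothetical bounding box for a base consonant.
print(base_anchors(BBox(xmin=100, ymin=-20, xmax=900, ymax=700)))
```

Glyphs with asymmetric shapes, such as consonant plus vowel-sign combinations, would likely still need manual adjustment after such a first pass.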

Now coming to your questions:

Q1. How frequently these new added characters get used? Is it for all users for Tamil language or only specific set?

A1. Used by a select set of people for writing Dev Vedas like Pancha Suktam etc in Tamil and also transliterating the Dev slokas that have these characters into Tamil. 

Other contributors and masters like Sri Shriramana Sharma can throw more light. 

Q2. Did you see Lohit Tamil Classical?

A2. Yes, I downloaded it from here, but in the end what I see is the Normal/Basic Tamil font.

https://pagure.io/lohit  >> Lohit Tamil Classical >> https://releases.pagure.org/lohit/lohit-tamil-ttf-2.5.3.tar.gz

However, as an oldie who studied in Tamil medium and has gone through some of the related bugs, I know the context: the rendering of the aa, i, ii etc. vowel signs for nna, nnna, rra etc.

But note that in the Classical case it's an OR condition: either the 'old style' or the new unified/uniform computer-savvy style.

Q3. How about different versions - Lohit Tamil (basic), Lohit Tamil Vedic etc.? Because, as developers, we can get more flexibility in adding new characters and making changes.

A3. Yes AND NO.

Here the condition is AND, not OR as in the Classical case.

It can give flexibility, but maintaining (version-controlling) multiple Lohit Tamil fonts can be challenging. You could also have Basic and Vedic (with just the extensions) as separate 'packages' and create Tamil Vedic by '#include'ing the Vedic one. However, there are also interdependencies: for example, adding a marker requires adding the related anchors in the Basic package.

So, IMO, better to have single 'package' to take care of these dozen or so additions and those who want to use them can do so. 

Comments from others are welcome and the final approach is left to you. 

Thanks again for your help.

Warm regards
Seshadri

Comment 8 Shriramana Sharma 2017-06-05 10:59:14 UTC
The OP has pinged me many times via direct mail but being busy I wasn't able to respond to this issue till today.

First it should be understood that to keep things formally correct the OP should submit a request to Unicode to allot any characters that he desires to use with the Tamil script a ScriptExtension ("SE") of Taml (http://www.unicode.org/Public/UNIDATA/ScriptExtensions.txt). Just adding the characters to Lohit Tamil and adding appropriate OT tables may work but might not be portable.

The OP is advised to use http://www.unicode.org/reporting.html or better still submit a document as per http://www.unicode.org/pending/docsubmit.html with appropriate attestations for the usages he asks to be supported.

(In reply to Seshadri N from comment #0)
> Extentions unicode U1CD0 series.

Not all of these characters are attested to be used with Tamil.

> Some such requests that I have come across in my FB and other groups:
> 
> 1. How can I include the signs like lines above the letters as shown in the
> picture. This is required to show the swaram i.e. high or low pitch of
> pronouncing the letter/word as in Sanskrit. This kind of lines are used in
> Sanskrit also.

The characters attested to be used with Tamil are 0951, 0952 and 1CDA and they are already allotted ScriptExtensions for Tamil.


0951          ; Beng Deva Gran Gujr Guru Knda Latn Mlym Orya Shrd Taml Telu # Mn       DEVANAGARI STRESS SIGN UDATTA
0952          ; Beng Deva Gran Gujr Guru Knda Latn Mlym Orya Taml Telu # Mn       DEVANAGARI STRESS SIGN ANUDATTA
1CDA          ; Deva Knda Mlym Taml Telu # Mn       VEDIC TONE DOUBLE SVARITA

> 2. I need to type musical notations in Tamil Using MSword processor. To
> indicate octave we put a dot on top or bottom of the letter. how to add  ?
> 
> --- My answer for above point 2: dot above = anuswara, dot below = nukta,
> Lohit Tamil does not contain nukta, but includes anuswara (U+0B82). ---

Nuktas and anusvaras form part of Indic syllabic structure. Musical notations are outside syllable structure. Combining characters from the 03xx range should be used instead.

Note that separately single- and double-dot nukta-s are attested for Tamil and need to be added to Lohit Tamil:

http://www.unicode.org/L2/L2015/15256-tamil-nukta.pdf

Of these the double-dot nukta from Grantha is allotted SE for Tamil:

1133C         ; Gran Taml # Mn       GRANTHA SIGN NUKTA

and the single-dot nukta is in the pipeline to be encoded in the Grantha block with Script=Inherited (which means it can be used with all scripts).

> U+0310 COMBINING CANDRABINDU  >>> replace with VEDIC TONE CANDRABINDU /
> anunasika U+0901, thats in Devanagari range, to be consistent with all other
> scrips.

In fact it would seem Unicode advises to use Grantha characters for Tamil rather than Devanagari. See visarga matter below. Thus U+11301 Grantha Candrabindu would be the appropriate character.

> Visarga U+0903 ( as separate and distinct, in addition to Tamil Visarga
> U+0B83 )

UTC advises to use U+11303 Grantha Sign Visarga:
11303         ; Gran Taml # Mc       GRANTHA SIGN VISARGA

> AVAGRAHA U+093D

I suppose U+1133D Grantha Sign Avagraha would be appropriate.

> CANDRABINDU VIRAMA U+A8F3

Already in SE:
A8F3          ; Deva Taml # Lo       DEVANAGARI SIGN CANDRABINDU VIRAMA

> NUKTA U+093C
> VEDIC TONE UDATTA / SVARITA U+0951
> VEDIC TONE ANUDATTA U+0952
> VEDIC TONE DOUBLE/Deerga SVARITA U+1CDA

See above.

> VEDIC TONE PRENKHA U+1CD2
> VEDIC TONE CANDRA U+1CF4

Please provide attestation as to where these are used with Tamil.
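
The ScriptExtensions lookups quoted in this comment can be repeated mechanically. The sketch below parses lines in the `code point ; scripts # comment` layout and reports whether `Taml` is listed for a given character. It is an illustration only: the data is limited to the six lines quoted above (as of 2017), the names `parse_script_extensions` and `allowed_with_tamil` are made up, and a real check should read the current ScriptExtensions.txt from the Unicode site.

```python
QUOTED_LINES = """\
0951  ; Beng Deva Gran Gujr Guru Knda Latn Mlym Orya Shrd Taml Telu # Mn DEVANAGARI STRESS SIGN UDATTA
0952  ; Beng Deva Gran Gujr Guru Knda Latn Mlym Orya Taml Telu # Mn DEVANAGARI STRESS SIGN ANUDATTA
1CDA  ; Deva Knda Mlym Taml Telu # Mn VEDIC TONE DOUBLE SVARITA
1133C ; Gran Taml # Mn GRANTHA SIGN NUKTA
11303 ; Gran Taml # Mc GRANTHA SIGN VISARGA
A8F3  ; Deva Taml # Lo DEVANAGARI SIGN CANDRABINDU VIRAMA
"""


def parse_script_extensions(lines):
    """Map each code point to the set of script codes listed for it."""
    table = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        cp_field, scripts_field = (part.strip() for part in data.split(";", 1))
        # The real file also uses ranges such as "1CD5..1CD6".
        first, _, last = cp_field.partition("..")
        scripts = frozenset(scripts_field.split())
        for cp in range(int(first, 16), int(last or first, 16) + 1):
            table[cp] = scripts
    return table


def allowed_with_tamil(table, cp):
    """True if the table lists Taml among the script extensions of cp."""
    return "Taml" in table.get(cp, frozenset())


table = parse_script_extensions(QUOTED_LINES.splitlines())
for cp in (0x0951, 0x1CDA, 0x1CD2, 0x1CF4):
    print(f"U+{cp:04X}: Taml listed = {allowed_with_tamil(table, cp)}")
```

With only the quoted lines as input, U+1CD2 and U+1CF4 are simply absent from the table, which matches the request for attestation above.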

Comment 9 Shriramana Sharma 2017-06-05 11:01:15 UTC
(In reply to Seshadri N from comment #2)
> Missed one important step:
> 
> 1a. Copy 9 Lohit Devanagari and 2 Arial Glyphs and Paste into Lohit Tamil
> font at the respective places.

IANAL but I think Arial glyphs should not be copied to Lohit fonts because of copyright issues. Use an OFL-compatible source font for borrowed glyphs and make sure they are at least stylistically acceptable to be in line with Lohit Tamil style.

Comment 10 Krishna Babu K 2018-02-28 15:19:41 UTC
No glyphs added as per the comments.

Comment 11 Ben Cotton 2018-11-27 14:07:35 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30  Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora  'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 27 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to this bug being closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Seshadri N 2019-07-08 16:35:26 UTC
Thanks for the feedback from Shriramana Sharma, appreciate it very much.

I am not a fontographer. I just learnt FontForge by the 'trial & learn' method.

I am not a Vedic scholar and I don't know Sanskrit. All that I have learnt so far is panchasuktam in TAMIL - sad.
But typing panchasuktam in Tamil with Vedic svaras was the trigger for me to foray into this unknown territory.
It helped me learn a very LITTLE bit about Unicode, glyphs, fonts, etc.

I have modified the original sfd to take care of the feedback points to the best of my knowledge & abilities.

I have added comments in FONTLOG as to what I have done (or rather, copy/pasted).

I find that the above and below (Vedic) symbols don't get aligned properly for pure consonants and CVCs, except for the a, u, U, e, E, ai series; I have no idea why.

Hoping that the Red Hat Fedora / Lohit font team makes the final corrections and releases this Vedic Tamil font.
(It could be a new font, but it would be better to update the existing Lohit Tamil font for easier upgrades, maintenance, etc. in future.)

Thanks

Comment 13 Seshadri N 2019-07-08 16:44:47 UTC
Created attachment 1588438 [details]
Proposed Lohit Vedic Tamil sfd

Attaching the sfd file.

Comment 14 Seshadri N 2019-07-16 09:51:17 UTC
On second thoughts, it is better to have a separate Vedic Tamil font instead of adding Vedic signs to the existing Lohit Tamil, as Vedic letters require extra ascent and descent space.

Since I have not modified the ascent and descent parameters (I don't know the logic of how to modify them or what values to choose), I guess some of the Vedic signs are intersecting / overlapping.

Hope V Vijay will take care of this.
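
To make the ascent/descent concern concrete, here is a rough sketch of how one could estimate by how much a stacked mark exceeds the font's vertical metrics. It is not taken from the Lohit sources: the function `vertical_overflow` and all the numbers in the example are hypothetical, and it assumes the descent is given as a positive distance below the baseline.

```python
def vertical_overflow(ascent, descent, base_ymax, base_ymin, mark_height, gap):
    """Return (overflow_above, overflow_below) in font units.

    A mark of height `mark_height` is assumed to sit `gap` units above the
    top of the base glyph, or `gap` units below its bottom. Zero means the
    mark still fits inside the existing ascent or descent.
    """
    top_of_mark = base_ymax + gap + mark_height
    bottom_of_mark = base_ymin - gap - mark_height
    overflow_above = max(0, top_of_mark - ascent)
    overflow_below = max(0, -bottom_of_mark - descent)
    return overflow_above, overflow_below


# Hypothetical metrics, chosen only to show the arithmetic.
print(vertical_overflow(ascent=800, descent=200,
                        base_ymax=700, base_ymin=-150,
                        mark_height=120, gap=40))
```

Non-zero results indicate how much extra ascent or descent a separate Vedic font would need so that marks do not collide with neighbouring lines.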

Comment 15 Ben Cotton 2019-10-31 19:03:30 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to this bug being closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 16 vishalvvr 2019-11-05 05:33:55 UTC
This is still not fixed. Moving to rawhide

Comment 17 Ben Cotton 2020-02-11 15:44:17 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 18 Fedora Program Management 2021-04-29 16:58:46 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to this bug being closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 19 Priyam Gupta 2021-05-11 05:03:24 UTC
As discussed with @vishalvijayraghavan, moving this to Fedora 34.

Comment 20 Ben Cotton 2022-05-12 16:56:12 UTC
This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 34 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.