Description of problem:
It was announced (https://www.redhat.com/archives/lohit-devel-list/2012-February/msg00011.html) that the latest release of the Lohit fonts, version 2.5.1, supports the latest Unicode 6.0 characters. Malayalam was specifically mentioned.
However I find that the Lohit Malayalam 2.5.1 release downloadable from https://fedorahosted.org/releases/l/o/lohit/lohit-malayalam-ttf-2.5.1.tar.gz does NOT provide support for 0D4E MALAYALAM LETTER DOT REPH.
As this is one of the three Malayalam characters encoded for Unicode 6.0 (see http://www.unicode.org/Public/UNIDATA/DerivedAge.txt and search for 0D4E) it should also be supported.
[The other two characters are provided but 0D3A has a wrong glyph which I have reported as bug 798870.]
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install Lohit Malayalam 2.5.1 font.
2. Try to use 0D4E MALAYALAM LETTER DOT REPH
This character is not available.
This character was encoded to support the old Malayalam orthography. As such it should be made available for full Unicode 6.0 (or 6.1) support.
You might need to do some smart font programming to position the dot reph correctly. Note that this character is a special rendering character (hence the dotted box around it in the code chart http://www.unicode.org/charts/PDF/U0D00.pdf).
The special rendering is that it should be placed on top of the character *following* it. See the bottom of page 3 and the top of page 4 of the original proposal.
I think the e-Malayalam OTC font (http://www.aai.uni-hamburg.de/indtib/INDOLIPI/Malayalam.zip) has pre-composed glyphs using this character on top of other consonants which might help you in positioning this character. Note that most often it is found with doubled consonants (i.e. DOT_REPH + GA + VIRAMA + GA etc) so you will have to be able to position this character above stacked consonant clusters.
I hope this is sufficient feedback for supporting this character which is important for old Malayalam orthography.
Can you provide a screenshot of its rendering?
The original proposal http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3676.pdf has many examples on pp 3-5. Is that enough?
Yes. This is fixed upstream; the latest TTF is at http://pravins.fedorapeople.org/Lohit-Malayalam.ttf
Created attachment 567649 [details]
Revised Lohit Malayalam for bugs #799565 and #798870
There are some corrections. Please find attached a revised font.
The correction is that the dot reph should have a positive LSB. Otherwise it will be positioned on the previous letter rather than on the following letter. Please see the original proposal. PA DOT_REPH VA പൎവ should place the dot reph on VA, and not on PA. So I have now moved the dot reph to the right.
However, while it seems to be working correctly (at least in LibreOffice) with medium-size consonants like ഗ വ etc, it still does not look good with wide consonants like ണ. It should ideally be centered on top of the consonant as you can see in the proposal samples. Can you please implement proper glyph positioning? I don't know how to do that.
I will attach ODT and PDF samples of font as currently modified by me for testing.
Created attachment 567651 [details]
ODT and PDF for test-case
This is a peculiar case, where the mark comes first and then the base character. I need to check it.
Yes, as I said, that is why the character in the Unicode chart has a dotted box around it: to indicate that it is a special rendering character.
I quote from: http://www.unicode.org/versions/Unicode6.0.0/ch09.pdf p 310:
Dot Reph. U+0D4E MALAYALAM LETTER DOT REPH is used to represent the dead consonant form of U+0D30 MALAYALAM LETTER RA, when **it is displayed as a dot over the consonant following it**. Conceptually, the dot reph is analogous to the sequence <RA, VIRAMA>, but when followed by another consonant, the Malayalam cluster <RA, VIRAMA, C2> normally assumes the C2 conjoining form. **U+0D4E MALAYALAM LETTER DOT REPH occurs first, in logical order, even though it displays as a dot above the succeeding consonant**. It has the character properties of a letter, and is not considered a combining mark.
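The character properties quoted above can be checked locally. The following is a minimal sketch, assuming a Python whose `unicodedata` module covers Unicode 6.0 or later; the helper name `describe` is only for illustration. It prints the name, general category, and canonical combining class of U+0D4E next to those of the virama U+0D4D for comparison.

```python
import unicodedata

DOT_REPH = "\u0d4e"  # MALAYALAM LETTER DOT REPH
VIRAMA = "\u0d4d"    # MALAYALAM SIGN VIRAMA, shown only for comparison


def describe(ch):
    """Return the basic Unicode properties of a single character."""
    return {
        "code_point": "U+%04X" % ord(ch),
        # Fall back to a placeholder if this Python's data predates the character.
        "name": unicodedata.name(ch, "<not in this Unicode version>"),
        "category": unicodedata.category(ch),
        "combining_class": unicodedata.combining(ch),
    }


dot_reph_info = describe(DOT_REPH)
virama_info = describe(VIRAMA)

for info in (dot_reph_info, virama_info):
    print(info)
```

The dot reph should be reported with a letter category (Lo) and combining class 0, whereas the virama is a nonspacing mark. That is the reason a shaper or font cannot treat U+0D4E as an ordinary combining mark without extra handling.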
If we consider this a base character, then there is no feature in the OT spec for base-to-base positioning. We could use the dist/kern features, but I don't think that will give the expected results.
I am not sure: do we need reordering for this character? We already reorder "ra+virama" in the Devanagari script.
I used akhn for this in Meera. Positioning the dot using a positive LSB is not optimal and will result in the dot appearing in the wrong position, since the ligature below it is of variable width. The dot should in general come at the top centre, but there are exceptions too. Meera has separate glyphs for all valid dot rephs. But I ran into some issues with this implementation too.
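The variable-width problem can be illustrated with a little arithmetic. This is only a sketch under stated assumptions: the dot reph glyph is taken to have zero advance, so the dot is drawn from the same origin as the consonant that follows it, and all the font-unit numbers below are made up for illustration and are not measured from Lohit or Meera.

```python
def centering_error(dot_xmin, dot_xmax, base_advance):
    """Horizontal distance (font units) between the dot's centre and the
    centre of the following base glyph.

    Assumes a zero-advance dot reph glyph, so both glyphs share one origin.
    Zero means centred; a negative value means the dot sits too far left.
    """
    dot_centre = (dot_xmin + dot_xmax) / 2
    return dot_centre - base_advance / 2


# Hypothetical dot outline spanning x = 400..600, i.e. a fixed positive LSB of 400.
DOT_XMIN, DOT_XMAX = 400, 600

# Hypothetical advance widths for a medium and a wide consonant.
medium_error = centering_error(DOT_XMIN, DOT_XMAX, 1000)
wide_error = centering_error(DOT_XMIN, DOT_XMAX, 1600)

print("medium consonant:", medium_error)
print("wide consonant:  ", wide_error)
```

With a single fixed LSB the dot can be centred for only one base width, which is why per-base positioning (pre-composed glyphs or GPOS) comes up in this discussion.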
I asked about this on Unicore list. One Microsoft engineer replied that they use reordering for this. I am guessing that they are treating the dot_reph character just like any other reph in any Indic script -- move it to the end of the syllable. The only difference is that Malayalam has a distinct character for this whereas in other scripts it is just RA + VIRAMA which is rendered as the reph.
So probably you can just treat it like the reph of other scripts. However, as Santosh says (and I already said) there will be a problem with the positioning.
One solution is to use akhanda ligatures as Santosh says but that is just a hack. The proper solution is to use GPOS. After all, that is what GPOS is for, isn't it? To position combining marks properly?
That said, I myself am not knowledgeable about all this GPOS-GSUB thing -- I'm more a Graphite person. So I leave it to you people to decide how to implement this.
I would only suggest that if you do use akhanda ligatures that you use composite glyphs instead of duplicating existing outlines. Would help in keeping the size of the font under check.
Santhosh, I think we can solve this problem by reordering the dot reph to the end of the syllable.
Once it is reordered, we can simply use positioning/kerning for it, in the same way as for the Devanagari anusvara U+0902.
For the time being, for testing purposes, we can type U+0D4E at the end of the syllable and check.
By the way, I cannot find any rule for dot reph ligatures in the Meera font.
Shriramana, can you ask the Microsoft engineer for a link to where this is specified? Then we can ask Behdad to make that change in HarfBuzz.
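To generate such test strings without retyping them by hand, the reordering can be approximated at the code point level. The following is a rough sketch and not how a shaping engine works: a real shaper reorders glyphs using a fuller syllable model, whereas this helper (the function names are hypothetical) only moves U+0D4E past a simple consonant cluster of the form C (VIRAMA C)*.

```python
DOT_REPH = "\u0d4e"
VIRAMA = "\u0d4d"


def is_consonant(ch):
    """Simplified test: Malayalam consonants KA..TTTA (U+0D15..U+0D3A)."""
    return "\u0d15" <= ch <= "\u0d3a"


def move_dot_reph_after_cluster(text):
    """Move each DOT REPH to just after the consonant cluster that follows it."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] == DOT_REPH and i + 1 < n and is_consonant(text[i + 1]):
            j = i + 2
            # Extend the cluster over any VIRAMA + consonant pairs.
            while j + 1 < n and text[j] == VIRAMA and is_consonant(text[j + 1]):
                j += 2
            out.append(text[i + 1:j])  # the consonant cluster first
            out.append(DOT_REPH)       # then the dot reph
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# PA DOT_REPH VA, the example used earlier in this report.
sample = "\u0d2a\u0d4e\u0d35"
reordered = move_dot_reph_after_cluster(sample)
print([("U+%04X" % ord(c)) for c in reordered])
```

A string produced this way can be pasted into a test document to check the font's positioning before any reordering support exists in the shaper.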
(In reply to comment #11)
> For time being for testing purpose we can type u0d4e at the end of syllable and
Agreed. If the kerning is done first, we can later bring in reordering support from the software side.
> Shriramana, can you ask Microsoft guy regarding any link for it in
> specification, we can ask behdad to do that change in harfbuzz.
AFAIK this is not present in the Microsoft OT docs on Malayalam (http://www.microsoft.com/typography/otfntdev/malayot/shaping.aspx). The Unicode publication only says right now that it should be placed after the following consonant but clearly this is insufficient description. I will ask the Unicode people to update the wording. Hopefully Behdad can implement this as you say.
We understand the Repha now. Will implement for Malayalam soon (eg. tomorrow).
http://pravins.fedorapeople.org/Lohit-Malayalam.ttf Test font with added GPOS 'abvm' for U+0D4E (though not very accurate positioning)
Created attachment 599365 [details]
Desired rendering of dot reph and comparison of Bengali syllable structure
@Pravin/Behdad: I'm not very knowledgeable about this, but isn't the feature tag for reph characters spelt "reph" or "rphf" or something?
@Behdad: Basically if you treat this 0D4E character equivalent to the cluster-initial RA + VIRAMA sequences of other Indic scripts, especially Bengali, I think it would be sufficient. Why Bengali? Because it also has two part vowel signs ো ৌ like Malayalam ൊ ോ ൌ and the reph also. But Bengali doesn't seem to have post-base VA unlike Malayalam so you may have to look out for that.
FWIW I have attached a document (ODT and PDF) showing the desired rendering of the reph (using the e-Malayalam OTC font from the "Indolipi" package [link above] which hack-renders the Malayalam RA + Virama combination as the reph) and the equivalent Bengali sequences in two standard Bengali fonts.
You might also like to see https://sites.google.com/site/jamadagni/files/utcsubmissions/12106-ed-updates.pdf §3 (on p 4) for more details on the Malayalam dot reph.
The feature tag will not come into the picture for U+0D4E. Once the reordering is done by the OTLS according to the syllable structure, we can simply apply the GPOS feature 'abvm' and get the desired positioning.
I will update Lohit as per desired positioning.
(In reply to comment #13)
> Hi all,
> We understand the Repha now. Will implement for Malayalam soon (eg.
I just checked with HarfBuzz NG. We are still not reordering U+0D4E to the end of the syllable.
hb-shape returns: [uni0D4E=0|U0D15=1+1015]
Any specific plan for this?
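A check like the one above can be scripted. The sketch below parses one line of hb-shape's default text output, assuming the usual `name=cluster@x_offset,y_offset+x_advance` item layout with the offset and advance parts optional, and reports whether the dot reph glyph comes after another glyph. The glyph name `uni0D4E` is taken from the output quoted above and depends on the font; the "reordered" line in the example is hypothetical, not real shaper output.

```python
import re

# One item of hb-shape's default output: name[=cluster][@xoff,yoff][+xadv[,yadv]]
GLYPH_RE = re.compile(
    r"(?P<name>[^=|\[\]@+]+)"
    r"(?:=(?P<cluster>\d+))?"
    r"(?:@(?P<x_off>-?\d+),(?P<y_off>-?\d+))?"
    r"(?:\+(?P<x_adv>-?\d+)(?:,(?P<y_adv>-?\d+))?)?"
)


def parse_hb_shape(line):
    """Parse a line such as '[uni0D4E=0|U0D15=1+1015]' into a list of dicts."""
    body = line.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("not an hb-shape glyph list: %r" % line)
    glyphs = []
    for item in body[1:-1].split("|"):
        m = GLYPH_RE.fullmatch(item)
        if m is None:
            raise ValueError("cannot parse glyph item: %r" % item)
        cluster = m.group("cluster")
        x_adv = m.group("x_adv")
        glyphs.append({
            "name": m.group("name"),
            "cluster": int(cluster) if cluster is not None else None,
            "x_advance": int(x_adv) if x_adv is not None else None,
        })
    return glyphs


def dot_reph_is_reordered(glyphs, dot_name="uni0D4E"):
    """True if the dot reph glyph appears after at least one other glyph."""
    names = [g["name"] for g in glyphs]
    return dot_name in names and names.index(dot_name) > 0


reported = parse_hb_shape("[uni0D4E=0|U0D15=1+1015]")      # output quoted above
hypothetical = parse_hb_shape("[U0D15=0+1015|uni0D4E=0]")  # made-up reordered form

print(dot_reph_is_reordered(reported))
print(dot_reph_is_reordered(hypothetical))
```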
Ok, I'll work on this now. Thanks for pinging, I was out of issues to fix and was getting bored... :)
Fixed upstream. Please test.
(In reply to comment #7)
> I quote from: http://www.unicode.org/versions/Unicode6.0.0/ch09.pdf p 310:
> Dot Reph. U+0D4E MALAYALAM LETTER DOT REPH is used to represent the dead
> consonant form of U+0D30 MALAYALAM LETTER RA, when **it is displayed as a
> dot over the consonant following it**. Conceptually, the dot reph is
> analogous to the sequence <RA, VIRAMA>, but when followed by another
> consonant, the Malayalam cluster <RA, VIRAMA, C2> normally assumes the C2
> conjoining form. **U+0D4E MALAYALAM LETTER DOT REPH occurs first, in logical
> order, even though it displays as a dot above the succeeding consonant**. It
> has the character properties of a letter, and is not considered a combining
I still do not get this output. Do I need to use a different OT feature here?
Testing with the font in comment 14, I see the reordering happening. Here's the hb-shape output:
$ ./hb-unicode-encode d4e,d15 | build/util/hb-shape ./Lohit-Malayalam.ttf --shaper ot
This is what I get with Lohit-Malayalam 2.5.2:
$ ./hb-unicode-encode d4e,d15 | build/util/hb-shape
My mistake. I messed up with git; it's working fine now.
I will do a Lohit release with this fix.
Thanks a lot, Behdad.
Cool. Please close when you do.
Completed GPOS in upstream, will be available with the next release of lohit-malayalam-fonts.