Description of problem:
Rendering is incorrect when the following key combination is typed with SCIM using the Telugu-Inscript layout.
---
bfvaeldkccwv
---

Version-Release number of selected component (if applicable):
openoffice.org-core-2.0.4-2.2

How reproducible:
Every time during input

Steps to Reproduce:
1. Open oowriter (preferably in the te_IN locale)
2. Activate SCIM and select Telugu-Inscript
3. Type the keys bfvaeldkccwv

Actual results:
Rendering is incorrect (a dotted circle appears)

Expected results:
Same rendering as in gedit

Additional info:
1) A screenshot comparing gedit, OpenOffice and kedit is attached
2) This case concerns input with SCIM
Created attachment 135576 [details] Screen-shot for Rendering issue
always worth cutting and pasting chars from gedit to OOo to quickly see if it's a combining character rendering difference vs input mechanism difference.
So the problem appears to be the rendering of "TELUGU VOWEL SIGN AA" (0x0C3E), which is generated by the 'e' key. I'd guess that the previous character is not a valid one for such a vowel sign to be applied to, with pango deciding not to render it, and OOo (through icu?) and kde deciding to draw the vowel sign uncombined with anything.
Created attachment 135641 [details] make icu behave like pango for two vowel signs in a row
This patch makes icu behave like pango.
Created attachment 135642 [details] and this patch makes pango behave like icu
And this one makes pango behave like icu.
caolanm->besfahbo: Do you know which of the above two behaviours is correct for a sequence such as the one generated from "vae", i.e. 0x0C28 0x0C4B 0x0C3E: the icu one, where we reset the sequence and don't allow two such vowels to be part of the same sequence, or the pango one, where we do?
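For anyone following along, a minimal sketch (Python, standard library only) that names the three code points quoted above. The helper name `describe` is made up for this illustration; only the code point values come from this report.

```python
import unicodedata

# Sequence quoted in this report for the keys "vae" under Telugu-Inscript.
SEQUENCE = [0x0C28, 0x0C4B, 0x0C3E]


def describe(code_points):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in code_points
    ]


# Print one line per code point so the consonant and the two vowel signs are visible.
for row in describe(SEQUENCE):
    print(*row)
```

The output should show one consonant followed by two dependent vowel signs, which is the "two vowel signs in a row" case the patches deal with.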
Looks deliberate in pango (see http://cvs.gnome.org/viewcvs/pango/modules/indic/indic-ot-class-tables.c#rev1.7 and http://bugzilla.gnome.org/show_bug.cgi?id=121882#c10), so I'll apply the patch to icu to make it behave like pango.
logged upstream as http://bugs.icu-project.org/cgi-bin/icu-bugs/incoming?findid=5365
<quote author="Eric Mader <emader(at)icu..."> I've just looked at this bug and the associated Bugzilla bug 205252. It looks to me that the reason for the difference is that the Pango fix for Bugzilla bug 121882 is too general: it allows arbitrary combinations of dependent vowels. I fixed the ICU bug corresponding to 121882 (JB#4206) by assigning distinct character classes to the pieces of split vowels so that only the pieces of split vowels, in the correct order, are allowed to combine. I think that the example given in 205252 is not a correct sequence, and is only accepted as one in Pango because the fix to 121882 is too general. </quote>
That analysis is probably correct. I compared Pango's and ICU's Indic shapers yesterday. They have diverged in a few places. I'm going to resync them, but that will take some time.
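The two behaviours discussed in this thread can be illustrated with a rough sketch. This is not the pango or icu code: the function names, the `allow_stacked_vowels` flag and the rule "anything that is not a vowel sign starts a new cluster" are simplifying assumptions made for this example, and the split-vowel character classes mentioned in the quote are not modelled.

```python
import unicodedata

DOTTED_CIRCLE = 0x25CC  # placeholder base drawn under an orphaned mark


def is_vowel_sign(cp):
    # Assumption: treat any code point named "... VOWEL SIGN ..." as a dependent vowel.
    return "VOWEL SIGN" in unicodedata.name(chr(cp), "")


def clusters(code_points, allow_stacked_vowels):
    """Split code points into clusters under one of two toy policies.

    allow_stacked_vowels=True  : a second vowel sign joins the current cluster
                                 (the permissive behaviour described for pango).
    allow_stacked_vowels=False : a second vowel sign resets the sequence and is
                                 placed on a dotted circle (the strict behaviour
                                 described for icu).
    """
    out, current, seen_vowel = [], [], False
    for cp in code_points:
        if is_vowel_sign(cp):
            if current and (allow_stacked_vowels or not seen_vowel):
                current.append(cp)
            else:
                if current:
                    out.append(current)
                current = [DOTTED_CIRCLE, cp]  # orphaned vowel sign
            seen_vowel = True
        else:
            # Simplification: every other character starts a new cluster.
            if current:
                out.append(current)
            current, seen_vowel = [cp], False
    if current:
        out.append(current)
    return out


VAE = [0x0C28, 0x0C4B, 0x0C3E]
print(clusters(VAE, allow_stacked_vowels=True))
print(clusters(VAE, allow_stacked_vowels=False))
```

Under the strict policy the trailing 0x0C3E ends up in its own cluster on a dotted circle, which matches the symptom in the screenshot. Under the permissive policy all three code points stay in one cluster.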