Bug 205252 - [te_IN] icu treats two vowel signs that follow each other differently than pango does
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: icu
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Caolan McNamara
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-09-05 15:52 UTC by A S Alam
Modified: 2013-07-03 00:39 UTC (History)

Fixed In Version: icu-3.6-2
Clone Of:
Environment:
Last Closed: 2006-09-07 08:57:56 UTC
Type: ---
Embargoed:


Attachments
Screen-shot for Rendering issue (62.64 KB, image/png)
2006-09-05 15:52 UTC, A S Alam
make icu behave like pango for two vowel signs in a row (959 bytes, patch)
2006-09-06 10:15 UTC, Caolan McNamara
and this patch makes pango behave like icu (615 bytes, patch)
2006-09-06 10:16 UTC, Caolan McNamara

Description A S Alam 2006-09-05 15:52:40 UTC
Description of problem:
Rendering is incorrect when typing the following key sequence with SCIM using the Telugu-Inscript layout:
---
bfvaeldkccwv
---


Version-Release number of selected component (if applicable):
openoffice.org-core-2.0.4-2.2

How reproducible:
Every time during input

Steps to Reproduce:
1. Open oowriter (preferably in the te_IN locale)
2. Activate SCIM and select Telugu-Inscript
3. Type the keys bfvaeldkccwv
  
Actual results:
Rendering is incorrect (a dotted circle is displayed)

Expected results:
Rendering as in gedit

Additional info:
1) A screenshot comparing gedit, OpenOffice.org and kedit is attached
2) This is a case of input with SCIM

Comment 1 A S Alam 2006-09-05 15:52:40 UTC
Created attachment 135576 [details]
Screen-shot for Rendering issue

Comment 2 Caolan McNamara 2006-09-05 16:10:44 UTC
It's always worth cutting and pasting the characters from gedit into OOo to
quickly see whether it's a combining-character rendering difference or an
input mechanism difference.

Comment 3 Caolan McNamara 2006-09-05 16:40:32 UTC
So the problem appears to be the rendering of "TELUGU VOWEL SIGN AA" (0x0C3E),
which is generated by the 'e' key. I'd guess that the previous character is not
a valid base for such a vowel sign, with pango deciding not to render it, while
OOo (through icu?) and kde decide to draw the vowel sign uncombined with anything.

Comment 4 Caolan McNamara 2006-09-06 10:15:52 UTC
Created attachment 135641 [details]
make icu behave like pango for two vowel signs in a row

This patch makes icu behave like pango

Comment 5 Caolan McNamara 2006-09-06 10:16:51 UTC
Created attachment 135642 [details]
and this patch makes pango behave like icu

and this makes pango behave like icu

Comment 6 Caolan McNamara 2006-09-06 10:20:44 UTC
caolanm->besfahbo: Do you know which of the above two behaviours is correct
for a sequence such as the one generated from "vae", i.e. 0x0C28 0x0C4B 0x0C3E?
Is it the icu one, where we reset the sequence and don't allow two such vowels
to be part of the same sequence, or the pango one, where we do?
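
For reference, the three code points quoted above can be inspected with
Python's standard unicodedata module. This is only an illustrative sketch and
not part of either patch; the helper name describe() is made up here.

```python
import unicodedata

# The three code points quoted above for the keys "v", "a", "e".
SEQUENCE = [0x0C28, 0x0C4B, 0x0C3E]


def describe(code_points):
    """Return (hex label, Unicode name, general category) per code point."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in code_points
    ]


if __name__ == "__main__":
    for row in describe(SEQUENCE):
        print(*row)
```

The listing should show one base letter followed by two dependent vowel signs,
which is exactly the case the two patches disagree on.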

Comment 7 Caolan McNamara 2006-09-06 10:43:38 UTC
Looks deliberate in pango,

http://cvs.gnome.org/viewcvs/pango/modules/indic/indic-ot-class-tables.c#rev1.7
http://bugzilla.gnome.org/show_bug.cgi?id=121882#c10

so I'll apply the patch to icu to make it behave like pango

Comment 8 Caolan McNamara 2006-09-06 10:54:51 UTC
logged upstream as http://bugs.icu-project.org/cgi-bin/icu-bugs/incoming?findid=5365

Comment 9 Caolan McNamara 2006-09-15 08:58:19 UTC
<quote author="Eric Mader <emader(at)icu...">

I've just looked at this bug and the associated Bugzilla bug 205252. It 
looks to me that the reason for the difference is that the Pango fix for 
Bugzilla bug 121882 is too general: it allows arbitrary combinations of 
dependent vowels. I fixed the ICU bug corresponding to 121882 (JB#4206) 
by assigning distinct character classes to the pieces of split vowels so 
that only the pieces of split vowels, in the correct order, are allowed 
to combine.

I think that the example given in 205252 is not a correct sequence, and 
is only accepted as one in Pango because the fix to 121882 is too general.

</quote>
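
The difference described in the quote can be sketched roughly as follows. This
is a toy model only: the class names, the code point ranges and the
split_clusters() helper are all hypothetical and are not taken from the icu or
pango class tables. It assumes the Telugu pair U+0C46 followed by U+0C56 (the
canonical decomposition of U+0C48 TELUGU VOWEL SIGN AI) as the split-vowel
pieces. The "permissive" mode lets any dependent vowel follow another, and the
"strict" mode only lets the second piece of a split vowel follow the first.

```python
# Hypothetical character classes; NOT the real icu or pango class tables.
CONSONANT = "consonant"
VOWEL_SIGN = "vowel_sign"      # ordinary dependent vowel
SPLIT_FIRST = "split_first"    # first piece of a split vowel
SPLIT_SECOND = "split_second"  # second piece of a split vowel
OTHER = "other"

VOWELISH = (VOWEL_SIGN, SPLIT_FIRST, SPLIT_SECOND)


def classify(ch):
    """Assign a toy class to a Telugu character (ranges are approximate)."""
    cp = ord(ch)
    if cp == 0x0C46:
        return SPLIT_FIRST
    if cp == 0x0C56:
        return SPLIT_SECOND
    if 0x0C15 <= cp <= 0x0C39:
        return CONSONANT
    if 0x0C3E <= cp <= 0x0C4C:
        return VOWEL_SIGN
    return OTHER


def can_extend(prev, cls, strict):
    """Decide whether a character of class cls may join the current cluster."""
    if cls not in VOWELISH:
        return False
    if prev == CONSONANT:
        return True
    if strict:
        # Only split-vowel pieces, in the correct order, may combine.
        return prev == SPLIT_FIRST and cls == SPLIT_SECOND
    # Permissive: any dependent vowel may follow another one.
    return prev in VOWELISH


def split_clusters(text, strict):
    """Split text into clusters under the strict or permissive rule."""
    clusters = []
    prev = None
    for ch in text:
        cls = classify(ch)
        if clusters and can_extend(prev, cls, strict):
            clusters[-1] += ch
        else:
            clusters.append(ch)
        prev = cls
    return clusters
```

Under these assumptions, the "vae" sequence 0x0C28 0x0C4B 0x0C3E stays one
cluster in permissive mode, while strict mode starts a new cluster at 0x0C3E.
A cluster that begins with a vowel sign has no base, which would be consistent
with the dotted circle seen in the screenshot.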

Comment 10 Behdad Esfahbod 2006-09-15 16:15:43 UTC
That analysis is probably correct.  I compared Pango's and ICU's Indic shapers
yesterday.  They have diverged in a few places.  I'm going to resync them,
but that will take some time.

