Bug 873073

Summary: [si] Sinhala script rendering broken with harfbuzz 0.9.5 and 0.9.6
Product: [Fedora] Fedora
Reporter: Harshula Jayasuriya <harshula>
Component: harfbuzz
Assignee: Parag Nemade <pnemade>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: high
Version: 18
CC: gwync, i18n-bugs, kalevlember, mclasen, orion, petersen, pnemade, psatpute
Target Milestone: ---
Keywords: i18n, Patch, Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: harfbuzz-0.9.7-1.fc18
Doc Type: Bug Fix
Last Closed: 2012-12-11 06:15:17 UTC
Type: Bug
Bug Blocks: 752665    

Description Harshula Jayasuriya 2012-11-05 04:29:16 UTC
Fedora needs to package a recent SVN snapshot of GNU FreeFont so that Sinhala script renders correctly with HarfBuzz (GNOME 3.6 apparently uses HarfBuzz for i18n text layout). The LKLUG font will not work correctly with HarfBuzz.

Here's the reason why:
http://lists.freedesktop.org/archives/harfbuzz/2012-July/002261.html

There's been a lot of work on Sinhala glyphs/tables in FreeSerif:
svn.savannah.gnu.org/viewvc/trunk/freefont/sfd/FreeSerif.sfd?root=freefont&view=log

Comment 1 Gwyn Ciesla 2012-11-05 14:00:57 UTC
What SVN revision do you need?

Comment 2 Harshula Jayasuriya 2012-11-05 14:43:38 UTC
Hi Jon, 

http://svn.savannah.gnu.org/viewvc?view=rev&root=freefont&revision=2415 would be nice, but at least http://svn.savannah.gnu.org/viewvc?view=rev&root=freefont&revision=2411 .

There were a lot of changes recently. It might be best to ask the maintainer for his opinion too.

Comment 3 Gwyn Ciesla 2012-11-05 16:00:40 UTC
I'd rather just carry a small patch. Can you post a diff of the exact changes you need?

Comment 4 Harshula Jayasuriya 2012-11-21 01:46:44 UTC
Moving this bug from gnu-free-fonts to harfbuzz.

If the HarfBuzz version in Fedora 18 is updated to include the following patch, then the LKLUG font will work correctly, and so will the older FreeSerif font.

------------------------------------------------------
commit 43b653150081a2f9dc6b7481229ac4cd952575dc
Author: Behdad Esfahbod <behdad>
Date:   Fri Nov 16 13:12:35 2012 -0800

    [Indic] Another try to unbreak Sinhala split matras
    
    Just read the comments...

diff --git a/src/hb-ot-shape-complex-indic.cc b/src/hb-ot-shape-complex-indic.cc
index b185824..d924d1a 100644
--- a/src/hb-ot-shape-complex-indic.cc
+++ b/src/hb-ot-shape-complex-indic.cc
@@ -1317,15 +1317,42 @@ decompose_indic (const hb_ot_shape_normalize_context_t *c,
 #endif
   }
 
-  if (indic_options ().uniscribe_bug_compatible)
-  switch (ab)
+  if ((ab == 0x0DDA || hb_in_range<hb_codepoint_t> (ab, 0x0DDC, 0x0DDE)))
   {
-    /* These Sinhala ones have Unicode decompositions, but Uniscribe
-     * decomposes them "Khmer-style". */
-    case 0x0DDA  : *a = 0x0DD9; *b= 0x0DDA; return true;
-    case 0x0DDC  : *a = 0x0DD9; *b= 0x0DDC; return true;
-    case 0x0DDD  : *a = 0x0DD9; *b= 0x0DDD; return true;
-    case 0x0DDE  : *a = 0x0DD9; *b= 0x0DDE; return true;
+    /*
+     * Sinhala split matras...  Let the fun begin.
+     *
+     * These four characters have Unicode decompositions.  However, Uniscribe
+     * decomposes them "Khmer-style", that is, it uses the character itself to
+     * get the second half.  The first half of all four decompositions is always
+     * U+0DD9.
+     *
+     * Now, there are buggy fonts, namely, the widely used lklug.ttf, that are
+     * broken with Uniscribe.  But we need to support them.  As such, we only
+     * do the Uniscribe-style decomposition if the character is transformed into
+     * its "sec.half" form by the 'pstf' feature.  Otherwise, we fall back to
+     * Unicode decomposition.
+     *
+     * Note that we can't unconditionally use Unicode decomposition.  That would
+     * break some other fonts, that are designed to work with Uniscribe, and
+     * don't have positioning features for the Unicode-style decomposition.
+     *
+     * Argh...
+     */
+
+    const indic_shape_plan_t *indic_plan = (const indic_shape_plan_t *) c->plan->data;
+
+    hb_codepoint_t glyph;
+
+    if (indic_options ().uniscribe_bug_compatible ||
+       (c->font->get_glyph (ab, 0, &glyph) &&
+        indic_plan->pstf.would_substitute (&glyph, 1, true, c->font->face)))
+    {
+      /* Ok, safe to use Uniscribe-style decomposition. */
+      *a = 0x0DD9;
+      *b = ab;
+      return true;
+    }
   }
 
   return c->unicode->decompose (ab, a, b);
------------------------------------------------------
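
For readers who want the gist of the patch without the HarfBuzz internals, the decision it makes can be sketched as a small standalone C++ function. This is only an illustration under stated assumptions, not HarfBuzz code: `is_sinhala_split_matra`, `unicode_decompose`, `decompose_sinhala_split_matra` and the `pstf_would_substitute` flag are hypothetical names. In the real patch that flag corresponds to the result of `indic_plan->pstf.would_substitute (...)`, and the fallback is `c->unicode->decompose (ab, a, b)`. The fallback table below lists the one-step canonical decompositions of the four characters as given in the Unicode Character Database.

```cpp
#include <cassert>
#include <cstdint>

// True for the four Sinhala split matras that the patch special-cases.
inline bool is_sinhala_split_matra(uint32_t ab) {
  return ab == 0x0DDA || (ab >= 0x0DDC && ab <= 0x0DDE);
}

// Hypothetical stand-in for c->unicode->decompose(): one-step canonical
// decomposition, restricted to the four characters discussed here.
inline bool unicode_decompose(uint32_t ab, uint32_t *a, uint32_t *b) {
  switch (ab) {
    case 0x0DDA: *a = 0x0DD9; *b = 0x0DCA; return true;
    case 0x0DDC: *a = 0x0DD9; *b = 0x0DCF; return true;
    case 0x0DDD: *a = 0x0DDC; *b = 0x0DCA; return true;
    case 0x0DDE: *a = 0x0DD9; *b = 0x0DDF; return true;
    default:     return false;  // not handled by this sketch
  }
}

// Sketch of the patched decision. pstf_would_substitute is an assumed input
// meaning "the font's 'pstf' feature would turn this character's glyph into
// its second-half form".
inline bool decompose_sinhala_split_matra(uint32_t ab,
                                          bool uniscribe_bug_compatible,
                                          bool pstf_would_substitute,
                                          uint32_t *a, uint32_t *b) {
  assert(a != nullptr && b != nullptr);
  if (is_sinhala_split_matra(ab) &&
      (uniscribe_bug_compatible || pstf_would_substitute)) {
    // Uniscribe-style ("Khmer-style"): the character itself is the second half.
    *a = 0x0DD9;
    *b = ab;
    return true;
  }
  // Otherwise fall back to the Unicode decomposition.
  return unicode_decompose(ab, a, b);
}
```

Under this sketch, a font without a usable 'pstf' substitution (the patch comment names lklug.ttf as the buggy case) gets the Unicode pair, for example U+0DD9,U+0DCA for U+0DDA, while a font designed for Uniscribe gets U+0DD9 followed by the original character.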

Comment 5 Jens Petersen 2012-11-21 02:37:47 UTC
Proposing as an F18 NTH bug, since without this patch Sinhala script
rendering will be badly broken for Sri Lankan users.

Comment 6 Harshula Jayasuriya 2012-11-21 02:54:24 UTC
HarfBuzz 0.9.6 will need at least the following patches for Sinhala to render correctly both with fonts made for Pango/ICU and with fonts made for Uniscribe:

commit dde5506fd963e3cec27c3389bb1fc092f86d1e06
Author: Behdad Esfahbod <behdad>
Date:   Wed Nov 14 11:37:04 2012 -0800

    [Indic] Don't move virama with left matra
    
    This is important for the Sinhala U+0DDA split matra since it decomposes
    to U+0DD9,U+0DCA where U+0DD9 is a left matra and U+0DCA is the virama.
    We don't want to move the virama with the left matra.
    TEST: U+0D9A,U+0DDA
    
    Note that we were already doing this in the Uniscribe bug compatibility
    mode.  We now do it all the time.


commit eba312c8d1b2bbe8cb9b6414e843e78d2c521aa4
Author: Behdad Esfahbod <behdad>
Date:   Fri Nov 16 12:39:23 2012 -0800

    Plumbing to get shape plan and font into complex decompose function
    
    So we can handle Sinhala split matras smartly...  Coming soon.


commit 43b653150081a2f9dc6b7481229ac4cd952575dc
Author: Behdad Esfahbod <behdad>
Date:   Fri Nov 16 13:12:35 2012 -0800

    [Indic] Another try to unbreak Sinhala split matras
    
    Just read the comments...
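
To make the first commit's test case (U+0D9A,U+0DDA) concrete, here is a toy C++ model of the reordering difference. It is a deliberately simplified assumption, not HarfBuzz's actual reordering code: `reorder_left_matra` and its `move_virama_with_matra` flag are hypothetical, and the model only knows about the left matra U+0DD9 and the virama U+0DCA. The input is the syllable after Unicode decomposition, U+0D9A,U+0DD9,U+0DCA.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model: move the left matra U+0DD9 to the front of the syllable.
// If move_virama_with_matra is true, a virama (U+0DCA) directly after the
// matra is dragged along with it, which is the behaviour the commit removes.
inline std::vector<uint32_t> reorder_left_matra(
    const std::vector<uint32_t> &syllable, bool move_virama_with_matra) {
  const uint32_t kLeftMatra = 0x0DD9;
  const uint32_t kVirama = 0x0DCA;

  std::vector<uint32_t> front;  // characters moved before the base
  std::vector<uint32_t> rest;   // everything else, in original order
  for (std::size_t i = 0; i < syllable.size(); ++i) {
    if (syllable[i] == kLeftMatra) {
      front.push_back(syllable[i]);
      if (move_virama_with_matra && i + 1 < syllable.size() &&
          syllable[i + 1] == kVirama) {
        front.push_back(syllable[++i]);  // virama travels with the matra
      }
    } else {
      rest.push_back(syllable[i]);
    }
  }
  front.insert(front.end(), rest.begin(), rest.end());
  return front;
}
```

With the flag off, the model yields U+0DD9,U+0D9A,U+0DCA, so the virama stays with the consonant as the commit message asks. With the flag on, it yields U+0DD9,U+0DCA,U+0D9A.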

Comment 7 Matthias Clasen 2012-12-03 14:35:51 UTC
harfbuzz 0.9.7 is in updates-testing now

Comment 8 Parag Nemade 2012-12-03 15:06:28 UTC
I have already pushed it to stable, and it's now in the F18 stable repo.

Comment 9 Fedora Admin XMLRPC Client 2012-12-10 10:31:11 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.