Bug 459698 - [ml_IN] Obsolete Malayalam patches in libicu-3.8.1-7.fc9
Summary: [ml_IN] Obsolete Malayalam patches in libicu-3.8.1-7.fc9
Alias: None
Product: Fedora
Classification: Fedora
Component: icu
Version: 9
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Caolan McNamara
QA Contact: Fedora Extras Quality Assurance
Depends On:
Reported: 2008-08-21 11:46 UTC by Manilal
Modified: 2008-11-07 14:25 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-11-07 14:25:14 UTC
Type: ---

Attachments
Samvruthokaram patch for Malayalam (517 bytes, patch)
2008-08-21 11:46 UTC, Manilal
Screenshot of Actual results (2.48 KB, image/png)
2008-08-21 11:49 UTC, Manilal
Expected result (2.10 KB, image/png)
2008-08-21 11:49 UTC, Manilal
some testcases (7.19 KB, image/png)
2008-08-26 09:32 UTC, Caolan McNamara
after reverting patches (4.11 KB, image/png)
2008-08-26 09:32 UTC, Caolan McNamara
new screenshot (44.00 KB, image/png)
2008-08-26 10:53 UTC, Caolan McNamara
Screenshot (182.73 KB, image/png)
2008-08-26 12:26 UTC, Manilal
Screenshot with 2 patches disabled (164.82 KB, image/png)
2008-08-26 12:55 UTC, Manilal
Screenshot with patch-5506 disabled (164.61 KB, image/png)
2008-08-26 13:16 UTC, Manilal

Description Manilal 2008-08-21 11:46:34 UTC
Created attachment 314693 [details]
Samvruthokaram patch for Malayalam

Description of problem:
Since most of the Malayalam rendering issues have been resolved upstream or at the font level, the following patches in the libicu package are no longer required:
The above patches cause the word കേന്ദ്രം to be rendered incorrectly. 

Note that the patch to fix the samvruthokaram issue is still required for icu-3.8.1. That patch is attached to this bug report.

Version-Release number of selected component (if applicable): 3.8.1-7.fc9.i386

How reproducible:

Steps to Reproduce:
1. Open an OpenOffice.org Writer document
2. Type the Malayalam word: കേന്ദ്രം

Actual results:
Refer to the screenshot (attachment 314694, Comment 1).

Expected results:
Refer to the screenshot (attachment 314695, Comment 2).

Additional info:
This issue is caused by the obsolete patches listed above.

Comment 1 Manilal 2008-08-21 11:49:22 UTC
Created attachment 314694 [details]
Screenshot of Actual results

Comment 2 Manilal 2008-08-21 11:49:59 UTC
Created attachment 314695 [details]
Expected result

Comment 3 Ani Peter 2008-08-21 12:16:42 UTC
In the description section, under steps to reproduce, Step #2 says: Type the Malayalam word: കേന്ദ്രം.

Below are the keys to type this word on the Inscript Malayalam keyboard. Hope this helps.

കേന്ദ്രം -> ksvdodjx 
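
For anyone who cannot type Inscript, the code points behind the test word can be inspected directly. The following is only a small sketch using Python's standard unicodedata module; the helper name describe is made up for illustration and is not part of icu or any tool mentioned in this report.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text.

    'describe' is a hypothetical helper name used only for this sketch.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


if __name__ == "__main__":
    # The test word from Step #2 of the description.
    word = "കേന്ദ്രം"
    for code_point, name in describe(word):
        # Only ASCII is printed, so this works on any terminal encoding.
        print(code_point, name)
```

Pasting the printed sequence into a bug report avoids ambiguity when a viewer's own font or shaping engine renders the word differently.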


Comment 4 Caolan McNamara 2008-08-26 09:32:09 UTC
Created attachment 314972 [details]
some testcases

Comment 5 Caolan McNamara 2008-08-26 09:32:35 UTC
Created attachment 314973 [details]
after reverting patches

Comment 6 Caolan McNamara 2008-08-26 09:34:42 UTC
Nothing would give me more pleasure than dropping all these patches. On the other hand, doing so gives the output of #5 for KA VIRAMA RA and so on. I assume that the output of #4 is what we want, not that of #5, right?

Comment 7 Ani Peter 2008-08-26 09:42:33 UTC
Yes Caolan, the glyphs shown in the boxes of the attachment in Comment #4 are the correct ones.

Comment 8 Manilal 2008-08-26 10:08:17 UTC
Yes, Comment #4 is the expected result. In the case of Comment #5, I think it's a font issue. Please make sure that you are using one of the latest Malayalam fonts (Meera/Rachana).

Comment 9 Caolan McNamara 2008-08-26 10:53:46 UTC
Created attachment 314974 [details]
new screenshot

aha, I have long suspected that our Malayalam font is crud. That's pretty compelling evidence to just drop the lot and blame the font for anything remaining that doesn't work.

I still have some combinations though in those fonts that don't look quite right, e.g.

U+0D16 U+0D4D U+0D30  ഖ്ര
U+0D1A U+0D4D U+0D30  ച്ര
U+0D20 U+0D4D U+0D30  ഠ്ര
U+0D25 U+0D4D U+0D30  ഥ്ര
U+0D32 U+0D4D U+0D30  ല്ര

in comparison to these similar constructs...

U+0D15 U+0D4D U+0D30  ക്ര KA VIRAMA RA 
U+0D17 U+0D4D U+0D30  ഗ്ര GA VIRAMA RA 
U+0D18 U+0D4D U+0D30  ഘ്ര GHA VIRAMA RA 
U+0D1B U+0D4D U+0D30  ഛ്ര CHA VIRAMA RA 
U+0D1C U+0D4D U+0D30  ജ്ര JA VIRAMA RA 
U+0D21 U+0D4D U+0D30  ഡ്ര DDA VIRAMA RA 
U+0D22 U+0D4D U+0D30  ഢ്ര DDHA VIRAMA RA 

Are we happy with this screenshot or, if not, agreed that it's a font issue for Rachana/Meera?

Comment 10 Manilal 2008-08-26 12:26:30 UTC
Created attachment 314980 [details]
Screenshot

Sorry for creating confusion. I missed some test scenarios. This is the screenshot with all the patches enabled. The coloured rows are wrongly rendered.

Comment 11 Manilal 2008-08-26 12:55:41 UTC
Created attachment 314990 [details]
Screenshot with 2 patches disabled

I think that the patches conflict with each other, especially icu.icu5431.malayam.patch and icu.icu5506.multiplevowels.patch. I don't know the exact reason, but it's evident if you look at the screenshots.

Comment 12 Caolan McNamara 2008-08-26 13:08:18 UTC
ok, well I've long been of the opinion that we've ended up hacking the living hell out of poor pango to get "Lohit Malayalam" semi-working, and then butchering icu to match pango's/kde's behaviour, to the detriment of getting the "right solution" for the general case. So I'm very happy to reset to default icu behaviour when there are fairly capable alternative Malayalam fonts available, and any rendering problems can be directed towards the fonts themselves or to upstream icu and I'll happily backport any changes accepted there :-)

Comment 13 Manilal 2008-08-26 13:16:02 UTC
Created attachment 314996 [details]
Screenshot with patch-5506 disabled

This screenshot was taken after disabling only one patch, i.e. icu.icu5506.multiplevowels.patch. In this case the issues in Comment #4 and Comment #9 are solved, but the 3rd row in the screenshot is still wrong.

Comment 14 Fedora Update System 2008-08-26 13:40:31 UTC
icu-3.8.1-8.fc9 has been submitted as an update for Fedora 9.

Comment 15 Caolan McNamara 2008-08-29 13:41:22 UTC
Looking a bit closer, I reckon that all the various substitution tables in the various fonts hide some things about icu/pango etc. in comparison to uniscribe, and that basically the vanilla icu that this issue reverts to is the right way to go, *maybe* with the addition of the bysyllable patch to avoid gsubs across syllable boundaries.

It looks to me that the first major difference is that uniscribe, to me at least, seems to have some sort of extra post-gsub reordering rule. You can play around with the test-cases and fonts at (http://bugs.icu-project.org/trac/ticket/6517) or (http://bugzilla.gnome.org/show_bug.cgi?id=549818) to see if I'm smoking crack.

Comment 16 Tony Fu 2008-09-10 03:16:49 UTC
requested by Jens Petersen (#27995)

Comment 17 Fedora Update System 2008-09-10 06:44:59 UTC
icu-3.8.1-8.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update icu'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-7655
