Bug 459698

Summary: [ml_IN] Obsolete Malayalam patches in libicu-3.8.1-7.fc9
Product: [Fedora] Fedora
Reporter: libregeek <libregeek>
Component: icu
Assignee: Caolan McNamara <caolanm>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: medium
Version: 9
CC: apeter, fedora, i18n-bugs, santhosh.thottingal
Keywords: i18n
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-11-07 14:25:14 UTC
Attachments:
  Description                              Flags
  Samvruthokaram patch for Malayalam       none
  Screenshot of Actual results             none
  Expected result                          none
  some testcases                           none
  after reverting patches                  none
  new screenshot                           none
  Screenshot                               none
  Screenshot with 2 patches disabled       none
  Screenshot with patch-5506 disabled      none

Description libregeek 2008-08-21 11:46:34 UTC
Created attachment 314693 [details]
Samvruthokaram patch for Malayalam

Description of problem:
Since most of the Malayalam rendering issues have been resolved upstream or at the font level, the following patches in the libicu package are no longer required:
icu.icu5418.malayam.patch
icu.icu5431.malayam.patch
icu.icu5506.multiplevowels.patch
icu.icuXXXX.malayalam.bysyllable.patch
The above patches cause the word കേന്ദ്രം to be rendered incorrectly. 

Note that the patch to fix the samvruthokaram issue is still required for icu-3.8.1. That patch is attached to this bug report.

Version-Release number of selected component (if applicable):
3.8.1-7.fc9.i386


How reproducible:
always

Steps to Reproduce:
1. Open an OpenOffice.org Writer document
2. Type the Malayalam word: കേന്ദ്രം
  
Actual results:
Refer to the screenshot attached in comment 1.

Expected results:
Refer to the screenshot attached in comment 2.

Additional info:
This issue is caused by the obsolete patches listed above.

Comment 1 libregeek 2008-08-21 11:49:22 UTC
Created attachment 314694 [details]
Screenshot of Actual results

Comment 2 libregeek 2008-08-21 11:49:59 UTC
Created attachment 314695 [details]
Expected result

Comment 3 Ani Peter 2008-08-21 12:16:42 UTC
Under Steps to Reproduce in the description, step #2 says to type the Malayalam word കേന്ദ്രം.

Below are the keys to type this word on the Inscript Malayalam keyboard. Hope this helps.

കേന്ദ്രം -> ksvdodjx 

Thanks
Ani
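
For reference, the key sequence above maps one-to-one onto the eight code points of the word. The following is only a minimal Python sketch: the `INSCRIPT_KEYS` table and the `type_inscript` helper are hypothetical names introduced here, and the table covers just the keys this word needs, inferred from the key sequence given above rather than from a full layout definition.

```python
import unicodedata

# Partial key -> character table (assumption: only the keys used by this
# word, as implied by the sequence "ksvdodjx" quoted in the comment).
INSCRIPT_KEYS = {
    "k": "\u0d15",  # MALAYALAM LETTER KA
    "s": "\u0d47",  # MALAYALAM VOWEL SIGN EE
    "v": "\u0d28",  # MALAYALAM LETTER NA
    "d": "\u0d4d",  # MALAYALAM SIGN VIRAMA
    "o": "\u0d26",  # MALAYALAM LETTER DA
    "j": "\u0d30",  # MALAYALAM LETTER RA
    "x": "\u0d02",  # MALAYALAM SIGN ANUSVARA
}


def type_inscript(keys):
    """Translate a run of keystrokes into the characters they produce."""
    return "".join(INSCRIPT_KEYS[key] for key in keys)


word = type_inscript("ksvdodjx")
# List each code point in typing order (logical order, not visual order).
for ch in word:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Note that the vowel sign is typed and stored after KA even though it is drawn to its left; that reordering is left to the layout engine.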

Comment 4 Caolan McNamara 2008-08-26 09:32:09 UTC
Created attachment 314972 [details]
some testcases

Comment 5 Caolan McNamara 2008-08-26 09:32:35 UTC
Created attachment 314973 [details]
after reverting patches

Comment 6 Caolan McNamara 2008-08-26 09:34:42 UTC
Nothing would give me more pleasure than dropping all these patches. On the other hand, doing so gives the output of #5 for KA VIRAMA RA and so on. I assume that the output of #4 is what we want, not that of #5, right?

Comment 7 Ani Peter 2008-08-26 09:42:33 UTC
Yes Caolan, the glyphs shown in the boxes of the attachment in Comment #4 are the correct ones.

Comment 8 libregeek 2008-08-26 10:08:17 UTC
Yes, Comment #4 is the expected result. In the case of Comment #5, I think it's a font issue. Please make sure that you are using one of the latest Malayalam fonts (Meera/Rachana).

Comment 9 Caolan McNamara 2008-08-26 10:53:46 UTC
Created attachment 314974 [details]
new screenshot

aha, I have long suspected that our Malayalam font is crud. That's pretty compelling evidence to just drop the lot and blame the font for anything remaining that doesn't work.

I still have some combinations though in those fonts that don't look quite right, e.g.

U+0D16 U+0D4D U+0D30  ഖ്ര
U+0D1A U+0D4D U+0D30  ച്ര
U+0D20 U+0D4D U+0D30  ഠ്ര
U+0D25 U+0D4D U+0D30  ഥ്ര
U+0D32 U+0D4D U+0D30  ല്ര

in comparison to these similar constructs...

U+0D15 U+0D4D U+0D30  ക്ര KA VIRAMA RA 
U+0D17 U+0D4D U+0D30  ഗ്ര GA VIRAMA RA 
U+0D18 U+0D4D U+0D30  ഘ്ര GHA VIRAMA RA 
U+0D1B U+0D4D U+0D30  ഛ്ര CHA VIRAMA RA 
U+0D1C U+0D4D U+0D30  ജ്ര JA VIRAMA RA 
U+0D21 U+0D4D U+0D30  ഡ്ര DDA VIRAMA RA 
U+0D22 U+0D4D U+0D30  ഢ്ര DDHA VIRAMA RA 

Are we happy with this screenshot or, if not, agreed that it's a font issue for Rachana/Meera?
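
The two groups of consonant + VIRAMA + RA sequences listed in this comment can be regenerated for pasting into a test document. This is a rough Python sketch, not the script used to produce the attached test cases; the names `SUSPECT`, `REFERENCE` and `conjunct_line` are made up for illustration.

```python
import unicodedata

VIRAMA = "\u0d4d"
RA = "\u0d30"

# Base consonants of the combinations reported as not looking quite right.
SUSPECT = [0x0D16, 0x0D1A, 0x0D20, 0x0D25, 0x0D32]
# Base consonants of the similar constructs used for comparison.
REFERENCE = [0x0D15, 0x0D17, 0x0D18, 0x0D1B, 0x0D1C, 0x0D21, 0x0D22]


def conjunct_line(cp):
    """Format one test line: code points, the text itself, a short label."""
    text = chr(cp) + VIRAMA + RA
    codes = " ".join(f"U+{ord(c):04X}" for c in text)
    short = unicodedata.name(chr(cp)).replace("MALAYALAM LETTER ", "")
    return f"{codes}  {text} {short} VIRAMA RA"


for cp in SUSPECT + REFERENCE:
    print(conjunct_line(cp))
```

Rendering the printed lines once per font under test makes it easier to tell font-specific problems apart from layout-engine ones.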

Comment 10 libregeek 2008-08-26 12:26:30 UTC
Created attachment 314980 [details]
Screenshot

Sorry for creating confusion. I missed some test scenarios. This is the screenshot if all the patches are enabled. The coloured rows are wrongly rendered.

Comment 11 libregeek 2008-08-26 12:55:41 UTC
Created attachment 314990 [details]
Screenshot with 2 patches disabled

I think that the patches have some conflicts, especially icu.icu5431.malayam.patch and icu.icu5506.multiplevowels.patch. I don't know the exact reason, but if you look at the screenshots it's evident.

Comment 12 Caolan McNamara 2008-08-26 13:08:18 UTC
ok, well I've long been of the opinion that we've ended up hacking the living hell out of poor pango to get "Lohit Malayalam" semi-working, and then butchering icu to match pango's/kde's behaviour, to the detriment of getting the "right solution" for the general case. So I'm very happy to reset to default icu behaviour when there are fairly capable alternative Malayalam fonts available, and any rendering problems can be directed towards the fonts themselves or to upstream icu and I'll happily backport any changes accepted there :-)

Comment 13 libregeek 2008-08-26 13:16:02 UTC
Created attachment 314996 [details]
Screenshot with patch-5506 disabled

This screenshot was taken after disabling only one patch, i.e. icu.icu5506.multiplevowels.patch. In this case the issues in Comment #4 and Comment #9 are solved, but the 3rd row in the screenshot is still wrong.

Comment 14 Fedora Update System 2008-08-26 13:40:31 UTC
icu-3.8.1-8.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/icu-3.8.1-8.fc9

Comment 15 Caolan McNamara 2008-08-29 13:41:22 UTC
Looking a bit closer, I reckon that all the various substitution tables in the various fonts hide some things about icu/pango etc. in comparison to uniscribe, and that basically the vanilla icu that this issue reverts to is the right way to go, *maybe* with the addition of the bysyllable patch to avoid gsubs across syllable boundaries.

It looks to me that the first major difference is that uniscribe, to me at least, seems to have some sort of extra post-gsub reordering rule. You can play around with the test-cases and fonts at (http://bugs.icu-project.org/trac/ticket/6517) or (http://bugzilla.gnome.org/show_bug.cgi?id=549818) to see if I'm smoking crack.

Comment 16 Tony Fu 2008-09-10 03:16:49 UTC
requested by Jens Petersen (#27995)

Comment 17 Fedora Update System 2008-09-10 06:44:59 UTC
icu-3.8.1-8.fc9 has been pushed to the Fedora 9 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update icu'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F9/FEDORA-2008-7655