Bug 437761

Summary: [as_IN][bn_IN][ta_IN][ml_IN][crash] app crash in out of range worst-case expansion on entering combination
Product: [Fedora] Fedora Reporter: A S Alam <aalam>
Component: icu Assignee: Caolan McNamara <caolanm>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low Docs Contact:
Priority: low    
Version: rawhide CC: aphukan, eng-i18n-bugs, fedora, jnavrati, mshao, psatpute, runab
Target Milestone: --- Keywords: i18n
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 3.8-6.fc8 Doc Type: Bug Fix
Last Closed: 2008-04-22 22:36:46 UTC Type: ---
Attachments:
Description Flags
Crash report none

Description A S Alam 2008-03-17 10:10:55 UTC
Description of problem:
The application crashes when either of the following combinations is entered:

as_IN: 09F0+09CB+0981
bn_IN: 09B0+09CB+0981

Version-Release number of selected component (if applicable):
openoffice.org-core-2.4.0-10.1.fc9.i386

How reproducible:
100%

Steps to Reproduce:
1. run application
2. type combination with as_IN (Phonetic)
3. r+] + >
  
Actual results:
Application Crashed

Expected results:
The application should not crash.

Additional info:
Crash report

Comment 1 A S Alam 2008-03-17 10:10:55 UTC
Created attachment 298246 [details]
Crash report

Comment 2 Amitakhya Phukan 2008-03-17 10:16:56 UTC
Hi!

This also happens on a recently updated Fedora 8 system.

Version of OpenOffice used : openoffice.org-core-2.3.0-6.11.fc8

Additional info : It seems only the characters 09F0 and 09B0 are affected.
Others are rendered properly in the combination mentioned above.

Comment 3 Amitakhya Phukan 2008-03-17 10:21:00 UTC
Sorry, it seems it is happening with other characters also. Just now I tested
with 09F1, 09A6 and 09AA. It did crash with 09F1 and 09A6.

Comment 4 Amitakhya Phukan 2008-03-17 10:38:02 UTC
hi!

crash also occurs with the following combination types :

1. 09F0+09CB+0983
2. 09B0+09CB+0983
3. 09F0+09CB+0982
4. 09B0+09CB+0982

Seems like 0981, 0982 and 0983 are affected when combined with 09CB.

Comment 5 Caolan McNamara 2008-03-17 19:16:52 UTC
Patched, and now logged at http://bugs.icu-project.org/trac/ticket/6213

icu built (building) for rawhide as icu-3.8.1-6.fc9

Comment 6 Pravin Satpute 2008-03-18 09:26:51 UTC
The same bug applies to Tamil (ta_IN) and Malayalam (ml_IN).
ta_IN 
1. 0b95 0bcc 0b82
2. 0b95 0bcb 0b82
3. 0b95 0bca 0b82

ml_IN
0d15 0d4a 0d02

Actually, generally speaking, the problem occurs when we input
any "consonant + split_vowel_matra + vowel_modifier", so the total number of levels becomes 4.

Comment 7 Caolan McNamara 2008-03-18 09:42:19 UTC
This is where it stands right now after my original fix. The third element of
each table (the values in the 2-4 range below) is the maximum worst-case
expansion. Tamil and Malayalam, for example, are unchanged by my original fix,
so I'm going to reset this to ASSIGNED for a bit to investigate whether the
Tamil and Malayalam values need to be increased as well. It would be good if
you could check the other scripts below and see whether any of them have cases,
not already covered here, that may need an increased expansion.

static const IndicClassTable devaClassTable = {0x0900, 0x0970, 2,
DEVA_SCRIPT_FLAGS, devaCharClasses, NULL};

static const IndicClassTable bengClassTable = {0x0980, 0x09FA, 4,
BENG_SCRIPT_FLAGS, bengCharClasses, bengSplitTable};

static const IndicClassTable punjClassTable = {0x0A00, 0x0A74, 2,
PUNJ_SCRIPT_FLAGS, punjCharClasses, NULL};

static const IndicClassTable gujrClassTable = {0x0A80, 0x0AEF, 2,
GUJR_SCRIPT_FLAGS, gujrCharClasses, NULL};

static const IndicClassTable oryaClassTable = {0x0B00, 0x0B71, 3,
ORYA_SCRIPT_FLAGS, oryaCharClasses, oryaSplitTable};

static const IndicClassTable tamlClassTable = {0x0B80, 0x0BF2, 3,
TAML_SCRIPT_FLAGS, tamlCharClasses, tamlSplitTable};

static const IndicClassTable teluClassTable = {0x0C00, 0x0C6F, 3,
TELU_SCRIPT_FLAGS, teluCharClasses, teluSplitTable};

static const IndicClassTable kndaClassTable = {0x0C80, 0x0CEF, 4,
KNDA_SCRIPT_FLAGS, kndaCharClasses, kndaSplitTable};

static const IndicClassTable mlymClassTable = {0x0D00, 0x0D6F, 3,
MLYM_SCRIPT_FLAGS, mlymCharClasses, mlymSplitTable};

static const IndicClassTable sinhClassTable = {0x0D80, 0x0DF4, 4,
SINH_SCRIPT_FLAGS, sinhCharClasses, sinhSplitTable};


Comment 8 Amitakhya Phukan 2008-03-18 09:47:58 UTC
pravin,
for assamese (and hopefully bengali also), it happens only with
consonant+09CB+{0981,0982,0983}. i have tested it on my Fedora 8 box that other
vowel_matra and {0981,0982,0983} combination doesn't break. don't know if this
helps caolan in his work, but i just wanted to mention what i found.

Comment 9 Amitakhya Phukan 2008-03-18 09:48:58 UTC
caolan, i have the diff file you have mentioned earlier..how do i apply the
patch and test it on my rawhide box ? can you generate the same diff for Fedora
8 also ?

Comment 10 Runa Bhattacharjee 2008-03-18 10:13:05 UTC
(In reply to comment #8)
> pravin,
> for assamese (and hopefully bengali also), it happens only with
> consonant+09CB+{0981,0982,0983}. 

Just tested for Bengali, consonant+09CC+{0981,0982,0983} breaks as well.

hth

Comment 11 Amitakhya Phukan 2008-03-18 10:17:25 UTC
same for assamese also. so the problem exists with consonant + {09CB,09CC} +
{0981,0982,0983}.

Comment 12 Caolan McNamara 2008-03-18 10:27:03 UTC
So the expansion of bengali/assamese, tamil and malayalam has been modified from
3 to 4, and http://bugs.icu-project.org/trac/ticket/6213 has been updated with
those expansions.

So those scripts are known and hopefully out of the way. The open question is
whether the other remaining "3" ones should also be 4, e.g. Oriya and Telugu.
Are there known similar constructs that have a parallel in those languages?

Comment 13 Caolan McNamara 2008-03-18 11:20:12 UTC
http://koji.fedoraproject.org/packages/icu/3.8.1/7.fc9/ for rawhide and
http://koji.fedoraproject.org/packages/icu/3.8/6.fc8/ for F-8 to ease testing

Comment 14 Amitakhya Phukan 2008-03-18 11:28:29 UTC
just tested in my Fedora 8. no crash as of now.

Comment 15 Pravin Satpute 2008-03-18 12:04:07 UTC
I have tested the other scripts where this is applicable.
It could apply to Tamil, Oriya, and Malayalam, but it crashes only for Tamil and
Malayalam in rawhide, as I said.
For Oriya it works fine.

Comment 16 sandeep shedmake 2008-03-18 13:42:16 UTC
Wrt comment #6

For Rawhide using icu-3.8.1-7.fc9.i386.rpm  

ta_IN 
1. 0b95 0bcc 0b82 (OO doesn't break for this combination)
2. 0b95 0bcb 0b82 (OO breaks)
3. 0b95 0bca 0b82 (OO breaks)

ml_IN
0d15 0d4a 0d02 (OO breaks)



Comment 17 Caolan McNamara 2008-03-18 13:50:02 UTC
Hmm, wrt comment #16, are you sure you have the fixed libs ?

What is the output of 
> rpm -qf /usr/lib/libicule.so.38.1 
i.e. it should be...
libicu-3.8.1-7.fc9.i386
which passes the above test for me without crash and burn

Comment 18 sandeep shedmake 2008-03-19 03:23:28 UTC
done...no crash (libs fixed)

Comment 19 Fedora Update System 2008-03-19 07:35:05 UTC
icu-3.8-6.fc8 has been submitted as an update for Fedora 8

Comment 20 Fedora Update System 2008-03-21 22:17:30 UTC
icu-3.8-6.fc8 has been pushed to the Fedora 8 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update icu'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F8/FEDORA-2008-2639

Comment 21 Fedora Update System 2008-04-22 22:36:44 UTC
icu-3.8-6.fc8 has been pushed to the Fedora 8 stable repository.  If problems still persist, please make note of it in this bug report.