Bug 185716 - [bn, bn_IN] Better handling for BENGALI LETTER A/E
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: harfbuzz
Version: rawhide
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Parag Nemade
Keywords: FutureFeature, i18n
Depends On:
Blocks:
 
Reported: 2006-03-16 21:26 EST by Jong Bae KO
Modified: 2013-06-03 04:11 EDT
CC List: 8 users

See Also:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-03 04:11:50 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
Image showing rendering of [U+0985] + [U+09C7] (63.98 KB, image/png)
2009-06-27 03:19 EDT, sandeep shedmake


External Trackers
Tracker        ID      Priority  Status  Summary  Last Updated
GNOME Desktop  118299  None      None    None     Never

Description Jong Bae KO 2006-03-16 21:26:49 EST
Description of problem:
I quote from Gnome bugzilla
#118299(http://bugzilla.gnome.org/show_bug.cgi?id=118299)

" Trying to keep track of the four different issues in 
bug 113551 was pretty much impossible for me, so splitting
up the comments into separate bug reports.

* unmadindu@Softhome.net (Sayamindu Dasgupta):

1. Yaphala
---------------
b. The sequence 0985 09CD 09AF 09BE (অ্যা) is not
rendered properly.

    I quote from the Unicode Indic FAQ.

    Q: What are the Bengali characters used to transcribe the sound "a" (as in
    English "bat") in Unicode?

    A: In Bengali, the sequence "zophola" (U+09CD U+09AF) + the "aa" matra
    (U+09BE) is used for transcribing the English "a" in "bat". This
    zophola_aa can be seen as a special "composite" matra to write a new
    Bengali sound, imported from English. Represent these sequences using a
    halant (virama):

        Vowel_A_zophola_AA = 0985 09CD 09AF 09BE ( a- halant ya -aa )
        Vowel_E_zophola_AA = 098F 09CD 09AF 09BE ( e- halant ya -aa )

    If you need to add a candrabindu or other combining mark in the sequence,
    represent the sequence as:

        Vowel_A_zophola_AA + candrabindu = 0985 09CD 09AF 09BE 0981
        ( a- halant ya -aa candrabindu )
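
As a minimal sketch, the sequences listed in the FAQ excerpt can be built and inspected in Python 3. The helper names (to_text, describe) are hypothetical and only serve this illustration; the code points themselves are the ones quoted above.

```python
import unicodedata

# Code point sequences as quoted from the Unicode Indic FAQ.
VOWEL_A_ZOPHOLA_AA = [0x0985, 0x09CD, 0x09AF, 0x09BE]
VOWEL_E_ZOPHOLA_AA = [0x098F, 0x09CD, 0x09AF, 0x09BE]
CANDRABINDU = 0x0981


def to_text(code_points):
    """Join a list of code points into a Python string (hypothetical helper)."""
    return "".join(chr(cp) for cp in code_points)


def describe(text):
    """Return 'U+XXXX NAME' for every character in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


a_zophola_aa = to_text(VOWEL_A_ZOPHOLA_AA)
e_zophola_aa = to_text(VOWEL_E_ZOPHOLA_AA)
# The candrabindu goes at the very end of the sequence.
a_zophola_aa_candrabindu = to_text(VOWEL_A_ZOPHOLA_AA + [CANDRABINDU])

for line in describe(a_zophola_aa_candrabindu):
    print(line)
```

Pasting a_zophola_aa into a text widget is a quick way to check how a given renderer handles the sequence.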

* Additional Comments From Taneem Ahmed 2003-06-01 03:13:

Also, a very quick hack (and a bit ugly) is to set U+985 to _ct from _iv, 
this will fix the 1b issue. I will also upload an image with the result. There 
is a small side effect, but I am sure everyone can live with that, instead 
of pango rendering it wrong. 

[ Image is http://bugzilla.gnome.org/showattachment.cgi?attach_id=17030,
  I don't know what the "small side effect" referred to above is  - OT ]

* Additional Comments From Owen Taylor 2003-06-01 04:42:

Two quick thoughts on 1b:

 Does the 'independent vowel + halant + ya + aa' combination
 work in Windows? The OT bengali specification strongly implies
 that uniscribe doesn't handle it.

 It should be pretty trivial to handle by adding an extra
 flag to scriptFlags and writing a special case for it
 in indic_ot_reorder().

* Additional Comments From Taneem Ahmed 2003-06-01 04:54:

I tried what you said; 1b does not get fixed without the _ct hack. Let me 
explain this problem. Take the following input: 
 
U+985 U+9CD U+9AF U+9BE 
 
The problem with this is that U+985 is an independent vowel, and right 
now this input will become three syllables, (U+985) (U+9CD) (U+9AF 
U+9BE). This is not right obviously. Even if we somehow treat it as one 
syllable, we end up setting the tag blwf_p to all of them. 
 
This is a very very special case for U+985 where it acts as a consonant 
instead of a vowel. If you want to deal with it properly then we will have 
to add quite a few checks for U+985 in the reorder code to add proper 
tags. But as indic-ot.c is used by all the indic scripts, I think it will be 
even a bigger hack, risk, and extra delay. As this is a pure Bengali 
issue, I thought it will be better to keep the hack limited to Bengali :) The 
only side effect for my hack is that U+985 can now take up other 
independent vowels, which may actually be considered as a feature :) 
And I don't have access to a windows box at home, don't know what 
windows does. Can someone else please check? 
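
The syllable split described in this comment can be illustrated with a toy segmenter. This is only a sketch under stated assumptions: the class labels echo the _iv and _ct names used in the thread, but the table and the split_syllables function are hypothetical and are not the logic in pango's indic-ot.c.

```python
# Toy character classes; labels echo the _iv / _ct names used in the thread.
IV, CT, VIRAMA, MATRA = "iv", "ct", "virama", "matra"

BASE_CLASSES = {
    0x0985: IV,      # BENGALI LETTER A
    0x09AF: CT,      # BENGALI LETTER YA
    0x09CD: VIRAMA,  # BENGALI SIGN VIRAMA (halant)
    0x09BE: MATRA,   # BENGALI VOWEL SIGN AA
}


def split_syllables(code_points, classes):
    """Greedy toy rule: CT (VIRAMA CT)* MATRA?, otherwise one code point alone."""
    syllables = []
    i = 0
    n = len(code_points)
    while i < n:
        start = i
        if classes.get(code_points[i]) == CT:
            i += 1
            # Absorb conjunct tails: a virama followed by another consonant.
            while (i + 1 < n
                   and classes.get(code_points[i]) == VIRAMA
                   and classes.get(code_points[i + 1]) == CT):
                i += 2
            # Absorb one trailing dependent vowel sign, if present.
            if i < n and classes.get(code_points[i]) == MATRA:
                i += 1
        else:
            # Anything that cannot start a cluster stands alone.
            i += 1
        syllables.append(code_points[start:i])
    return syllables


sequence = [0x0985, 0x09CD, 0x09AF, 0x09BE]
# U+0985 classified as an independent vowel: the input falls apart.
as_vowel = split_syllables(sequence, BASE_CLASSES)
# U+0985 reclassified as a consonant (the "_ct hack"): one syllable.
as_consonant = split_syllables(sequence, {**BASE_CLASSES, 0x0985: CT})
```

With the vowel classification the toy rule yields the three pieces mentioned above, (U+985) (U+9CD) (U+9AF U+9BE); with the consonant classification the whole input stays together. The same reclassification also lets U+985 absorb a following vowel sign, which is the side effect described in the comment.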

* Additional Comments From Owen Taylor 2003-06-01 10:49

It seems to me that the next step for 1b is to:

 - Find a uniscribe enabled copy of Microsoft windows
 - See if 'U+985 U+9CD U+9AF U+9BE' renders as desired
 - Try another sequence that would make sense for a 
   consonant, but doesn't make sense for U+985, 
   say 
       U+985 + halant + <normal consonant>
   and see how that renders.

Another approach would be simply to ask on the 
OpenType mailing list
(http://www.microsoft.com/typography/otspec/otlist.htm)
and ask for clarification of the relationship between
the Unicode Indic FAQ item and the Bengali OpenType spec.

* Additional Comments From Taneem Ahmed 2003-06-01 16:50

I just looked at the Bengali part of chapter 9 of Unicode 4.0. It clearly 
states what to do for 1b. I don't think we need to bring it up with 
OpenType mailing list, unless we want to know if they are planning to 
add some new feature in OT layout table. And IMHO if uniscribe does 
not render it properly then we need to let them know, not follow them :)


Comment #1 from Sayamindu Dasgupta (points: 8)
2003-07-25 15:35 UTC

On a related note, I think the Bengali letter E (098F) should also be
considered as a consonant. This is specified in the Indic FAQ, as well
as in Chapter 9 of the Unicode standard
(http://www.unicode.org/book/preview/ch09.pdf). 
Also, I am not very sure about this, but the sequence 09B0 09CD 098B
should be allowed to form a reph with the vowel 098B. This is required
for the Bengali word "Nairhit" and afaik, the latest beta of Uniscribe
forms a reph (We had some discussion with Paul Nelson of Microsoft
Typography on this - if you want I can forward the related emails to
you) - or do I file this as yet another bug? 


Comment #2 from Sayamindu Dasgupta (points: 8)
2004-02-24 18:36 UTC

Something I would like to point out here. 
The letter A acts as a consonant, *only* when it is followed by halant
+ ya. In other cases, it should act as a normal vowel. I have just
received a file where the user using a version of pango with the _ct
hack wrote Bengali letter AA as A + AA vowel sign. Visually the result
is the same, but can cause problems while searching and doing other
stuff. Example rendering at
http://www.peacefulaction.org/sayamindu/images/garbage.png
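
A minimal sketch of why this matters for searching, assuming Python 3 and its standard unicodedata module: the two spellings may look alike on screen, but they are different code point sequences, and NFC normalization does not unify them. The example word is only an illustration.

```python
import unicodedata

LETTER_AA = "\u0986"               # BENGALI LETTER AA
A_PLUS_AA_SIGN = "\u0985\u09be"    # BENGALI LETTER A + BENGALI VOWEL SIGN AA


def nfc(text):
    """Apply Unicode NFC normalization."""
    return unicodedata.normalize("NFC", text)


# The two spellings stay distinct even after normalization.
same_after_nfc = nfc(LETTER_AA) == nfc(A_PLUS_AA_SIGN)

# A word typed with the A + AA-sign spelling is not found when the
# search term uses the proper letter AA.
typed_with_hack = A_PLUS_AA_SIGN + "\u09ae"   # intended word: AA + MA
search_term = LETTER_AA + "\u09ae"
found = search_term in typed_with_hack

print(same_after_nfc, found)
```

Both values come out false, so text entered this way would need an explicit cleanup pass before it could be searched or sorted reliably.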

Recently I had the chance to play around with a Microsoft Windows XP
box, and it can't handle A + halant + ya, as Microsoft has not yet
released an official version of Uniscribe with Bengali support.


Comment #3 from Owen Taylor (pango developer, points: 25)
2004-02-24 19:30 UTC

So, is making the _ct change for A and E better than nothing or 
not? I can leave this bug open, but I want to know whether
I should make that change for 1.4.0.


Comment #4 from Sayamindu Dasgupta (points: 8)
2004-02-25 03:48 UTC

My proposal - make the changes. Microsoft is doing the same thing with
Uniscribe, and ditto with the QT people. However, we should try to
have a better way to do this in the next versions.


Comment #5 from Owen Taylor (pango developer, points: 25)
2004-02-27 19:43 UTC

Fri Feb 27 14:26:34 2004  Owen Taylor  <otaylor@redhat.com>
 
        * modules/indic/indic-ot-class-tables.c (bengCharClasses):
        Mark BENGALI LETTER A (U+0985) and BENGALI LETTER E (U+098F)
        as consonants which gives better behavior when they
        are combined with halant, though it isn't exactly right.
        (#118299, Sayamindu Dasgupta)

(Filed as ICU bug 3626 (http://www.jtcsv.com/cgibin/icu-bugs/))"





Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Activity log from the external bug reference

Who: otaylor@redhat.com  
When: 2003-07-24 20:21:08 UTC  
What: OtherBugsDependingOnThis  	 
Removed:
Added: 113551

Who: otaylor@redhat.com
When: 2003-07-25 15:12:26 UTC
What: Target Milestone
Removed: 
Added: 1.2.4

Who: otaylor@redhat.com
When: 2003-08-25 14:38:22 UTC
What: Target Milestone
Removed: 1.2.4
Added: 1.2.5

Who: otaylor@redhat.com
When: 2003-09-15 21:50:28 UTC 
What: AssignedTo
Removed: pango-maint@bugzilla.gnome.org
Added: pango-indic-maint@bugzilla.gnome.org

Who: otaylor@redhat.com
When: 2003-09-15 21:51:17 UTC
What: Component
Removed: general
Added: indic

Who: otaylor@redhat.com
When: 2004-02-23 19:47:05 UTC
What: Target Milestone
Removed: 1.2.5
Added: 1.4.0

Who: otaylor@redhat.com
When: 2004-02-27 19:43:39 UTC 
What: Summary(1.4.0)
Removed: Treat BENGALI LETTER A as consonan(1.4.0)
Added: Better handling for BENGALI LETTER A/E(future)

Who: otaylor@redhat.com
When: 2004-12-13 21:22:45 UTC
What: Target Milestone
Removed: future
Added: Small fix


http://bugzilla.gnome.org/show_activity.cgi?id=118299
Comment 1 Matthias Clasen 2006-06-20 01:14:39 EDT
Reassigning pango bugs to Behdad.
Comment 2 Akira TAGOH 2006-11-01 08:41:08 EST
Please confirm if this still happens on fc6.
Comment 3 A S Alam 2007-11-15 06:49:41 EST
I am not able to reproduce this in Rawhide
pango-1.19.0-1.fc9
Comment 4 Runa Bhattacharjee 2008-03-03 05:49:38 EST
Changing Language tag to [bn] as it affects both the locales.
Comment 5 Bug Zapper 2008-05-13 22:07:07 EDT
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 6 Tony Fu 2008-09-09 23:11:27 EDT
requested by Jens Petersen (#27995)
Comment 7 Bug Zapper 2009-06-09 18:07:18 EDT
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 8 Jens Petersen 2009-06-23 02:20:09 EDT
Pravin or Parag is this ok now?
Comment 9 Parag Nemade 2009-06-23 03:40:38 EDT
Runa,
 Can you please check whether this bug still exists in F11? I think this is a very old bug and I need some updated information on it.
Comment 10 Runa Bhattacharjee 2009-06-24 23:22:05 EDT
Passing on the needinfo request to Sayamindu
Comment 11 Sayamindu Dasgupta 2009-06-25 11:23:43 EDT
This has not been fixed yet. For example, the sequence: U+0985 BENGALI LETTER A + U+09C7 BENGALI VOWEL SIGN E should not combine, but in F11, it does. 

Actual Result:
অে

Expected Result:
U+0985 BENGALI LETTER A/DOTTEDCIRCLE/U+09C7 BENGALI VOWEL SIGN E
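
The expected result stated here can be sketched as a small check that places a dotted circle (U+25CC) before a dependent vowel sign that directly follows an independent vowel. This is only an illustration of the expectation in this comment, not what pango or harfbuzz actually does; the function name and the two code point sets are assumptions made for the example.

```python
DOTTED_CIRCLE = "\u25cc"

# Bengali independent vowels: U+0985..U+098C, U+098F, U+0990, U+0993, U+0994.
INDEPENDENT_VOWELS = set(range(0x0985, 0x098D)) | {0x098F, 0x0990, 0x0993, 0x0994}
# Bengali dependent vowel signs: U+09BE..U+09C4, U+09C7, U+09C8, U+09CB, U+09CC.
DEPENDENT_VOWEL_SIGNS = set(range(0x09BE, 0x09C5)) | {0x09C7, 0x09C8, 0x09CB, 0x09CC}


def mark_stray_vowel_signs(text):
    """Insert U+25CC before a vowel sign that directly follows an independent vowel."""
    out = []
    prev = None
    for ch in text:
        if (prev is not None
                and ord(prev) in INDEPENDENT_VOWELS
                and ord(ch) in DEPENDENT_VOWEL_SIGNS):
            out.append(DOTTED_CIRCLE)
        out.append(ch)
        prev = ch
    return "".join(out)


reported = "\u0985\u09c7"                 # BENGALI LETTER A + VOWEL SIGN E
expected = mark_stray_vowel_signs(reported)
# A + halant + ya + aa is left alone: the vowel sign follows YA, not A.
zophola = "\u0985\u09cd\u09af\u09be"
unchanged = mark_stray_vowel_signs(zophola)
```

The check looks only at adjacent pairs, so it leaves the A + halant + ya + aa sequence untouched while flagging the A + vowel sign E case reported above.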
Comment 12 Pravin Satpute 2009-06-26 00:04:45 EDT
changing version to f11
Comment 13 sandeep shedmake 2009-06-27 03:19:43 EDT
Created attachment 349635 [details]
Image showing rendering of [U+0985] + [U+09C7]

From Comment #11,

Is the attached image rendering correctly the required sequence:
U+0985 (BENGALI LETTER A) + U+09C7 (BENGALI VOWEL SIGN E) ?  

I made a change in pango-1.24.1-1.fc11.x86_64, to get the above required sequence.
Comment 14 Sayamindu Dasgupta 2009-06-27 03:55:10 EDT
(In reply to comment #13)
> Created an attachment (id=349635) [details]
> Image showing rendering of [U+0985] + [U+09C7]
> 
> From Comment #11,
> 
> Is the attached image rendering correctly the required sequence:
> U+0985 (BENGALI LETTER A) + U+09C7 (BENGALI VOWEL SIGN E) ?  
> 
> I made a change in pango-1.24.1-1.fc11.x86_64, to get the above required
> sequence.  

Yep - this is the correct rendering.
Comment 15 Pravin Satpute 2009-07-15 09:48:33 EDT
update upstream bugzilla with patches

http://bugzilla.gnome.org/show_bug.cgi?id=118299

Hopefully we will get rid of this very old bug (reported 2006-03-16), which IMO is a regression caused by some wrong changes in pango.
Comment 17 Bug Zapper 2010-03-15 07:50:05 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 18 Akira TAGOH 2011-05-24 08:23:20 EDT
Does this issue still persist in f15?
Comment 20 Pravin Satpute 2011-05-25 03:51:18 EDT
Pango Indic development has almost stopped; we will try to take care of this in harfbuzz-ng Indic development.
Comment 21 Fedora Admin XMLRPC Client 2012-01-10 10:43:53 EST
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 22 Fedora Admin XMLRPC Client 2013-05-14 08:12:42 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 23 Akira TAGOH 2013-06-03 02:04:25 EDT
Pravin, does this still persist in pango with harfbuzz?
Comment 24 Pravin Satpute 2013-06-03 04:11:50 EDT
The original problem for which the bug was opened is resolved.

Regarding the regression, i.e. that

U+0985 (অ) + U+09C7 (ে) should not combine:

this combination is now allowed in all scripts.

Even in Devanagari, U+0905 (अ) + U+093F (ि) now connects.

These combinations are specifically allowed because such sequences can occur in comics, where they are used for exaggeration.

Related thread on the harfbuzz list: http://lists.freedesktop.org/archives/harfbuzz/2013-February/002961.html

So, as far as I can tell, this is not a bug anymore. If anyone thinks otherwise, it should be debated on the harfbuzz mailing list.

A vowel + matra combination is required in such cases.

Fixed in harfbuzz-0.9.12-2.fc18.x86_64
