Bug 185716 - [bn, bn_IN] Better handling for BENGALI LETTER A/E
Summary: [bn, bn_IN] Better handling for BENGALI LETTER A/E
Alias: None
Product: Fedora
Classification: Fedora
Component: harfbuzz
Version: rawhide
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Parag Nemade
QA Contact:
Depends On:
Reported: 2006-03-17 02:26 UTC by Jong Bae KO
Modified: 2013-06-03 08:11 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Last Closed: 2013-06-03 08:11:50 UTC
Type: ---

Attachments
Image showing rendering of [U+0985] + [U+09C7] (63.98 KB, image/png)
2009-06-27 07:19 UTC, sandeep shedmake

System: GNOME Bugzilla
ID: 118299
Private: 0
Priority: Normal
Status: RESOLVED
Summary: Better handling for BENGALI LETTER A/E
Last Updated: 2021-01-27 07:04:04 UTC

Description Jong Bae KO 2006-03-17 02:26:49 UTC
Description of problem:
I quote from Gnome bugzilla

" Trying to keep track of the four different issues in 
bug 113551 was pretty much impossible for me, so splitting
up the comments into separate bug reports.

* unmadindu (Sayamindu Dasgupta):

1. Yaphala
b. The sequence 0985 09CD 09AF 09BE (অ্যা) is not
rendered properly.

    I quote from the Unicode Indic FAQ.

    Q: What are the Bengali characters used to transcribe the sound "a" (as in
    English "bat") in Unicode?

    A: In Bengali, the sequence "zophola" (U+09CD U+09AF) + the "aa" matra
    (U+09BE) is used for transcribing the English "a" in "bat". This
    zophola_aa can be seen as a special "composite" matra to write a new
    Bengali sound, imported from English. Represent these sequences using a
    halant (virama):

        Vowel_A_zophola_AA = 0985 09CD 09AF 09BE ( a- halant ya -aa )
        Vowel_E_zophola_AA = 098F 09CD 09AF 09BE ( e- halant ya -aa )

    If you need to add a candrabindu or other combining mark in the sequence,
    represent the sequence as:

        Vowel_A_zophola_AA + candrabindu = 0985 09CD 09AF 09BE 0981
        ( a- halant ya -aa candrabindu )
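
For anyone who wants to reproduce the test strings, the sequences quoted from the FAQ can be assembled from their code points. This is only a minimal sketch in Python; the names `SEQUENCES`, `build` and `describe` are made up for illustration and do not come from pango or the FAQ.

```python
import unicodedata

# Code point sequences exactly as listed in the quoted FAQ text.
SEQUENCES = {
    "Vowel_A_zophola_AA": [0x0985, 0x09CD, 0x09AF, 0x09BE],
    "Vowel_E_zophola_AA": [0x098F, 0x09CD, 0x09AF, 0x09BE],
    "Vowel_A_zophola_AA + candrabindu": [0x0985, 0x09CD, 0x09AF, 0x09BE, 0x0981],
}


def build(code_points):
    """Join a list of integer code points into a test string."""
    return "".join(chr(cp) for cp in code_points)


def describe(text):
    """Return 'U+XXXX NAME' for each character, using Python's Unicode data."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


if __name__ == "__main__":
    for label, cps in SEQUENCES.items():
        text = build(cps)
        print(label, "=", text)
        for line in describe(text):
            print("   ", line)
```

Pasting the printed strings into a text widget is one way to check how a given pango or harfbuzz build renders them.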

* Additional Comments From Taneem Ahmed 2003-06-01 03:13:

Also, a very quick hack (and a bit ugly) is to set U+985 to _ct from _iv, 
this will fix the 1b issue. I will also upload an image with the result. There 
is a small side effect, but I am sure everyone can live with that, instead 
of pango rendering it wrong. 

[ Image is http://bugzilla.gnome.org/showattachment.cgi?attach_id=17030,
  I don't know what the "small side effect" referred to above is  - OT ]

* Additional Comments From Owen Taylor 2003-06-01 04:42:

Two quick thoughts on 1b:

 Does the 'independent vowel + halant + ya + aa' combination
 work in Windows? The OT bengali specification strongly implies
 that uniscribe doesn't handle it.

 It should be pretty trivial to handle by adding an extra
 flag to scriptFlags and writing a special case for it
 in indic_ot_reorder().

* Additional Comments From Taneem Ahmed 2003-06-01 04:54:

I tried what you said; 1b does not get fixed without the _ct hack. Let me 
explain this problem. Take the following input: 
U+985 U+9CD U+9AF U+9BE 
The problem with this is that U+985 is an independent vowel, and right 
now this input will become three syllables, (U+985) (U+9CD) (U+9AF 
U+9BE). This is not right obviously. Even if we somehow treat it as one 
syllable, we end up setting the tag blwf_p to all of them. 
This is a very very special case for U+985 where it acts as a consonant 
instead of a vowel. If you want to deal with it properly then we will have 
to add quite a few checks for U+985 in the reorder code to add proper 
tags. But as indic-ot.c is used by all the indic scripts, I think it will be 
even a bigger hack, risk, and extra delay. As this is a pure Bengali 
issue, I thought it will be better to keep the hack limited to Bengali :) The 
only side effect for my hack is that U+985 can now take up other 
independent vowels, which may actually be considered as a feature :) 
And I don't have access to a windows box at home, don't know what 
windows does. Can someone else please check? 
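
The syllable splitting described above can be illustrated with a toy model. This is not the code in indic-ot.c; it is a simplified sketch in Python that assumes just four character classes and one cluster rule (consonant, then any number of halant + consonant pairs, then an optional dependent vowel). The class names and `split_clusters` are hypothetical.

```python
# Toy character classes, loosely named after the _iv / _ct tags mentioned above.
IV, CT, VR, DV = "iv", "ct", "virama", "dv"

# Classification before the hack: U+0985 is an independent vowel.
BASE_CLASSES = {0x0985: IV, 0x09AF: CT, 0x09CD: VR, 0x09BE: DV}

# Classification with the hack: U+0985 is treated as a consonant.
HACK_CLASSES = dict(BASE_CLASSES)
HACK_CLASSES[0x0985] = CT


def split_clusters(code_points, classes):
    """Split code points into clusters using the simplified rule
    CT (VR CT)* [DV]; anything else becomes a cluster of its own."""
    clusters = []
    i, n = 0, len(code_points)
    while i < n:
        start = i
        if classes.get(code_points[i]) == CT:
            i += 1
            # Absorb each following halant + consonant pair.
            while (i + 1 < n
                   and classes.get(code_points[i]) == VR
                   and classes.get(code_points[i + 1]) == CT):
                i += 2
            # Absorb one trailing dependent vowel, if present.
            if i < n and classes.get(code_points[i]) == DV:
                i += 1
        else:
            i += 1
        clusters.append(code_points[start:i])
    return clusters


def as_hex(clusters):
    return [[f"{cp:04X}" for cp in cluster] for cluster in clusters]


if __name__ == "__main__":
    seq = [0x0985, 0x09CD, 0x09AF, 0x09BE]
    print(as_hex(split_clusters(seq, BASE_CLASSES)))
    # [['0985'], ['09CD'], ['09AF', '09BE']]
    print(as_hex(split_clusters(seq, HACK_CLASSES)))
    # [['0985', '09CD', '09AF', '09BE']]
```

In this toy model the hack also makes U+0985 followed directly by a dependent vowel sign such as U+09BE form a single cluster, which corresponds to the kind of side effect discussed in this thread.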

* Additional Comments From Owen Taylor 2003-06-01 10:49

It seems to me that the next step for 1b is to:

 - Find a uniscribe enabled copy of Microsoft windows
 - See if 'U+985 U+9CD U+9AF U+9BE' renders as desired
 - Try another sequence that would make sense for a 
   consonant, but doesn't make sense for U+985, 
       U+985 + halant + <normal consonant>
   and see how that renders.

Another approach would be simply to ask on the 
OpenType mailing list for clarification of the relationship
between the Unicode Indic FAQ item and the Bengali OpenType spec.

* Additional Comments From Taneem Ahmed 2003-06-01 16:50

I just looked at the Bengali part of chapter 9 of Unicode 4.0. It clearly 
states what to do for 1b. I don't think we need to bring it up with 
OpenType mailing list, unless we want to know if they are planning to 
add some new feature in OT layout table. And IMHO if uniscribe does 
not render it properly then we need to let them know, not follow them :)

Comment #1 from Sayamindu Dasgupta (points: 8)
2003-07-25 15:35 UTC [reply]

On a related note, I think the Bengali letter E (098F) should also be
considered as a consonant. This is specified in the Indic FAQ, as well
as in Chapter 9 of the Unicode standard.
Also, I am not very sure about this, but the sequence 09B0 09CD 098B
should be allowed to form a reph with the vowel 098B. This is required
for the Bengali word "Nairhit" and afaik, the latest beta of Uniscribe
forms a reph (We had some discussion with Paul Nelson of Microsoft
Typography on this - if you want I can forward the related emails to
you) - or do I file this as yet another bug? 

Comment #2 from Sayamindu Dasgupta (points: 8)
2004-02-24 18:36 UTC [reply]

Something I would like to point out here. 
The letter A acts as a consonant, *only* when it is followed by halant
+ ya. In other cases, it should act as a normal vowel. I have just
received a file where the user using a version of pango with the _ct
hack wrote Bengali letter AA as A + AA vowel sign. Visually the result
is the same, but can cause problems while searching and doing other
stuff. Example rendering at

Recently I had the chance to play around with a Microsoft Windows XP
box - and they can't handle a halant ya - as Microsoft has not
released official Bengali supporting version of Uniscribe yet.
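
The searching problem mentioned above can be shown with a small check. This sketch assumes that the Unicode data shipped with Python reflects the standard here: BENGALI LETTER AA (U+0986) has no canonical decomposition, so normalization does not unify it with A (U+0985) + AA vowel sign (U+09BE). The helper name `same_after_nfc` is made up for illustration.

```python
import unicodedata

LETTER_AA = "\u0986"             # BENGALI LETTER AA
A_PLUS_AA_SIGN = "\u0985\u09be"  # BENGALI LETTER A + BENGALI VOWEL SIGN AA


def same_after_nfc(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # The two spellings stay distinct, so a plain search for one
    # will not find text typed as the other.
    print(same_after_nfc(LETTER_AA, A_PLUS_AA_SIGN))
    print(LETTER_AA in "\u0995" + A_PLUS_AA_SIGN)
```

So even if the two spellings look identical on screen, they remain different strings to search, sort and spell-check code unless an application adds its own folding.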

Comment #3 from Owen Taylor (pango developer, points: 25)
2004-02-24 19:30 UTC [reply]

So, is making the _ct change for A and E better than nothing or 
not? I can leave this bug open, but I want to know whether
I should make that change for 1.4.0.

Comment #4 from Sayamindu Dasgupta (points: 8)
2004-02-25 03:48 UTC [reply]

My proposal - make the changes. Microsoft is doing the same thing with
Uniscribe, and ditto with the QT people. However, we should try to
have a better way to do this in the next versions.

Comment #5 from Owen Taylor (pango developer, points: 25)
2004-02-27 19:43 UTC [reply]

Fri Feb 27 14:26:34 2004  Owen Taylor  <otaylor>
        * modules/indic/indic-ot-class-tables.c (bengCharClasses):
        Mark BENGALI LETTER A (U+0985) and BENGALI LETTER E (U+098F)
        as consonants which gives better behavior when they
        are combined with halant, though it isn't exactly right.
        (#118299, Sayamindu Dasgupta)

(Filed as ICU bug 3626 (http://www.jtcsv.com/cgibin/icu-bugs/))"

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Actual results:

Expected results:

Additional info:

Activity log from the external bug reference

Who: otaylor  
When: 2003-07-24 20:21:08 UTC  
What: OtherBugsDependingOnThis  	 
Added: 113551

Who: otaylor
When: 2003-07-25 15:12:26 UTC
What: Target Milestone
Added: 1.2.4

Who: otaylor
When: 2003-08-25 14:38:22 UTC
What: Target Milestone
Removed: 1.2.4
Added: 1.2.5

Who: otaylor
When: 2003-09-15 21:50:28 UTC 
What: AssignedTo
Removed: pango-maint.org
Added: pango-indic-maint.org

Who: otaylor
When: 2003-09-15 21:51:17 UTC
What: Component
Removed: general
Added: indic

Who: otaylor
When: 2004-02-23 19:47:05 UTC
What: Target Milestone
Removed: 1.2.5
Added: 1.4.0

Who: otaylor
When: 2004-02-27 19:43:39 UTC 
What: Summary(1.4.0)
Removed: Treat BENGALI LETTER A as consonan(1.4.0)
Added: Better handling for BENGALI LETTER A/E(future)

Who: otaylor
When: 2004-12-13 21:22:45 UTC
What: Target Milestone
Removed: future
Added: Small fix


Comment 1 Matthias Clasen 2006-06-20 05:14:39 UTC
Reassigning pango bugs to Behdad.

Comment 2 Akira TAGOH 2006-11-01 13:41:08 UTC
Please confirm if this still happens on fc6.

Comment 3 A S Alam 2007-11-15 11:49:41 UTC
I am not able to reproduce it in Rawhide.

Comment 4 Runa Bhattacharjee 2008-03-03 10:49:38 UTC
Changing Language tag to [bn] as it affects both the locales.

Comment 5 Bug Zapper 2008-05-14 02:07:07 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:

Comment 6 Tony Fu 2008-09-10 03:11:27 UTC
requested by Jens Petersen (#27995)

Comment 7 Bug Zapper 2009-06-09 22:07:18 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 

Comment 8 Jens Petersen 2009-06-23 06:20:09 UTC
Pravin or Parag is this ok now?

Comment 9 Parag Nemade 2009-06-23 07:40:38 UTC
Can you please check whether this bug still exists in F11? I think this is a very old bug and I need some updated information on it.

Comment 10 Runa Bhattacharjee 2009-06-25 03:22:05 UTC
Passing on the needinfo request to Sayamindu

Comment 11 Sayamindu Dasgupta 2009-06-25 15:23:43 UTC
This has not been fixed yet. For example, the sequence: U+0985 BENGALI LETTER A + U+09C7 BENGALI VOWEL SIGN E should not combine, but in F11, it does. 

Actual Result:

Expected Result:

Comment 12 Pravin Satpute 2009-06-26 04:04:45 UTC
changing version to f11

Comment 13 sandeep shedmake 2009-06-27 07:19:43 UTC
Created attachment 349635 [details]
Image showing rendering of [U+0985] + [U+09C7]

From Comment #11,

Does the attached image render the required sequence correctly?

I made a change in pango-1.24.1-1.fc11.x86_64, to get the above required sequence.

Comment 14 Sayamindu Dasgupta 2009-06-27 07:55:10 UTC
(In reply to comment #13)
> Created an attachment (id=349635) [details]
> Image showing rendering of [U+0985] + [U+09C7]
> From Comment #11,
> Does the attached image render the required sequence correctly?
> I made a change in pango-1.24.1-1.fc11.x86_64, to get the above required
> sequence.  

Yep - this is the correct rendering.

Comment 15 Pravin Satpute 2009-07-15 13:48:33 UTC
update upstream bugzilla with patches


Hopefully we will get rid of this very old bug from 2006-03-16, which IMO is a regression from some wrong changes in pango.

Comment 17 Bug Zapper 2010-03-15 11:50:05 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:

Comment 18 Akira TAGOH 2011-05-24 12:23:20 UTC
Does this issue still persist in f15?

Comment 20 Pravin Satpute 2011-05-25 07:51:18 UTC
Pango Indic development has almost stopped; we will try to take care of this in harfbuzz-ng Indic development.

Comment 21 Fedora Admin XMLRPC Client 2012-01-10 15:43:53 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 22 Fedora Admin XMLRPC Client 2013-05-14 12:12:42 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 23 Akira TAGOH 2013-06-03 06:04:25 UTC
Pravin, does this still persist on pango with harfbuzz?

Comment 24 Pravin Satpute 2013-06-03 08:11:50 UTC
The original problem for which the bug was opened is resolved.

Regarding its regression:

u0985 (অ) + u09C7 (ে) should not combine. 

This is now allowed in all scripts. 

Even in Devanagari, u0905 (अ) + u093f (ि) is connecting now.  

These combinations are specifically allowed since there are chances of such sequences in the comics world for exaggeration. 

Related thread on Harfbuzz http://lists.freedesktop.org/archives/harfbuzz/2013-February/002961.html

So as far as I can tell, this is not a bug anymore. If it is in someone's opinion, it must be debated on the harfbuzz mailing list.

Vowel + Matra combination is required in such cases. 

Fixed in harfbuzz-0.9.12-2.fc18.x86_64
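
To make the Bengali and Devanagari examples above concrete, the two pairs can be inspected with Python's Unicode database. This is a small sketch and says nothing about what harfbuzz itself does; it only shows that both pairs are an independent vowel letter followed by a dependent vowel sign. The names `PAIRS` and `classify` are made up for illustration.

```python
import unicodedata

# The vowel + matra pairs mentioned in the comment above.
PAIRS = {
    "Bengali": "\u0985\u09c7",
    "Devanagari": "\u0905\u093f",
}


def classify(text):
    """Return (code point, name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


if __name__ == "__main__":
    for script, pair in PAIRS.items():
        print(script)
        for entry in classify(pair):
            print("   ", *entry)
```

In both scripts the first character has general category Lo (a letter) and the second has Mc (a spacing combining mark), so the two sequences have the same structure.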
