freebangfont-devel

From: Deepayan Sarkar
Subject: Re: [Freebangfont-devel] Re: [Issue N22662] Bengali rendering bugs in Qt 3.2 beta
Date: Wed, 21 May 2003 13:16:19 -0500
User-agent: KMail/1.5.1

Hi,

Wow, that was fast! I'll report on the rest when I finish compiling (which 
may take a while), but let me answer you on the a + ya-phala thing.

[Others on the FBF list, please correct me if I have said something wrong]

On Wednesday 21 May 2003 05:01, address@hidden wrote:
> > 2. a + ya-phala
> > ===============
> >
> > The sequence 0985 09CD 09AF 09BE is not rendered correctly.
> > (Microsoft's engine doesn't render this correctly either yet, but it
> > will.)
> >
> > I quote from http://www.unicode.org/faq/indic.html#13
> >
> > ------------
> >
> > Q: What are the Bengali characters used to transcribe the sound "a"
> >    (as in English "bat") in Unicode?
> >
> >
> > A: In Bengali, the sequence "zophola" (U+09CD U+09AF) + the "aa" matra
> >    (U+09BE) is used for transcribing the English "a" in "bat". This
> >    zophola_aa can be seen as a special "composite" matra to write a
> >    new Bengali sound, imported from English. Represent these sequences
> >    using a halant (virama):
> >
> >    Vowel_A_zophola_AA = 0985 09CD 09AF 09BE ( a- halant ya -aa )
> >    Vowel_E_zophola_AA = 098F 09CD 09AF 09BE ( e- halant ya -aa )
> >
> >    If you need to add a candrabindu or other combining mark in the
> >    sequence, represent the sequence as:
> >
> >    Vowel_A_zophola_AA + candrabindu = 0985 09CD 09AF 09BE 0981
> >    ( a- halant ya -aa candrabindu )
> >
> > --------------
>
> Ok, that's something new for me. I always thought combinations of
> independent vowels+halant were forbidden, and that's how I handled it
> in Qt. Looks like I need an exception for bengali.

You are correct, from a linguistic point of view. 

The problem was that there was no official way to write English words like 
'at', because no existing vowel had the correct sound (I believe Devanagari 
doesn't have one even now). I guess some smart guy decided that Bengali 
should have the ability to do this, and essentially added a new vowel to the 
language. Unfortunately, no new character was created to represent this 
vowel. Instead, two completely arbitrary combinations of existing symbols 
were assigned to represent this sound, namely what are referred to above as

Vowel_A_zophola_AA and Vowel_E_zophola_AA

Both are completely illegal constructs in classical Bengali. 

> The faq entry you quote is not 100% clear to me. Does this mean any
> combination of
>
> independent vowel + halant + ya + -aa
>
> forms a valid syllable in bengali?

No, as far as I know, no other vowel should have this construct (but such 
combinations would be illegal anyway, so personally I wouldn't care how they 
are rendered).

> What about the general
> vowel + halant + consonant + matra
> case?

Nope, these should be illegal as well. Basically (as far as I know) 
Vowel_A_zophola_AA and Vowel_E_zophola_AA are two very explicit exceptions to 
the otherwise correct general rule you already have. (These 2 can be followed 
by combining marks like candrabindu, bisarga, etc, but I don't think that's 
an issue here.)
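
To make the rule concrete, here is a rough Python sketch of the check as I 
understand it. This is not Qt's code; the function name and the set of 
trailing marks are my own assumptions for illustration (the FAQ only 
mentions candrabindu "or other combining mark", and I have guessed anusvara 
and bisarga as the others). The code points are the ones from the FAQ entry 
quoted above.

```python
# Code points from the Unicode FAQ entry quoted earlier in this mail.
A = "\u0985"         # BENGALI LETTER A
E = "\u098F"         # BENGALI LETTER E
HASANTA = "\u09CD"   # BENGALI SIGN VIRAMA (halant / hasanta)
YA = "\u09AF"        # BENGALI LETTER YA
AA_MATRA = "\u09BE"  # BENGALI VOWEL SIGN AA

# The only two legal "independent vowel + hasanta + ..." sequences.
EXCEPTIONS = (
    A + HASANTA + YA + AA_MATRA,  # Vowel_A_zophola_AA
    E + HASANTA + YA + AA_MATRA,  # Vowel_E_zophola_AA
)

# Assumption: marks allowed after the exception
# (candrabindu, anusvara, bisarga).
TRAILING_MARKS = {"\u0981", "\u0982", "\u0983"}


def is_zophola_aa_exception(syllable: str) -> bool:
    """Return True if syllable is one of the two exceptions,
    optionally followed by trailing combining marks."""
    head, tail = syllable[:4], syllable[4:]
    if head not in EXCEPTIONS:
        return False
    return all(ch in TRAILING_MARKS for ch in tail)
```

Any other independent vowel followed by hasanta (or a consonant other than 
ya after the hasanta) fails this check, which matches the general rule you 
already have.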

> I've worked around this for now by treating an Independent Vowel at the
> start of a syllable identical to a consonant for syllable breaking
> rules in Bengali, so your example renders correctly. I do however not
> know if this breaks anything else.

It shouldn't. The independent vowel + hasanta construct is illegal except for 
these 2 exceptions, so it should never occur otherwise in valid Bengali 
text. I don't know what the Unicode/OpenType rules are when it comes to 
displaying something invalid, but I don't foresee any practical problems.

Deepayan

P.S.: There's another related issue (which I don't think has been completely 
resolved yet), which is how to render the combination 

ra + hasanta + ya (09B0 + 09CD + 09AF)

Should it be "reph + ya" or "ra + zophola"? (The second is rare, but is needed, 
again, for writing English words like 'rat'.) This is an ambiguity in the 
language, and at some point Unicode should come up with a recommended way to 
represent this. I'll let you know when I learn of anything concrete.
