Information on Portuguese
From: Rodrigo Severo
Subject: Information on Portuguese
Date: Fri, 12 Mar 1999 11:2:14
I am sending information on Portuguese to be included in Aspell.
Character set: ISO 8859-1
Vowels: a e i o u
Note: Portuguese uses letters like "LATIN SMALL LETTER E WITH ACUTE". Do you
need them listed as separate vowels? If you do, here they are:
"Accented" vowels:
a? 00e3 LATIN SMALL LETTER A WITH TILDE
o? 00f5 LATIN SMALL LETTER O WITH TILDE
a' 00e1 LATIN SMALL LETTER A WITH ACUTE
e' 00e9 LATIN SMALL LETTER E WITH ACUTE
i' 00ed LATIN SMALL LETTER I WITH ACUTE
o' 00f3 LATIN SMALL LETTER O WITH ACUTE
u' 00fa LATIN SMALL LETTER U WITH ACUTE
a! 00e0 LATIN SMALL LETTER A WITH GRAVE
a> 00e2 LATIN SMALL LETTER A WITH CIRCUMFLEX
e> 00ea LATIN SMALL LETTER E WITH CIRCUMFLEX
o> 00f4 LATIN SMALL LETTER O WITH CIRCUMFLEX
u: 00fc LATIN SMALL LETTER U WITH DIAERESIS
o! 00f2 LATIN SMALL LETTER O WITH GRAVE
Additional characters: just "-" like French
Additional considerations:
From what I understood about affix compression, it is EXTREMELY important
that affix compression be implemented for Aspell to deal well with Portuguese
and, AFAIK, with all Latin languages: French, Spanish and Italian come to
mind right now.
Let me briefly explain one example to be sure everybody understands
why.
In Portuguese, 80% of all verbs follow one single pattern, which is, for
example:
CANTAR (to sing)
You keep the CANT that never changes and join it with:
O - CANTO
AS - CANTAS
A - CANTA
AMOS - CANTAMOS
AIS - CANTAIS
AM - CANTAM
AVA
AVAS
AVAMOS
AVEIS
AVAM
And the list goes on up to 56 different affixes. So, I believe that
there are 2 options: either Aspell uses some kind of automatic termination
method where there would be "affix groups" (is this affix compression?),
or for each verb to be included there would be 56 different words.
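To make the "affix groups" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how Aspell does or would do it: the names AR_VERB_SUFFIXES and expand are made up for this example, and the group holds only the endings quoted above, not all 56.

```python
# Hypothetical affix group for regular -AR verbs. Only the endings quoted in
# this message are listed; the real group would have up to 56 entries.
AR_VERB_SUFFIXES = [
    "AR", "O", "AS", "A", "AMOS", "AIS", "AM",
    "AVA", "AVAS", "AVAMOS", "AVEIS", "AVAM",
]


def expand(stem, suffixes):
    """Return every word form made by joining the stem with each suffix."""
    return [stem + suffix for suffix in suffixes]


# The word list stores one stem plus a reference to the group,
# instead of one entry per conjugated form.
forms = expand("CANT", AR_VERB_SUFFIXES)
print(forms)
```

With such a group, each regular verb would need a single entry (its stem and the group name) rather than one entry per form.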
Word lists:
I can create one. My idea is to choose 2 different pocket-size
dictionaries and use the words appearing in both. This would solve the
mistyping problem and, I believe, even the copyright one.
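As a rough sketch of the "words appearing in both" step, assuming each dictionary has already been typed in as a plain list of words (the function name common_words and the sample words are invented for this example):

```python
def common_words(list_a, list_b):
    """Return the sorted words present in both lists, ignoring case."""
    a = {w.strip().lower() for w in list_a if w.strip()}
    b = {w.strip().lower() for w in list_b if w.strip()}
    return sorted(a & b)


# "caza" stands for a typing mistake made while entering the first list;
# it is dropped because the second list does not contain it.
dict_a = ["cantar", "casa", "livro", "caza"]
dict_b = ["Cantar", "casa", "livro", "mesa"]
print(common_words(dict_a, dict_b))
```

A word mistyped in only one of the two lists does not survive the intersection, which is the point of using two sources.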
Soundlike code:
I can try, but I can't promise any success on that. If there is anybody
around willing to invest time on soundlike code for Portuguese, please let
me know.
I believe that's all,
Rodrigo Severo