|
From: | Dennis R. Crosby |
Subject: | [aspell-devel] more hints |
Date: | Tue, 18 May 2010 23:34:20 +0200 |
I’ve done some more link jumping from your site and
read other people’s comments as well. I’m afraid everyone is using some kind of word-based
approach. Words aren’t basic enough. Everything is based on morphemes, units of meaning that
may correspond to words but may also correspond to inflectional affixes (and
sound shifts). Starting from English is a bad start. Practical applications
are possible in the short term, but will be too unwieldy for other languages (some
languages will break the infrastructure immediately). Given time, even English-based systems will cease working,
or require increasingly complex additional layers of ‘correction’.

All languages utilize morphemes and their allomorphs. Even
polysynthetic languages (where a “word” in isolation is a meaningless concept
and the utterance (sentence) is as small as you’re going to get) are
composed of strings of morphemes in functional-contextual allomorphic variants.

Also: besides starting with your database of morphemes and conditional
rules, you need to identify morphemes by context frequencies: how often does
this morpheme indicate plural, how often possession, how often the abbreviated
allomorph of 3rd person singular “to be”? That is the
case with “s” (and its allomorphic variations “es”, “ren”,
and null, as well as “z”, which is pronounced though not written).

Context indexes need to include etymological identifiers as
well as functional identifiers. Why? Because where a word comes from originally,
as well as when it was adopted into English and sometimes even the route it
took (e.g. a Latin root imported via French or Spanish, or technological
revolution born necessity [or should I say technological revolution *borne*
necessity? Neither is wrong, depending on what I want to emphasize. Either
choice is wrong if I mean to emphasize the other meaning]): these factors
actually DETERMINE the set of rules that apply to spelling variation as well as
usage in English. They are the ‘why’ that we scratch our heads
about but go on applying consistently, knowing something is wrong when we try
to do otherwise. Morphemes are packages, units of meaning with variants that
are just as bound to their lineage as they are to the meanings they carry and
the phonemes of which they are composed (which in turn cause them to be subject
to another set of rules governing contextual sound variations). That
form/meaning package of variants has a set of rules that governs it – a “citizenship”
with rights and obligations, if you will. A twin morpheme (homophone), a “citizen”
of another country, has other duties and other rights-based expectations.

And the rules governing contextual sound shifts MAY BE
DIFFERENT. Lineage is the reason. Sometimes we didn’t just borrow words.
Sometimes we borrowed the manual that went with the word. Sometimes we didn’t
read the manual, or didn’t read it well, or it’s been so long – how did
that go? Maybe the manual got lost.

This stupid analogy illustrates how many things can be
involved in “proper” spelling. Periodically, the culture gets tired
of learning to apply all these things that nobody can remember the reasons for and that people usually get
wrong, and a wave of simplification sweeps through the language.

English has suffered from domination by French-speaking
Danish descendants for 200 years followed by pretensions of loyalty and
proclamations of convictions – which were enormous factors in word
choice, spelling and grammar. English doesn’t look at all like Icelandic,
but it used to. We didn’t bother to educate the slaves or their descendants,
so they never quite got around to imitating our usage exactly, and sometimes never
quite abandoned grammatical transformations that (quite conveniently) expressed
day-to-day sameness in a way ‘standard’ usage ignored completely,
and lo and behold, 200 years later every good white Anglo-Saxon descendant in
the US knows exactly “what they be talkin’ ’bout”.
Those lineage-based forms (in the latter case, concepts bound in rules that
apply to categories – verbs) are INTERNALIZED IN OUR WHOLE CULTURE. How
many people do you know who can explain why?

Sorry, this was meant to be a short letter. I can blame the
medicine partly, but it’s rooted in basic nature, fed by much thinking
and bottled up due to a world-wide lack of interest in the subject.

I really hope you can glean some useful points out of my
rantings. My points are important, but I’m afraid I’m burying them
too deeply. Sorry. |
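
To make the suggestion about a morpheme database with context frequencies and etymological identifiers a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the `MorphemeEntry` name, its fields, the etymology label and the handful of observations are all made up, and none of it reflects how aspell actually stores anything.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MorphemeEntry:
    """One hypothetical database record for a morpheme (illustrative only)."""
    form: str          # citation form, e.g. "s"
    allomorphs: tuple  # written variants; "" stands for the null allomorph
    etymology: str     # etymological identifier (placeholder label)
    function_counts: Counter = field(default_factory=Counter)

    def observe(self, function: str) -> None:
        """Record one occurrence of this morpheme carrying the given function."""
        self.function_counts[function] += 1

    def function_frequencies(self) -> dict:
        """Relative frequency of each function seen so far."""
        total = sum(self.function_counts.values())
        if total == 0:
            return {}
        return {f: n / total for f, n in self.function_counts.items()}

    def most_likely_function(self):
        """The function observed most often, or None if nothing was observed."""
        if not self.function_counts:
            return None
        return self.function_counts.most_common(1)[0][0]


# Made-up observations, not corpus data.
suffix_s = MorphemeEntry(
    form="s",
    allomorphs=("s", "es", "ren", ""),
    etymology="Germanic (illustrative label)",
)
for function in ["plural", "plural", "possessive", "contracted 'is'"]:
    suffix_s.observe(function)

print(suffix_s.function_frequencies())
print(suffix_s.most_likely_function())
```

The point of the sketch is only that one form carries several functions, each with its own frequency, and that the lineage travels with the record so that rules can be selected per etymology rather than per word.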