freecats-dev


[Freecats-Dev] Suggestion about project team organization


From: Henri Chorand
Subject: [Freecats-Dev] Suggestion about project team organization
Date: Fri, 11 Jul 2003 11:01:54 +0200

Hi all,

I saw a number of very interesting messages concerning language-specific
indexing issues.

I quite like Jean-Christophe's idea of tackling some of the hardest
languages first, in order to address needs that existing CAT tools do not
yet meet well. This is a more mature version of my Klingon idea - the power
of marketing.


May I just suggest that it would be handy to have one coordinator for each
language? This person would simply help sort all related info so as to
give the development team an overall picture of that language's specific
problems.

For instance, for Japanese, Jean-Christophe Helary seems the perfect
candidate. Julien Poireau, who studied this language for some time, and of
course Yves Champollion (who also speaks Japanese at home) could help him
in this task. I believe Julien is eager to start a topic about Japanese;
he's been thinking about it for some time.


Anyway, I believe this kind of organization will become more necessary over
time.


Personally, I can say a few things about English (though I'm not a native
speaker) and French, but I don't have any true linguistic background. Oh
yes, I can say a few things about Chinese, too, at least until Chinese
developers show up (I spent some time learning basic traditional Chinese, a
long time ago). Mind you, there must be an awful lot of experts in China,
with a population of approx. 1.3 billion, and they seem very interested in
free software, too. We should try to contact some Chinese free software
teams.

For Chinese, each character being a word in its own right, I guess it could
be the simplest of all languages as far as indexing is concerned. There will
also be a (rather short, I think) list of "function characters"
("caractères outils", i.e. words whose weight should be lowered). Also, we
will need to treat traditional and simplified Chinese as two languages. No
need to extract substrings here.

The fact that some (esp. modern, technical) terms are made up of several
characters does not really matter. This is also how things work in French,
and French mixes them a lot with function words.
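
To make the idea concrete, here is a rough sketch in Python of what
character-level indexing with down-weighted function characters might look
like. Everything in it is an assumption made for illustration: the names
index_segment and similarity, the five characters in FUNCTION_CHARS, the 0.2
weight and the overlap score are not part of any agreed design. It also
assumes one index per language, so traditional and simplified Chinese
segments would never be compared with each other.

```python
from collections import Counter

# Hypothetical function-character list; a real one would come from a linguist.
FUNCTION_CHARS = {"的", "了", "是", "在", "和"}
# Assumed weight for function characters (ordinary characters count as 1.0).
FUNCTION_WEIGHT = 0.2


def index_segment(text):
    """Return a weighted bag of characters for one segment.

    Each character is treated as an index unit; whitespace and
    punctuation are skipped, and no substrings are extracted.
    """
    weights = Counter()
    for ch in text:
        if not ch.isalnum():
            continue
        weights[ch] += FUNCTION_WEIGHT if ch in FUNCTION_CHARS else 1.0
    return weights


def similarity(a, b):
    """Weighted overlap of two segments, between 0.0 and 1.0."""
    wa, wb = index_segment(a), index_segment(b)
    # Shared weight: the smaller weight for each character found in both.
    shared = sum(min(wa[c], wb[c]) for c in wa.keys() & wb.keys())
    # Total weight: the larger weight for each character found in either.
    total = sum(max(wa[c], wb[c]) for c in wa.keys() | wb.keys())
    return shared / total if total else 0.0


if __name__ == "__main__":
    print(similarity("我的书", "你的书"))
```

With such a scheme, two segments that share only function characters would
score low, which is the point of lowering their weight.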

Do not hesitate to correct the rubbish in the above!


Cheers,

Henri




