[Freecats-Dev] Re: Trados/other CAT, Python/Java, German/English


From: Charles Stewart
Subject: [Freecats-Dev] Re: Trados/other CAT, Python/Java, German/English
Date: Mon, 24 Feb 2003 07:00:02 -0500 (EST)

Dear Henri,

Congratulations on the excellent work you are doing as project leader.
FreeCATS has acquired a great deal of momentum in its short period of
existence, and I think most of the credit for that can be put at your
door.

I wrote:
>>   - If what we build is good enough, people will switch
>>   tools and come to us, especially if we are nice enough
>>   to provide 2-way migration tools.  So the fear of Trados
>>   and wordfast gaining too much momentum is illusory, not
>>   real.

Henri wrote:
> Whatever the fear - major translation agencies don't give a damn about
> Trados per se. Like other categories of software users in their
> respective fields, they mostly selected Trados because it looked like
> the best software in town - that began before MS took a share in Trados.
> Now, it's more a de facto standard than anything else. Déjà Vu, SDLX &
> Wordfast are definitely NOT soon-to-be-extinct beasts.

> Just a remark about Dave's point of view: Trados compatibility IS a must
> with translation agencies for technical translation & localization
> project - not Trados itself, at least not quite as much.

So import and export filters for TMX are a must.  Are there any
descriptions of this file format available?

> I expect these translation agencies to get interested in our tool if it
> comes into existence, only because it will be free and Trados-compatible.

> Note that, to the best of my knowledge, e.g. Berlitz (now part of
> Bowne), or Lionbridge (except for the projects for which they request
> and provide their baby, Foreign Desk), did NOT massively upgrade to
> Trados 5, for reasons like cost and lack of new, essential features.

Sounds good for us if there is suspicion of the major tool.


>>   - I am more worried about lock-in, especially wrt. the
>>   Python programming language.  It is an excellent tool
>>   for doing quick hacks that need OO, but it behaves almost
>>   completely unlike any other programming language in its
>>   semantics.  If the reference implementation is Python,
>>   we will find it difficult to support programmers who loathe
>>   Python (and they do exist).

> Of course - that's the trouble with having to choose a language ;-)

Certainly, but it doesn't follow that all languages are equal.  The
point I wish to make is that Python is something of an outlier in the
family of languages, and while it is quite intuitive and flexible, not
all the claims that the Python language enthusiasts make for it should
be taken at face value.  It has one of the strangest approaches to
variable extent that I have ever seen, one that is often misdescribed
as `lexical scoping' but would be better described as `a lexical tower
of dynamic scopes'; the approach is also quite expensive in terms of
demanding many run-time dictionary lookups, and while good results have
been achieved with a ruthlessly-optimising Python compiler, I think a
real performance penalty will be paid if we adopt Python as our
scripting language as opposed to Tcl or Perl (Tcl has an excellent
compiler, and AFAIK Perl's compiler gets much better results than
Python's).
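
To make the point about run-time name lookup concrete, here is a
minimal sketch.  The names (greet, greeting, make_reporters) are made
up for illustration, and it only shows the behaviour, not its cost:
names are resolved when a function is called, not when it is defined.

```python
def greet():
    # `greeting` is not resolved when greet() is defined; every call
    # looks the name up again in the enclosing namespace.
    return greeting

greeting = "hello"        # bound only after greet() was defined
first = greet()

greeting = "guten Tag"    # rebinding changes what the next call sees
second = greet()


def make_reporters():
    # Each lambda refers to the variable `word` itself, not to the
    # value it held when the lambda was created.
    reporters = []
    for word in ["cat", "mat"]:
        reporters.append(lambda: word)
    return reporters

# Both reporters see the final binding of `word`.
late = [report() for report in make_reporters()]

print(first, second, late)
```

The scopes nest lexically, but what a name refers to inside each scope
is only settled when the code runs.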

>>   - I am all for a Java implementation.  Java has excellent
>>   libraries, and many PLs can target the JVM, including
>>   python, tcl and Scheme.

> Well, I'd say the one with skills & availability makes the choice, then.
> At least, here we go with portability, and we can stop worrying about OS
> X zealots: they'll drink from the same barrel as the other team members.

> Even though poor folks like me (with rather limited coding skills &
> availability) would not dare to begin coding in Java, maybe we'll be
> able to do a few things once the code's overall structure is nicely laid
> out by any volunteer. Whatever I believe - and who cares anyway ;-) -
> Java IS a possible, reasonable option.

Java isn't my favourite language, but agreeing on the Java runtime
doesn't preclude coding in another language.  An option is to use
Jython (the Python-on-JVM implementation) to code quick hacks and
rewrite in Java.  The language I am most productive in, Scheme, has an
excellent JVM implementation, namely SISC.  I don't know of a good
Perl implementation on the JVM, but that may just be my ignorance.


>> The recent brouhaha about "java is inefficient, even Sun
>> says so" applies *only* to Sun's own JVM.  IBM's JVM is
>> better, and we can compile to native C using gcc+CLASSPATH.

> Do you mean we can convert Java code into native C code? That would be
> even better then. Please correct me if I missed the target here.

What I wrote was perhaps a bit misleading.  While I think it would not be
difficult to emit equivalent C code from Java source, I don't think
this capability is part of the current gcc-3.* tool set.  What gcc can
do since gcj was merged in (ie. since gcc-3.0) is: compile Java to
native code, compile Java to JVM bytecode and compile JVM bytecode to 
native code (with the composition of the last two steps resulting in
slightly less efficient code than the first step).

The gcc compiler also supports something called the Cygnus Native
Interface, allowing Java code to be linked to C/C++ code in rather the
same way as Sun's JNI does but with much greater ease of use and
efficiency.  What gcc doesn't do is provide a runtime with support
for the usual Java class hierarchy; the GNU CLASSPATH folks have provided
an extensive but incomplete implementation here: last time I looked
Swing, for instance, was missing.


> I'll wait for the project team to vote. Mind you, there weren't that
> many votes around for now, so maybe (especially if OmegaT's Keith
> Godfrey invites us to join him) your votes will make the decision.

> Of course, I will only accept votes from people who can and WILL provide
> SOME coding effort ;-)))
> Even though I knew from the start that the Chief Software Architect's
> cap is too large for my little head, still, I hope I can contribute to
> code, even 1%.

I would like to code, but I have a rather full timetable over the next
two months and another free software project commitment which has
priority for me at the moment, so it is difficult for me to promise
anything definite in advance.  Also how much contribution I make will
depend on which language is adopted. Despite my reservations above, I
would be willing to code in Python.

> Seriously, though, IF we start from Keith's OmegaT project, then we're
> all heading to Java then (warm & sunny island).

...and in Java...

> Also, even more seriously: we do need a charitable soul to build our
> project's framework in order to start moving ahead. Once a true
> programmer has built it and is available to provide explanations and
> some setup help to week-end programmers like me, those week-end
> programmers will be able to concentrate on small bits of code and
> possibly improve it or write similar things.

> So, Charles, if you do code in Java and want to start something, please
> do it and we'll follow. Of course, the same is true for Keith or whoever
> else.

...but if *I* am the one to start coding, I will almost certainly code
in SISC (ie. the scheme-on-JVM I mentioned before), and I will not be
starting in the next few weeks.

> Note that I haven't subscribed Keith to our dev-list yet, as I strictly
> follow an opt-in policy, but I'll CC: him this message.

> Keith, just FYI: today was the most busy (some might say a bit heated)
> day our little Dev-list has known yet - see the archive on Savannah if  
> you want. We'll also warmly welcome you on freecats-dev list.

> Part of the story is about whether KDE's KBabel:
> - is a nice candidate from which to build Free CATS' future interactive
> translation client (answer: quite possibly, still a lot of work to do)
> - is portable (answer: not quite yet, uses KDE-specific libraries)
> - as a free software team, wants the whole bunch of us, including the
> non-coders & barkers, to distract them from a purely KDE approach and to
> invade them ;-)

> Stanislav & other readers, please correct any blatant (and inadvertent)
> mistake above.

> Of course, we're asking ourselves more or less the same questions about
> OmegaT. To date, KBabel & OmegaT are the only two projects of interest
> which we heard about.


> Coming back to Charles' other stuff:

>>   - I am assembling an argument that we will need to handle
>>   hierarchical structure to get results with German->English
>>   translation.  More to follow, not necessarily all that soon.

> Great - at least somebody kept working today while all of us talked.

> I'm not sure what you mean with "hierarchical structure" here, but I
> suppose I'll just have to wait a little and I'll see it. I hope it
> nicely fits in the picture as a clever indexing feature on top of raw
> segment storage in a TM.

Hope to send a message on this later this week.  By hierarchical
structure I mean the parse trees that linguists represent using x-bar
grammars, eg.:

                Sentence
                /       \
        Noun phrase     Verb phrase
        /       \       /       \
    Determiner  Noun  Verb      Prep phrase
        |        |      |        /      \
      The       cat    sat      Prep    Noun phrase
                                 |      /       \
                                on    Det       Noun
                                       |         |
                                      the       mat
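
As a rough sketch of how such a tree might be held in a program, the
same sentence can be written as nested tuples, with "on the mat"
labelled as a prepositional phrase.  The representation and the
function name `leaves` are only assumptions for illustration, not a
proposal for the TM storage format.

```python
# Each node is (label, child, child, ...); a leaf is (label, word).
tree = (
    "Sentence",
    ("Noun phrase",
        ("Determiner", "The"),
        ("Noun", "cat")),
    ("Verb phrase",
        ("Verb", "sat"),
        ("Prep phrase",
            ("Prep", "on"),
            ("Noun phrase",
                ("Det", "the"),
                ("Noun", "mat")))),
)


def leaves(node):
    """Return the words at the leaves of the tree, left to right."""
    label, *children = node
    # A leaf carries exactly one child, and that child is the word itself.
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


print(" ".join(leaves(tree)))
```

Reading off the leaves gives back the flat segment, so the flat text
can always be recovered from the hierarchical form.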

Best,

Charles







