
Re: [Freecats-Dev] OmegaT


From: Marc Prior
Subject: Re: [Freecats-Dev] OmegaT
Date: Thu, 27 Feb 2003 10:03:18 +0100

Hello All,

I have subscribed to the Free CATS mailing list and would be happy to share 
my thoughts with you (as many of you already know me, you will also know that 
I think you're entitled to my opinion ;-) ), but at the moment work on the 
OmegaT project is taking up any spare time I have. (I also have to do some 
translation in-between, in order to buy food :-). Very inconvenient!)

I'll now respond to various comments on the list, mainly by Henri.

> To put it in a nutshell: Free CATS dreamed about it, and Keith did it

This really sums it up. OmegaT is available, accessible, and usable. Indeed, 
people are using it - in fact, interest is really starting to pick up 
now. It is easy to see potential areas for further improvement of OmegaT, but 
my view is that it is better to make an open-source product available, and 
then to modify it in the light of user feedback. With a commercial product, 
there may be financial and marketing reasons for getting it right first time, 
but I feel strongly that a process of continual development is preferable 
for projects such as OmegaT and Free CATS. These projects are dependent upon 
generating enthusiasm, and technical specifications aren't very exciting. 
Also, an obvious source of support is the open-source coding community 
outside of translation, as there are programmers there willing to contribute, 
and they in turn can benefit from the product by using it to localize 
open-source products. The problem here is that the programmers do not 
necessarily have experience of the translation process, and if allowed to do 
so, may work towards solutions which are not practical and do not meet the 
actual needs of translators.

The other thing I would encourage you all to do is, of course, to try OmegaT. 
It is easy to learn - in fact, a university lecturer here in Germany says it 
can be learnt in an hour. (Think of that - OmegaT as a university subject. 
OmegaTology, or OmegaTics?) See 

http://www.fask.uni-mainz.de/cafl/linuxfaq/fasklinuxfaq.html

I have been suggesting that users try version 0.9.9, as version 1.0.0 had 
certain bugs, and there is little documentation for it as yet (give me 
another three to four weeks). However, the bugs have been fixed in version 
1.0.2, and I imagine list subscribers will be able to find their way around 
1.0.2 with what documentation there is (the 0.9.7 manual, plus Keith's 
release notes for 1.0.0 on the Sourceforge site). The additional 
functionality of 1.0.2, in particular the change to TMX as the native TM 
format, is well worth having. 1.0.2 is available from the OmegaT home page 
(not Sourceforge yet). Make sure you have version 1.4.x of the J2RE installed.

On the current discussion of fuzzy matching algorithms:

> Please tell us what you think about it, and if ever you came with a
> similar solution. Marc Prior will certainly have something to say here,
> as I understand he works with German language, full of "déclinaison"

I have been using TM, first Trados, then Wordfast and now OmegaT, for six 
years now. I must say that for my work, all three have fuzzy matching 
algorithms that are more than adequate. There may be performance gains to be 
had by changing these algorithms, but such improvements would be a long way 
down the list of priorities for me.

I should perhaps point out that my work is seldom highly repetitive. It 
seldom involves translating a text which is 90% identical to an existing one, 
where the accuracy of fuzzy matching and the time savings gained from making 
minimal modifications to an existing segment are critical. It is much more 
often the case that I need to retrieve isolated words or phrases from past 
translations, each of which would take me several minutes to find by other 
methods (searching the file system, opening the file, finding the location in 
the text, locating the parallel (translated) file, opening it, finding the 
location). This is the real benefit of TM for me. And, I believe, also for 
many other translators who don't consider using translation memory because 
they think it is only for repetitive texts.

One weakness of OmegaT is that the fuzzy matching algorithm treats inline 
formatting tags as words (or rather parts of words), and this reduces the 
match rate substantially on heavily formatted texts. That is not a fault of 
the algorithm itself, though.
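
To illustrate the effect described above, here is a minimal sketch. It does not reproduce OmegaT's actual matching algorithm: it assumes a simple whitespace tokenizer and uses Python's difflib.SequenceMatcher as a stand-in similarity measure. The function names and the tag pattern (<f0>, </f0>, ...) are assumptions made for the illustration. It compares an untagged segment with a tagged one, first counting tags as parts of words and then with the tags stripped before comparison.

```python
import re
from difflib import SequenceMatcher

# Assumed tag syntax, modelled on the <f0>...</f0> examples in this message.
TAG = re.compile(r"</?f\d+>")

def tokens(segment, strip_tags=False):
    """Split a segment into lowercase whitespace-separated tokens."""
    if strip_tags:
        segment = TAG.sub("", segment)
    return segment.lower().split()

def similarity(a, b, strip_tags=False):
    """Token-level similarity between 0.0 and 1.0 (a stand-in measure)."""
    return SequenceMatcher(
        None, tokens(a, strip_tags), tokens(b, strip_tags)
    ).ratio()

plain = "This is bold, this is italic and this is underlined."
tagged = ("This is <f0>bold</f0>, this is <f1>italic</f1> "
          "and this is <f2>underlined</f2>.")

# Tags glued to words make "bold," and "<f0>bold</f0>," different tokens.
with_tags = similarity(plain, tagged)
# Stripping tags first makes the two segments identical.
without_tags = similarity(plain, tagged, strip_tags=True)

print(f"tags counted as words: {with_tags:.2f}")
print(f"tags stripped first:   {without_tags:.2f}")
```

In this sketch the wording of the two segments is the same, yet the score drops as soon as the tags are counted as parts of words, which is the behaviour described for heavily formatted texts.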

On the subject of formatting:

OmegaT retains all of OOo's formatting information. I have not yet noticed 
any formatting loss whatsoever. I have imported large (5 MB), complex 
(styles, images, TOCs, etc.) MS Word files directly (i.e. without going 
through RTF) into OOo, translated them without difficulty in OmegaT, and 
exported them directly from OOo to Word, without formatting loss.

(Apart from demonstrating OmegaT's strengths, this also proves the value of 
the XML format structure for text documents. But don't get me started on that 
subject!)

You translate around the tags in the way with which most of you will be 
familiar. However, although you can't add formatting, you do have a certain 
amount of control over the existing formatting. (I need to do more testing to 
find out how much.) You can delete, multiply, and change the sequence of certain 
formatting attributes. So, the sentence 

This is BOLD, this is ITALIC and this is UNDERLINED.

(where BOLD, ITALIC and UNDERLINED have the corresponding formatting)

appears in OmegaT as 

This is <f0>bold</f0>, this is <f1>italic</f1> and this is <f2>underlined</f2>.

and can be changed to 

This is <f2>underlined</f2>, this is not italic and this is <f0><f2>bold and 
underlined</f2></f0>.

This may seem trivial to a programmer, but if you translate between two 
languages which frequently have a different word order and which use 
different font formatting mechanisms, you will appreciate how valuable it is. 
My point here is really that OmegaT should not be seen as leaving the 
translator with no formatting control.
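
The rule described above (tags may be deleted, repeated, or reordered, but no new formatting can be added) can be sketched as a small check. This is a hypothetical helper, not something OmegaT provides: the name tag_problems, the tag pattern, and the requirement that tags be properly nested are all assumptions made for the illustration.

```python
import re

# Assumed tag syntax: <f0> opens, </f0> closes, as in the examples above.
TAG = re.compile(r"<(/?)(f\d+)>")

def tag_problems(source, target):
    """Return a list of problems with the tags used in the target segment.

    Tags from the source may be dropped, repeated, or reordered.
    A tag that does not occur in the source counts as added formatting.
    """
    allowed = {name for _, name in TAG.findall(source)}
    problems = []
    open_tags = []  # stack of currently open tag names
    for closing, name in TAG.findall(target):
        if name not in allowed:
            problems.append(f"{name}: not present in source")
        elif not closing:
            open_tags.append(name)
        elif open_tags and open_tags[-1] == name:
            open_tags.pop()
        else:
            problems.append(f"{name}: closing tag without matching opening tag")
    problems.extend(f"{name}: never closed" for name in open_tags)
    return problems

source = ("This is <f0>bold</f0>, this is <f1>italic</f1> "
          "and this is <f2>underlined</f2>.")
target = ("This is <f2>underlined</f2>, this is not italic and this is "
          "<f0><f2>bold and underlined</f2></f0>.")

print(tag_problems(source, target))                    # reordered, nested, f1 dropped
print(tag_problems(source, "This is <f3>new</f3>."))   # f3 is not in the source
```

The first call reports no problems for the reworked sentence from the example; the second flags the tag that was not in the source.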

Enough for now.
Marc



