[Freecats-Dev] Segmenting and fuzzy matching fom a user's point of view
From: Dave Simons
Subject: [Freecats-Dev] Segmenting and fuzzy matching fom a user's point of view
Date: Thu, 27 Feb 2003 10:56:16 +0100
Hello folks,
Some things I'm going to say here might be considered premature detail but
at least they will have been said.
Please file the comments away as you think fit :-)
(pause while Dave puts his translator's hat on)
Segmenting.
=========
Many of the texts I translate are definitions of very precise,
point-by-point procedures; they do not consist of eloquent descriptions.
These and lots of other types of document (keyword lists, help texts, etc.)
contain standalone segments which are very short -- so short that in a lot
of cases they can't even be parsed.
At the other end of the segmentation spectrum, due to the way some authors lay
their documents out (I'm not saying it's the best way but they do have a
right to their quirks), I find it annoying when I'm forced to end my segment at
the end of a paragraph, because some of these are in fact "faux" paragraphs.
That's why it's essential for me to have maximum flexibility and to keep
control as far as segment definition goes. I nearly always choose the
smallest possible segment boundary (the smallest reasonable delimiter being
":" and/or ";") then glue segments together when this breaks down. Having
done that, I want to be able to continue expanding the segment
indefinitely, repeatedly gluing the next one along the line right till the
end of the document if necessary! I've never needed to go to this extreme,
but I'm just making the point that I don't want ANYTHING to limit me,
whether the program designers think it's good for my health or not.
I, the translator, must be the one who has the final choice and I'm adamant
about that.
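To make the split-then-glue workflow above concrete, here is a minimal
Python sketch. The names `split_segments` and `glue` and the delimiter set
are assumptions made up for illustration; they are not taken from Trados,
Wordfast or any Freecats design. The point it tries to show is that gluing
has no built-in upper limit.

```python
import re

# Assumed delimiter set: sentence enders plus the smallest reasonable
# delimiters mentioned above, ":" and ";".
BOUNDARY = re.compile(r"(?<=[.!?:;])\s+")


def split_segments(text):
    """Split text at the smallest reasonable segment boundaries."""
    return [s for s in BOUNDARY.split(text.strip()) if s]


def glue(segments, index, count=1):
    """Merge segment `index` with the next `count` segments.

    No upper limit is enforced: asking for more segments than remain
    simply glues everything up to the end of the document.
    """
    end = min(index + 1 + count, len(segments))
    merged = " ".join(segments[index:end])
    return segments[:index] + [merged] + segments[end:]


text = "Open the valve; wait ten seconds. Check the gauge: it must read zero."
segs = split_segments(text)      # four short segments
glued_once = glue(segs, 0)       # first two segments become one
glued_all = glue(segs, 0, 10)    # keeps gluing until the end of the text
```

Because the boundary pattern treats any run of whitespace as a break,
a paragraph break is just another boundary here, so gluing across "faux"
paragraphs works the same way as gluing within one.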
Unfortunately, in Trados/Wordfast-type clients, the program sometimes
refuses to glue at all at certain points in the text, for no visible
reason. I guess this might be due to "interference" from MSWord's
proprietary formatting markers etc. As for gluing across paragraphs, I
don't know of any CAT tool that allows this.
None of this, of course, will stop other translators using different
segment boundaries (like paragraph markers) if they so wish. They will have
just as much choice and control as I do.
Fuzzies
======
I support Henri when he says translators probably don't give a damn if a
fuzzy match turns out to be nonsense. What I do give a damn about is having
these fuzzy matches more or less forced on me. This is what every CAT tool
I've ever tried tends to do. They preset the "translated" window to the
fuzzy match. Now believe me, when you've been translating repetitive stuff
for a couple of hours and fatigue starts setting in, when you look at a
fuzzy match -- providing you've remained alert enough to recognize it as
such (yes I know it's displayed in a different colour but fatigue is
fatigue...) -- you're sometimes hard-pressed to spot exactly how fuzzy the
match is in semantic terms rather than in % terms. (I've seen 98% matches
that contain fatal flaws and 75% matches that are just about spot-on.) In
circumstances like these, a bad fuzzy can easily get past your guard.
I'm not sure what the solution to this problem is. Maybe the program should
display the original-language half of the fuzzy segment right up alongside
the new text to be translated so that the differences can more easily be
seen. I'm convinced that comparing two texts in one language is much less
fraught with difficulties than comparing one text in two languages. In any
case, if the program really does insist on presetting the "translated"
window to the fuzzy match, it should at least give me the opportunity to
banish it to the pits of doom with a single keystroke, leaving me a virgin
window to work with.
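One way to picture the side-by-side idea above is a word-level comparison
of the two source-language texts. The sketch below uses Python's standard
`difflib`; the function name `source_diff` and the example sentences are
hypothetical, and a real tool would presumably highlight the differences
on screen rather than return a list.

```python
import difflib


def source_diff(new_source, tm_source):
    """List word-level differences between the source half of a fuzzy
    match and the new text to be translated (same language on both sides).
    """
    old_words = tm_source.split()
    new_words = new_source.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # (kind of change, words in the stored match, words in the new text)
            changes.append((tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


# Hypothetical example: a high-percentage match with a fatal flaw.
tm_source = "Turn the valve clockwise until it stops."
new_source = "Turn the valve anticlockwise until it stops."
changes = source_diff(new_source, tm_source)
```

Here only one word differs, so the match percentage would be high, yet
that single word reverses the instruction. Showing just the changed words
makes this kind of flaw much harder to miss than a percentage and a colour.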
Talking about fuzzies, a feature I'd like to see dropped -- or at least
carefully rethought as regards default settings -- is "pre-translation". I
never use it myself and can't see why anyone would want to use it, but
among the agent part of my clientele, there are some customers who insist
on using it. The problem is that it's a dangerous weapon in their hands
because they do not have the technical nous to configure it correctly.
Imagine a pre-translated file full of fuzzy-matched part numbers and
you'll see what I mean.
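As an illustration of what more careful default settings might look like,
here is a small Python sketch of a guard for pre-translation. Everything in
it is an assumption: the function `safe_to_pretranslate`, the default
threshold of 1.0 (exact matches only) and the rule that any token
containing a digit counts as a number or part number. It is not how any
existing CAT tool works.

```python
import re

# Assumption: any whitespace-delimited token containing a digit is treated
# as a number or part number.
NUMERIC_TOKEN = re.compile(r"\S*\d\S*")


def numeric_tokens(text):
    """Return the numeric tokens of a segment, minus surrounding punctuation."""
    return sorted(t.strip(".,;:()") for t in NUMERIC_TOKEN.findall(text))


def safe_to_pretranslate(new_source, tm_source, score, min_score=1.0):
    """Hypothetical guard: accept a match for pre-translation only if it
    reaches `min_score` AND carries exactly the same numeric tokens.

    The default `min_score` of 1.0 means fuzzy matches are never inserted
    unless someone deliberately lowers the threshold.
    """
    if score < min_score:
        return False
    return numeric_tokens(new_source) == numeric_tokens(tm_source)


# Hypothetical segments: same wording, different part number.
rejected = safe_to_pretranslate(
    "Replace gasket P/N 4471-B.", "Replace gasket P/N 4417-B.",
    score=0.98, min_score=0.75,
)
# Different wording, same part number.
accepted = safe_to_pretranslate(
    "Fit gasket P/N 4471-B.", "Replace gasket P/N 4471-B.",
    score=0.80, min_score=0.75,
)
```

With these assumptions, a 98% match whose part number differs is refused
even when the threshold has been lowered to 75%, while a lower-scoring
match with identical numbers is let through.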
That's about it for now.
Dave