freecats-dev

[Freecats-Dev] Segmenting and fuzzy matching from a user's point of view


From: Dave Simons
Subject: [Freecats-Dev] Segmenting and fuzzy matching from a user's point of view
Date: Thu, 27 Feb 2003 10:56:16 +0100

Hello folks,

Some things I'm going to say here might be considered premature detail but at least they will have been said.
Please file the comments away as you think fit :-)

(pause while Dave puts his translator's hat on)

Segmenting.
=========

Many of the texts I translate are definitions of very precise, point-by-point procedures; they do not consist of eloquent descriptions. These and lots of other types of document (keyword lists, help texts, etc.) contain standalone segments which are very short -- so short that in a lot of cases they can't even be parsed. At the other end of the segmentation spectrum, due to the way some authors lay their documents out (I'm not saying it's the best way but they do have a right to their quirks) I find it annoying when I'm forced to end my segment at the end of a paragraph, because some of these are in fact "faux" paragraphs.

That's why it's essential for me to have maximum flexibility and to keep control as far as segment definition goes. I nearly always choose the smallest possible segment boundary (the smallest reasonable delimiter being ":" and/or ";") then glue segments together when this breaks down. Having done that, I want to be able to continue expanding the segment indefinitely, repeatedly gluing the next one along the line right till the end of the document if necessary! I've never needed to go to this extreme, but I'm just making the point that I don't want ANYTHING to limit me, whether the program designers think it's good for my health or not. I, the translator, must be the one who has the final choice and I'm adamant about that.
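The split-small-then-glue workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the delimiter set, the `split_segments` and `glue` names, and the choice to join glued pieces with a single space are all hypothetical and not part of any existing CAT tool.

```python
import re

# Assumption: the smallest boundaries are sentence-final punctuation plus ":" and ";",
# each followed by whitespace (which also covers paragraph breaks).
BOUNDARY = re.compile(r"(?<=[.!?:;])\s+")

def split_segments(text):
    """Split text at the smallest boundaries, dropping empty pieces."""
    return [s for s in BOUNDARY.split(text.strip()) if s]

def glue(segments, index, count=1):
    """Merge segment `index` with the next `count` segments.

    There is deliberately no limit other than the end of the document,
    so the translator can keep expanding a segment as far as needed.
    """
    end = min(index + 1 + count, len(segments))
    merged = " ".join(segments[index:end])
    return segments[:index] + [merged] + segments[end:]

segments = split_segments("Open the valve: turn left; wait. Close it.")
# segments == ['Open the valve:', 'turn left;', 'wait.', 'Close it.']

glued = glue(segments, 0, count=2)
# glued == ['Open the valve: turn left; wait.', 'Close it.']
```

Because the boundary pattern treats a paragraph break like any other whitespace, gluing across "faux" paragraphs needs no special case in this sketch; a real tool would presumably have to remember the original break so that it can be restored in the target document.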

Unfortunately, sometimes, in Trados/Wordfast-type clients, without there being a visible reason, the program doesn't allow gluing at all at certain points in the text. I guess this might be due to "interference" from MS Word's proprietary formatting markers etc. As for gluing across paragraphs, I don't know of any CAT tool that allows this.

None of this, of course, will stop other translators using different segment boundaries (like paragraph markers) if they so wish. They will have just as much choice and control as I do.

Fuzzies
======

I support Henri when he says translators probably don't give a damn if a fuzzy match turns out to be nonsense. What I do give a damn about is having these fuzzy matches more or less forced on me. This is what every CAT tool I've ever tried tends to do. They preset the "translated" window to the fuzzy match. Now believe me, when you've been translating repetitive stuff for a couple of hours and fatigue starts setting in, when you look at a fuzzy match -- providing you've remained alert enough to recognize it as such (yes I know it's displayed in a different colour but fatigue is fatigue...) -- you're sometimes hard-pressed to spot exactly how fuzzy the match is in semantic terms rather than in % terms. (I've seen 98% matches that contain fatal flaws and 75% matches that are just about spot-on.) In circumstances like these, a bad fuzzy can easily get past your guard.

I'm not sure what the solution to this problem is. Maybe the program should display the original-language half of the fuzzy segment right up alongside the new text to be translated so that the differences can more easily be seen. I'm convinced that comparing two texts in one language is much less fraught with difficulties than comparing one text in two languages. In any case, if the program really does insist on presetting the "translated" window to the fuzzy match, it should at least give me the opportunity to banish it to the pits of doom with a single keystroke, leaving me a virgin window to work with.
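One possible way to show the same-language comparison suggested above is a word-level diff between the stored source segment and the new source segment. The sketch below uses Python's standard `difflib` module; the function names and the `[-...-]` / `{+...+}` markers are made up for illustration, and a real client would more likely use colour than brackets.

```python
import difflib

def source_diff(old_source, new_source):
    """Mark word-level differences between the stored source segment
    and the new source segment (both in the same language)."""
    old_words, new_words = old_source.split(), new_source.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
            continue
        if i1 != i2:  # words present only in the stored segment
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j1 != j2:  # words present only in the new segment
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)

def similarity(old_source, new_source):
    """Word-level similarity ratio between 0.0 and 1.0."""
    return difflib.SequenceMatcher(
        a=old_source.split(), b=new_source.split(), autojunk=False
    ).ratio()

stored = "Do not open the valve"
new = "Do open the valve"
marked = source_diff(stored, new)
# marked == 'Do [-not-] open the valve'
score = similarity(stored, new)
# score is 8/9, roughly 0.89, although the meaning is reversed
```

The example pair scores high on percentage while differing in exactly the word that matters, which is the situation where a highlighted source-side difference is easier to catch than a subtly wrong target text.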

Talking about fuzzies, a feature I'd like to see dropped -- or at least carefully rethought as regards default settings -- is "pre-translation". I never use it myself and can't see why anyone would want to use it, but among the agent part of my clientele, there are some customers who insist on using it. The problem is that it's a dangerous weapon in their hands because they do not have the technical nous to configure it correctly. Imagine a pre-translated file full of fuzzy-matched part numbers and you'll see what I mean.
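As one idea for a safer default, pre-translation could refuse to insert a fuzzy match whenever the numbers in the two source segments differ. The following is a minimal sketch only; the assumption that a "part number" is any token containing a digit, and the names `number_tokens` and `safe_to_pretranslate`, are hypothetical.

```python
import re

# Assumption: a "part number" is any run of word characters and hyphens
# that contains at least one digit, e.g. "AB-1234" or "42".
NUMBER_TOKEN = re.compile(r"\b[\w-]*\d[\w-]*\b")

def number_tokens(text):
    """Return the sorted list of digit-bearing tokens in a segment."""
    return sorted(NUMBER_TOKEN.findall(text))

def safe_to_pretranslate(new_source, stored_source):
    """Allow automatic insertion only if both source segments
    contain exactly the same digit-bearing tokens."""
    return number_tokens(new_source) == number_tokens(stored_source)

ok = safe_to_pretranslate("Replace gasket AB-1234", "Replace the gasket AB-1234")
# ok is True: the wording differs but the part number is identical

bad = safe_to_pretranslate("Replace gasket AB-1234", "Replace gasket AB-1235")
# bad is False: a one-digit difference blocks pre-translation
```

A check like this says nothing about the rest of the segment; it only keeps the most obviously dangerous fuzzies out of a pre-translated file.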

That's about it for now.

Dave




