Re: [Freecats-Dev] KBabel option - what's at stake


From: Stanislav Visnovsky
Subject: Re: [Freecats-Dev] KBabel option - what's at stake
Date: Thu, 20 Feb 2003 17:27:26 +0100 (CET)

Hi!

On Wed, 19 Feb 2003, Henri Chorand wrote:

> > > The things to consider are:
> > >
> > > 1) .PO files
> > > (...)
> >
> > KBabel as of version 1.2 will support import/export filters, ATM we open
> > PO files and Qt Linguist files.
> >
> > BTW, a PO file is a standard format of GNU gettext and de facto standard
> > for translation of Linux software. In the open source world I'm aware only of
> > two other formats: Mozilla and OpenOffice. You can even use PO files to
> > translate PHP (I've never done this myself though).
>
> This is fine to know. At this stage, I'm confident about KBabel's ability to
> process all these Linux-specific (sorry for lacking a better term) resource
> files.
>
> For me, in the progress curve from KBabel's present features to our dream
> product, the most sensitive issue is to make it able to handle documents in
> a tagged format (XML structure tags, HTML layout tags).

This really depends on what the server would provide. Otherwise KBabel
will probably use the po4a project for this.

>
> > > 2) Server API & connection
> > > Even if I wrote out an API which certainly looks very close to what we
> > > will end up with, we have not accurately stated the dialog between
> > > client & server.
> > > I'm presently thinking about it and I'll try to provide a detailed
> > > document within a few weeks. Basically, seeing what we may be able to
> > > optimize and the low number of connected clients (as compared with an
> > > average HTTP server), I believe it will be a (simple) connected mode.
> >
> > It sounds reasonable. Let's start with simpler methods and then introduce
> > advanced features.
>
> We might start from the existing Berkeley plug-in:
> - assessing its features
> - adding HTTP access (connection management & read/update queries in
> connected mode)
> - bringing improvements

Sad news: it seems that the major Linux distributions have started to drop
support for DB II, so the KBabel plugin rewrite is very near :-(

> Stanislav, I see us as being able to provide good guidelines very soon, but
> this area will obviously deserve to be refined over time. In my DB Indexing
> document, I have given an example of a system in which we can quickly
> perform advanced testing by changing parameter values used by the indexing
> functions.

Where can I access the document?

>
> > ATM, we have the following plugins:
> > 1. Translation memory based on Berkeley Database II
> >    - supports storing the translations on-the-fly, but without
> >      possibility to control what goes in and what does not
>
> You mean that you send update queries and don't control the result (data
> written or not) afterwards?
>
> >    - retrieving exact translation and also single word translation works
>
> Right, these are perfect matches in our jargon.
>
> If you wrote a specifications document, I'll be happy to review it, I'm sure
> I'll learn from it - and I'll stop asking "detailed" questions before
> grasping the whole picture more clearly.

I can write down some text over the weekend about KBabel architecture
with a focus on dictionary modules.

>
> > 2. PO compendium
> >    - retrieving exact/partial translations from compendium
>
> If I understand you well, compendium is the name you give to a set (catalogue)
> of reference source segments (a translation memory)?
> So partial here means fuzzy matching (retrieving more or less similar,
> already translated source segments and using their translation to more
> quickly enter the proper translation of current source segment)

Yes, but these are typically generated for a project, for example KDE and
are stored on the KDE server.
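
A minimal sketch of exact and partial (fuzzy) retrieval from a compendium, in Python rather than KBabel's own code. The `COMPENDIUM` dictionary, the `lookup` function, the sample strings and the 0.7 threshold are all assumptions made for illustration. Similarity here comes from the standard `difflib` module, which is not necessarily the measure KBabel uses.

```python
from difflib import SequenceMatcher

# Hypothetical compendium: source message -> translation (sample data only).
COMPENDIUM = {
    "Open file": "Otvoriť súbor",
    "Save file": "Uložiť súbor",
    "Close window": "Zavrieť okno",
}


def lookup(source, compendium, threshold=0.7):
    """Return (score, matched_source, translation) tuples, best match first.

    A score of 1.0 is an exact match; lower scores are fuzzy matches.
    Entries scoring below `threshold` are dropped.
    """
    results = []
    for known, translation in compendium.items():
        score = SequenceMatcher(None, source, known).ratio()
        if score >= threshold:
            results.append((score, known, translation))
    return sorted(results, reverse=True)


# "Open files" is not in the compendium, but "Open file" is close enough
# to be offered as a starting point for the translator.
matches = lookup("Open files", COMPENDIUM)
```

This scans every entry, which is fine for a small sample but would need an index for a compendium the size of a whole project.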

>
> > 3. Auxiliary PO file
> >    - retrieve exact translations from other PO - used for similar
> >      languages, e.g. translating to Slovak could use this module
> >      to initialize the translation from Czech (it's not very ideal,
> >      but can help a lot).
> >    - This plugin is typically used for _searching_ the translation,
> >      not for automatic initialization.
>
> I'm not sure we would use it, but if it helps others, why not. I'm not sure
> we'll bring new things in this area.

Typically, the application may already be translated into other languages.
If you encounter a text without a clear meaning (context missing, new
terms), you can easily check other languages. I know Slovak translators use
Czech and vice versa, and Swedish translators sometimes use Danish and vice
versa.

> > > - Does it recycle (remember) existing translations for a given source &
> > > target language pair, and if yes, how?
> >
> > It's done by Translation memory. ATM, you can have it store the
> > translation (language, file, original message, translated message) on every
> > change, or manually let the database read a file.
>
> We are not accustomed to keeping track of such a reference to a specific
> file, but we understand the obvious reasons why you used it. We plan to keep
> track of some ancillary data too.

In fact, this is used for "diffing". In the free software world, the texts
change rapidly. If a developer only corrects the text somehow (fixes
punctuation, clarifies it etc.), the diff feature of KBabel can look up the
previous text in the database and mark up the changes.
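
The marking-up step can be sketched as follows. This is an illustration in Python, not KBabel's implementation: the `mark_changes` function and the `[-...-]` / `{+...+}` marker syntax are arbitrary choices for the example, and the previous text is assumed to have already been found in the database.

```python
from difflib import SequenceMatcher


def mark_changes(old, new):
    """Return `new` with word-level changes relative to `old` marked up.

    Removed words are wrapped in [-...-], added words in {+...+}.
    """
    old_words, new_words = old.split(), new.split()
    out = []
    matcher = SequenceMatcher(None, old_words, new_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(new_words[j1:j2])
            continue
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


# The developer only added a full stop, so the translator sees at a glance
# that the existing translation needs just a punctuation fix.
previous = "Cannot open the file"
current = "Cannot open the file."
marked = mark_changes(previous, current)
print(marked)  # Cannot open the [-file-] {+file.+}
```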

> > > Let me also know if you can easily access our specification
> > > documents which you can find as attachments in the mailing
> > > list archive, or if you prefer me to send them to your e-mail
> > > address.
> >
> > It seems quite easy to find them. Maybe it's time to move them
> > outside of the mailing list archive and put on the web page with
> > a notice that it's work in progress.
>
> Sure. Ideally, I need a few days to split the first one into chapters,
> quickly review some of them and provide them as separate HTML files.
>
> Simos, could you please handle the document publishing aspect on Savannah?

That would be great.

>
> > (...)
> > Yes, it is. I would be happy if you could test KDE 3.1 (contains KBabel
> > 1.0 with a lot of enhancements, but mostly editor-wise, not for plugins).
> >
> > All the latest releases are available on the KBabel homepage in source
> > form.
>
> Fine. I just saw that KDE 3.1 is considered stable.
> I hope Red Hat makes it available via RPMs, as for now, I've only installed
> a couple of small things on my box, and KDE is a large project.

In fact, you should need only the kdelibs and kdesdk packages. But beware,
Red Hat typically misses the DB 2 module.

>
> > > (...)
> > > So, I would like to know:
> > > - How close or how far KBabel is from being able to be ported
> > > on Win32 and Mac OS X platforms
> >
> > Mac OS X support is pretty close, since an effort to port KDE to Mac OS X
> > is progressing nicely. Win32 support is a problem, since KBabel relies
> > heavily on the KDE libraries.
>
> I was not clear here. We're not asking for a full KDE port - how would I
> dare ;-)  - only KBabel.

This means porting kdelibs and KBabel. Porting kdelibs is the hardest
part of any KDE port, so the effort is quite similar :-(

> I also want to clearly state this only is a query about a state of things.
> It does not mean we require anything here.
> For us, Win32 & Mac OS X support will obviously be a major goal. For now,
> one may find a way via a double boot configuration, Virtual PC or whatever
> else.
> Maybe ENSTB team could work on porting KBabel to Windows if we choose this
> option (starting from KBabel and closely cooperate in order to develop new
> features on it).

I would suggest adapting PoEdit if the author agrees (see your post from today).

>
> I think we should give you a little time to:
> - assess the effort associated with porting KBabel to Win32 / Mac OS X
> - determine how much of this effort you can/cannot provide

Almost none of it. Other teams are working on this thing, see www.kde.org.

> - see how KDE team could take care of NOT breaking this portability for all
> future KBabel upgrades

I can try to keep KBabel compatible with current KDElibs as long as
possible. This way you don't need to upgrade the KDE libraries.

> > It's portable, but we have identified the following problem which haunts us
> > pretty hard: it is not source/binary compatible between major versions. So
> > if the application is developed for version 2 (as is the KBabel plugin),
> > one needs to adapt it for different versions, which requires a database
> > rebuild. We plan to rewrite the module using generic SQL and allow it to
> > connect to any SQL database (there is also SQLite, which is nice for
> > personal use without too much hassle).
>
> A SQL-type database is handy in many ways, but from our point of view, it
> might not bring that much help. Have you considered using the filesystem and
> a number of flat files for a TM server, or do you consider this option as
> highly exotic/risky/bad for whatever reason?

It simply does not scale well. The KDE project itself contains circa 50,000
messages for the GUI, and roughly the same amount for documentation.

Now add version control and you start to see files of a few MB that must
be accessed fast. A typical DB II database for a decent translator takes
about 10 MB of disk space.

On the other hand, you can more easily create "strange" indexes for fast
fuzzy matching.
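
One possible reading of such a "strange" index is an inverted index over character trigrams, sketched below in Python. This is an assumption about what such an index could look like, not a description of anything KBabel or Free CATS implements. The `TrigramIndex` class, the padding scheme and the 0.5 threshold are hypothetical.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of character trigrams of a padded, lower-cased string."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class TrigramIndex:
    """Inverted index from trigram to the ids of messages containing it."""

    def __init__(self):
        self.messages = []
        self.postings = defaultdict(set)

    def add(self, message):
        msg_id = len(self.messages)
        self.messages.append(message)
        for gram in trigrams(message):
            self.postings[gram].add(msg_id)

    def search(self, query, min_score=0.5):
        """Return (score, message) pairs ranked by Jaccard similarity."""
        query_grams = trigrams(query)
        # Only messages sharing at least one trigram are ever compared,
        # which is what keeps the lookup fast on a large store.
        candidates = set()
        for gram in query_grams:
            candidates |= self.postings.get(gram, set())
        scored = []
        for msg_id in candidates:
            grams = trigrams(self.messages[msg_id])
            score = len(query_grams & grams) / len(query_grams | grams)
            if score >= min_score:
                scored.append((score, self.messages[msg_id]))
        return sorted(scored, reverse=True)


index = TrigramIndex()
for message in ("Open file", "Save file", "Close window"):
    index.add(message)
results = index.search("Open files")
```

With the sample data, only "Open file" passes the threshold for the query "Open files".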

> > > So, Stanislav, if you believe KBabel is an option for us - if
> > > KBabel team, which you manage, sees the following features
> > > as a Good Thing:
> > > - GUI client portability
> >
> > A bit of a problem. You really need KDE libs, available on Unix-like
> > platforms only ATM (there is a KDE 2.2 port to Windows, but newer
> > KBabel versions need more recent libraries).
>
> I would really appreciate if you can provide a more detailed picture
> (including possibly taking the time to explain things to non-Linux
> programming experts like us).
> This is where it might hurt.

Clearly stated: KBabel works great on a typical Linux system, can be run
on Darwin, and the current version will hardly run on Windows. It is
portable to Unix-like systems only (not to Windows or Mac OS 9).

> > > then I'm sure we'll all want to join and help to:
> > > - Immediately elect KBabel as our translation client of choice for
> > > Free CATS, which scope would be reduced to building up a server
> > > component and helping to enhance KBabel
> >
> > No need to elect something, you should provide as much flexibility as
> > possible.
>
> > > - Build our Free CATS TM server on top of Berkeley DB in close
> > > cooperation with what you already did for KBabel.
> >
> > As I've already mentioned, this is probably not a good option.
>
> Well, if you read "filesystem" instead of "Berkeley DB", would your answer
> be different? (this is my last try at it)

Technically, yes. You can code it very easily. The first version of the
server (0.0.1 :-) can work that way. But you need to clearly define an
"internal" interface to the "storage" part of the server, so the
implementation can be replaced easily.
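
A minimal sketch of such an internal storage interface, in Python. Every name here (`Storage`, `MemoryStorage`, `FileStorage`, `translate`) and the one-JSON-file-per-language layout is hypothetical. Nothing of the kind is specified in the Free CATS documents, and the language code is assumed to be safe to use as a file name.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class Storage(ABC):
    """Internal interface between the server logic and its storage part."""

    @abstractmethod
    def store(self, language, source, translation):
        """Record a translation of `source` into `language`."""

    @abstractmethod
    def fetch(self, language, source):
        """Return the stored translation, or None if there is none."""


class MemoryStorage(Storage):
    """Keeps everything in a dictionary; handy for tests."""

    def __init__(self):
        self._data = {}

    def store(self, language, source, translation):
        self._data[(language, source)] = translation

    def fetch(self, language, source):
        return self._data.get((language, source))


class FileStorage(Storage):
    """Flat files on the filesystem: one JSON file per target language."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, language):
        return os.path.join(self.directory, language + ".json")

    def _load(self, language):
        try:
            with open(self._path(language), encoding="utf-8") as handle:
                return json.load(handle)
        except FileNotFoundError:
            return {}

    def store(self, language, source, translation):
        entries = self._load(language)
        entries[source] = translation
        with open(self._path(language), "w", encoding="utf-8") as handle:
            json.dump(entries, handle, ensure_ascii=False)

    def fetch(self, language, source):
        return self._load(language).get(source)


def translate(storage, language, source):
    """Server-side logic that only ever talks to the Storage interface."""
    return storage.fetch(language, source)


# Either backend can be handed to the same server logic.
with tempfile.TemporaryDirectory() as tmp:
    for backend in (MemoryStorage(), FileStorage(tmp)):
        backend.store("sk", "Open file", "Otvoriť súbor")
        print(type(backend).__name__, translate(backend, "sk", "Open file"))
```

Because `translate` depends only on `Storage`, a later SQL-backed class could replace `FileStorage` without touching the rest of the server.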

>
> > > While few of us can directly contribute to code (and it also
> > > depends on the language used), we can still do an awful lot
> > > in terms of documentation, interface localization & public
> > > relations (how to make a free software developers'
> > > localization tool into the best tool available for professional
> > > translators & free software localization volunteers alike).
> >
> > But the goals of FreeCATS are so ambitious that it's necessary
> > to take small steps, and KBabel could provide a foundation/
> > replacement for missing parts while developing others.
>
> Well, yes and no. That's the nice thing with any modular architecture.
> Once our TM server works somehow, everybody will be free to develop whatever
> clients (interactive translation, alignment of legacy translation, counting
> & analysis) they want, portable or not.
>
> So, if you add a plugin to KBabel that enables it to use a Free CATS TM
> server, it's very nice to know, very useful for tests and so on.

That is my intention.

>
> All in all, at least at first sight, and apart from the portability issues,
> there seems to be a strong closeness between what KBabel presently provides
> and the kind of interactive translation client we would like to see. At
> least, this was what I thought this morning :-)
>
> We are not presently able to build a GUI client to our Free CATS server
> soon, so it's tempting to select one of the most similar (or less different)
> free software projects available and to see if we can make it fit into our
> plans - OF COURSE, only if, and as long as it's fit; if it's reasonably in
> line with the original project's own goals.
>
> There are several options, all of which include building up our TM server.
> I'm considering the translation client here:
>
> 1) we start from scratch, with whatever tools we want, and are happy to see
> other teams developing other clients to use our server if/when they feel
> like it (nice idea, except we lack resources for now)
>
> 2) we start from another project which we consider is close enough, then
> work on our own, at our own pace (possibly slow) - but isn't that forking?

Yes, but most of the projects will be very happy if you opt for (3), I'm
almost sure. The only problem can be a lack of time.

Be sure to also contact the author of PoEdit! If he is active ATM, he will
not mind helping. In fact, it can save him a lot of effort as well.

Stanislav





