
[Help-smalltalk] Fwd: Fwd: Convert csv to xml


From: Paolo Bonzini
Subject: [Help-smalltalk] Fwd: Fwd: Convert csv to xml
Date: Thu, 01 Oct 2009 14:19:37 +0200

---------- Forwarded message ----------
From: *Sabine Emmy Eller* <address@hidden>
Date: Sun, Sep 27, 2009 at 8:22 AM
Subject: Convert csv to xml
To: address@hidden


Good morning!

I am looking for an example/examples of a conversion script from csv to
xml format that I could adapt to what is needed to convert csv tables
(for now three columns) to .kvtml format which is used by Parley, a
vocabulary trainer from the KDE-Educational project.

I have not programmed anything for over 20 years, so let's say I am a
100% beginner who, strangely enough, is able to understand some parts of
what is written when I see it.

As a typical learning-by-doing person I have my problems with manuals :-)

CSV File format:

ID,eng,deu
1,house,Haus
2,cat,Katze
3,dog,Hund
...

In the future the format will become something like:
ID,eng-term,eng-POS,deu-term,deu-POS
1,house,noun,Haus,noun-neuter
2,cat,noun,Katze,noun-female
3,dog,noun,Hund,noun-male

but that will be easy to build by myself once I see how the three-column
part is done.


The source and target languages may vary - it could just as well be
ID,spa,cat
or any other combination - for now we have approx. 15 language
combinations ready for Parley.
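
A minimal sketch of how such a file might be read, so that the language
codes come from the header line and are not hard-coded. It is shown in
Python purely to illustrate the logic; it is not Smalltalk code and would
need porting. The function name `read_vocabulary` is hypothetical, and the
sketch assumes the first column is always the ID and every remaining
header cell is a language code.

```python
import csv
import io


def read_vocabulary(csv_text):
    """Return (language_codes, rows) for CSV text with a header like ID,eng,deu."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    languages = header[1:]  # assumption: everything after the ID column is a language
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        entry_id, words = record[0], record[1:]
        # Pair each word with the language code of its column.
        rows.append((entry_id, dict(zip(languages, words))))
    return languages, rows


sample = "ID,eng,deu\n1,house,Haus\n2,cat,Katze\n3,dog,Hund\n"
languages, rows = read_vocabulary(sample)
print(languages)  # ['eng', 'deu']
print(rows[0])    # ('1', {'eng': 'house', 'deu': 'Haus'})
```

Because the language codes are taken from the header, the same function
would accept a file starting with ID,spa,cat without any change.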

The target file looks like this. I leave the header out of the
conversion: it is built manually with the license data, copyright etc.,
and that can be done in a second step. For now only the vocabulary
conversion is really necessary. The identifier part in this example is
much more than is needed - I will have only the three columns above for
now. The information part can also be built manually since it is
different for each file anyway, so it is not read in.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE kvtml PUBLIC "kvtml2.dtd" "http://edu.kde.org/kvtml/kvtml2.dtd">
<kvtml version="2.0" >
<information>
<generator>Parley 0.9.2</generator>
<title>en-esp-no</title>
<author>Stian Drøbak</author>
<contact>Daikichi.Komusubi -at- gmail.com</contact>
<license>GPLv2+ (GNU General Public License version 2 or later)</license>
<comment>This file is possible because of the following people/sites/teams:

www.lingolex.com
www.spanishpronto.com
for commonly used words
www.spanish.com
for other words.

Alfredo Gonzalez &lt;funeral_rite -at- hotmail.com&gt;
for checking the files for errors, and ispiring to add norwegian
translations.

Daniella Hermosilla
for awakening a small interest in spanish

the KDE team for developing this awesome application called Parley :D

Hopefuly there will be released more at som later date :)</comment>
<date>2009-08-28</date>
<category>Languages</category>
</information>
<identifiers>
<identifier id="0" >
<name>English</name>
<locale>en</locale>
<article>
<singular>
<definite>
<male>the</male>
<female>the</female>
<neutral>the</neutral>
</definite>
<indefinite>
<male>a</male>
<female>a</female>
<neutral>a</neutral>
</indefinite>
</singular>
<plural>
<definite>
<male>the</male>
<female>the</female>
<neutral>the</neutral>
</definite>
<indefinite>
<male>a</male>
<female>a</female>
<neutral>a</neutral>
</indefinite>
</plural>
</article>
<personalpronouns>
<singular>
<firstperson>I</firstperson>
<secondperson>you</secondperson>
<thirdpersonneutralcommon>he/she/it</thirdpersonneutralcommon>
</singular>
<plural>
<firstperson>we</firstperson>
<secondperson>you</secondperson>
<thirdpersonneutralcommon>they</thirdpersonneutralcommon>
</plural>
</personalpronouns>
<tense>Simple Present</tense>
<tense>Simple Past</tense>
<tense>Simple Future (will)</tense>
<tense>Present Perfect</tense>
<tense>Past Perfect</tense>
<tense>Future Perfect</tense>
<tense>Present Progressive</tense>
<tense>Past Progressive</tense>
<tense>Future Progressive</tense>
<tense>Present Perfect Progressive</tense>
<tense>Past Perfect Progressive</tense>
<tense>Future Perfect Progressive</tense>
</identifier>
<identifier id="1" >
<name>Spanish</name>
<locale>es</locale>
<article>
<singular>
<definite>
<male>el</male>
<female>la</female>
<neutral>lo</neutral>
</definite>
<indefinite>
<male>un</male>
<female>una</female>
</indefinite>
</singular>
<plural>
<definite>
<male>los</male>
<female>las</female>
</definite>
<indefinite>
<male>unos</male>
<female>unas</female>
</indefinite>
</plural>
</article>
<personalpronouns>
<singular>
<firstperson>yo</firstperson>
<secondperson>tú</secondperson>
<thirdpersonneutralcommon>él/ella</thirdpersonneutralcommon>
</singular>
<plural>
<firstperson>nosotro/as</firstperson>
<secondperson>vosotros/as</secondperson>
<thirdpersonneutralcommon>ellos/as</thirdpersonneutralcommon>
</plural>
</personalpronouns>
<tense>Pretérito perfecto</tense>
<tense>Imperativo</tense>
<tense>Pretérito indefinido</tense>
<tense>Gerundio</tense>
<tense>Subjuntivo presente</tense>
<tense>Presente</tense>
<tense>Futuro imperfecto</tense>
<tense>Condicional simple</tense>
<tense>Participio</tense>
</identifier>
<identifier id="2" >
<name>Norwegian Bokmål</name>
<locale>nb</locale>
</identifier>
</identifiers>
<entries>
<entry id="0" >
<translation id="0" >
<text>Pharmacy</text>
</translation>
<translation id="1" >
<text>la Farmacia</text>
</translation>
<translation id="2" >
<text>Apotek</text>
</translation>
</entry>
<entry id="1" >
<translation id="0" >
<text>Computer</text>
</translation>
<translation id="1" >
<text>la Computadora</text>
</translation>
<translation id="2" >
<text>Datamaskin</text>
</translation>
</entry>
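
A minimal sketch of how the three-column CSV above might be turned into an
<entries> block like this one. It is written in Python only to show the
shape of the conversion and would have to be ported to Smalltalk. The
function name `csv_to_entries` is hypothetical. Two assumptions are made
from the sample: entries are renumbered from 0 regardless of the CSV ID
column, and each translation id is the position of its language column.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_entries(csv_text):
    """Build an <entries> element from CSV text with a header like ID,eng,deu."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header line
    entries = ET.Element("entries")
    entry_id = 0  # assumption: kvtml entry ids start at 0, as in the sample
    for record in reader:
        if not record:
            continue  # skip blank lines
        entry = ET.SubElement(entries, "entry", id=str(entry_id))
        # Column 0 is the CSV ID; the remaining columns are the translations.
        for column, word in enumerate(record[1:]):
            translation = ET.SubElement(entry, "translation", id=str(column))
            ET.SubElement(translation, "text").text = word
        entry_id += 1
    return entries


sample = "ID,eng,deu\n1,house,Haus\n2,cat,Katze\n3,dog,Hund\n"
entries = csv_to_entries(sample)
print(ET.tostring(entries, encoding="unicode"))
```

The output is a single unindented line. Building the tree with an XML
library and not by gluing strings together means characters such as & and
< inside a word are escaped automatically.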

It is really important for me to understand these things because I will
need them over and over again. I will also need Smalltalk for another
piece of software that is being developed, again to convert data, but
there from a database to an XML format like TMX or TBX, or between two
different XML formats.

For now I will go on reading some free books about Smalltalk that I found
online, as well as the documentation. It would be really great if you
could point me to examples that are similar to what I need to do, or to
pages where I can find code examples (I found some on the Smalltalk
homepage, but I really do need some more).

Or maybe one of you is crazy enough to help us get this first script
working? That would be great.

Thank you in advance!

Sabine

*****
Sabine Emmy Eller
CCO-Vox Humanitatis
s.eller [at] voxhumanitatis [dot] org
skype: sabinecretella




