koha-devel

Re: [Koha-devel] Re: MARC character encoding


From: paul POULAIN
Subject: Re: [Koha-devel] Re: MARC character encoding
Date: Mon Jan 20 08:46:02 2003
User-agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.1) Gecko/20020826

Ed Summers wrote:
Hi Paul:
On Wed, Jan 15, 2003 at 03:20:47PM +0100, paul POULAIN wrote:
  
Could someone explain how to translate the "MARC21" charset to a more 
convenient one (and which is more convenient ?)
Same question for UNIMARC (which is ISO646 if my docs are right)
    
If we lived in a perfect world we would all be using Unicode (UTF8)
since it covers so many of the world's scripts [1]. Unfortunately the
world is not perfect. MARC has been around longer than Unicode, so the
MARC-8 character encoding was created to allow non-Latin scripts to live in MARC 
records. I guess the world has bigger problems than character encodings
(Mr George Bush comes to mind), but I'll leave that particular problem
alone :)
50% of the news here is about Mr George Bush.
Unfortunately for me, the other 50% is NOT about character encoding :-)))
I wasn't aware that UNIMARC had defined a different standard for
character encoding. Isn't ISO646 just a synonym for ASCII? [2] Which docs 
describe the character sets used in UNIMARC?
No, you're right. ISO646 IS ASCII.
What I don't understand is how they encode characters above 127 using 2 bytes.
For example, \xc3\x65 = ê
That's not ASCII, is it?
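
The \xc3\x65 = ê example fits a scheme in which a non-spacing diacritic byte comes before the ASCII letter it modifies. The sketch below assumes the ISO 5426 / ISO 6937 convention, where 0xC1, 0xC2 and 0xC3 are the non-spacing grave, acute and circumflex; that assumption is not confirmed anywhere in this thread, and the table covers only those three marks. The function name is made up for illustration. Unicode puts the combining mark after the base letter, so the order is swapped before composing.

```python
import unicodedata

# ASSUMPTION: a small subset of non-spacing diacritics, following the
# ISO 5426 / ISO 6937 layout. A real table would be much larger.
NONSPACING = {
    0xC1: "\u0300",  # combining grave accent
    0xC2: "\u0301",  # combining acute accent
    0xC3: "\u0302",  # combining circumflex accent
}


def decode_prefixed_diacritics(raw):
    """Decode bytes where a diacritic byte precedes its ASCII base letter."""
    out = []
    pending = []  # combining marks waiting for their base letter
    for byte in raw:
        if byte in NONSPACING:
            pending.append(NONSPACING[byte])
        elif byte < 0x80:
            # Unicode order is base letter first, then combining marks.
            out.append(chr(byte))
            out.extend(pending)
            pending.clear()
        else:
            raise ValueError("byte 0x%02X is not covered by this sketch" % byte)
    if pending:
        raise ValueError("diacritic without a base letter")
    # NFC composes 'e' + U+0302 into the single code point U+00EA.
    return unicodedata.normalize("NFC", "".join(out))


print(decode_prefixed_diacritics(b"f\xc3\x65te").encode("utf-8"))
```

The result is an ordinary Unicode string, so it can be written out as UTF-8 or any other encoding that contains the characters involved.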

  
I tried MARC-Charset, which seems to translate from "MARC21" to Unicode, 
but I don't know what to do with my Unicode ;-(
    
Yes, MARC::Charset is an implementation of the MARC-8 ==> Unicode
(UTF-8) mappings published by the Library of Congress. [3] In MARC-8
there is a special way of 'escaping' to other character sets (Hebrew,
Cyrillic, East-Asian, etc). 
Is this special way \xc3\x65?
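
MARC-8 switches to another character set with an escape sequence that starts with the ESC byte (0x1B), so one quick check is whether that byte appears at all. The helper below is a minimal sketch with a hypothetical name; it only locates ESC bytes and does not interpret the bytes that follow them. The second sample string is made up for illustration.

```python
ESC = 0x1B  # the byte that opens a character-set escape sequence


def find_escape_positions(raw):
    """Return the offsets of every ESC byte in a byte string."""
    return [i for i, byte in enumerate(raw) if byte == ESC]


# The two bytes from the question contain no ESC byte at all.
print(find_escape_positions(b"\xc3\x65"))

# Illustrative data only: two escape sequences around some text.
print(find_escape_positions(b"abc\x1b(Nxyz\x1b(B"))
```

An empty result for \xc3\x65 suggests that it is not a switch to another character set but an accented letter built from a diacritic byte and a base letter.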

  

  
You mapped all the UNIMARC fields to MARC fields!?! I was under the
impression that this was quite a big undertaking to do completely. Is
your code currently checked into CVS? Having a UNIMARC filter in
MARC::Record (MARC::File::UNIMARC) has been a long term goal. Maybe we
could roll this work into the MARC::Record package?
NO, of course. You're right, this is a BIG job.
I did this only for the few fields/subfields I needed (around 20-25 fields).
It's a 10-20 line script (plus the mapping array) using MARC::Record.
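
The script itself is not shown in this message, so the following is only a sketch of the general idea: a small mapping table drives the conversion, and anything not in the table is set aside. It is written in Python rather than with MARC::Record, a record is simplified to a list of (tag, subfield code, value) tuples, and the three mapping entries are illustrative examples rather than the actual mapping array.

```python
# HYPOTHETICAL mapping: (UNIMARC tag, subfield) -> (MARC21 tag, subfield).
# A usable table would need many more entries and per-field special cases.
UNIMARC_TO_MARC21 = {
    ("010", "a"): ("020", "a"),  # ISBN
    ("200", "a"): ("245", "a"),  # title proper
    ("700", "a"): ("100", "a"),  # personal name, entry element
}


def convert_fields(fields):
    """Split (tag, code, value) tuples into converted and unmapped lists."""
    converted, skipped = [], []
    for tag, code, value in fields:
        target = UNIMARC_TO_MARC21.get((tag, code))
        if target is None:
            skipped.append((tag, code, value))  # no rule for this subfield
        else:
            converted.append((target[0], target[1], value))
    return converted, skipped


unimarc = [
    ("200", "a", "Les Misérables"),
    ("700", "a", "Hugo"),
    ("330", "a", "A summary"),
]
converted, skipped = convert_fields(unimarc)
print(converted)
print(skipped)
```

Returning the unmapped subfields separately makes it easy to see which parts of a record a partial mapping leaves behind.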

- 
Paul POULAIN
Independent consultant in free software
French-language lead for Koha (free ILS, http://www.koha-fr.org)
