Hi Paul:
On Wed, Jan 15, 2003 at 03:20:47PM +0100, paul POULAIN wrote:
Could someone explain how to translate the "MARC21" charset to a more
convenient one (and which is more convenient ?)
Same question for UNIMARC (which is ISO646 if my docs are right)
If we lived in a perfect world we would all be using Unicode (UTF8)
since it covers so many of the world's scripts [1]. Unfortunately the
world is not perfect. MARC has been around longer than Unicode, so the
MARC-8 character encoding was created to allow non-Latin scripts to live
in MARC records. I guess the world has bigger problems than character encodings
(Mr George Bush comes to mind), but I'll leave that particular problem
alone :)
50% of your news here are related to Mr George Bush.
I tried MARC-Charset, which seems to translate from "MARC21" to UNICODE,
but I don't know what to do with my Unicode ;-(
Yes, MARC::Charset is an implementation of the MARC-8 ==> Unicode
(UTF-8) mappings published by the Library of Congress. [3] In MARC-8
there is a special way of 'escaping' to other character sets (Hebrew,
Cyrillic, East-Asian, etc).
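
To make the escaping a bit more concrete, here is a rough Python sketch
(not how MARC::Charset does it internally) that scans a MARC-8 byte string
for escape sequences and reports which character set each one switches to.
It assumes the escape byte 0x1B followed by the intermediate and final
characters listed in the Library of Congress MARC-8 specification; the
function name find_escapes and the table names are made up for this
illustration, and it only locates the escapes, it does not convert anything.

```python
ESC = 0x1B  # MARC-8 escape sequences start with the ESC byte

# Final characters that name a character set (per the LoC MARC-8 spec).
FINAL_CHARSETS = {
    0x31: "EACC (Chinese, Japanese, Korean)",
    0x32: "Basic Hebrew",
    0x33: "Basic Arabic",
    0x34: "Extended Arabic",
    0x42: "Basic Latin (ASCII)",
    0x45: "Extended Latin (ANSEL)",
    0x4E: "Basic Cyrillic",
    0x51: "Extended Cyrillic",
    0x53: "Basic Greek",
}

# Short two-byte escapes: ESC followed directly by one final character.
SINGLE_CHAR_ESCAPES = {
    0x67: "Greek symbols",
    0x62: "Subscripts",
    0x70: "Superscripts",
    0x73: "Basic Latin (ASCII)",
}


def find_escapes(data: bytes):
    """Return (offset, target, charset name) for each escape sequence found."""
    found = []
    i = 0
    while i < len(data):
        if data[i] != ESC:
            i += 1
            continue
        start = i
        i += 1
        # ESC + single final character (always designates G0)
        if i < len(data) and data[i] in SINGLE_CHAR_ESCAPES:
            found.append((start, "G0", SINGLE_CHAR_ESCAPES[data[i]]))
            i += 1
            continue
        target = "G0"
        # '$' marks a multibyte set such as EACC
        if i < len(data) and data[i] == 0x24:
            i += 1
        # '(' or ',' selects G0; ')' or '-' selects G1
        if i < len(data) and data[i] in b"(,)-":
            target = "G1" if data[i] in b")-" else "G0"
            i += 1
        if i < len(data):
            found.append((start, target, FINAL_CHARSETS.get(data[i], "unknown")))
            i += 1
    return found
```

For example, find_escapes(b"abc\x1b(Ndef\x1b(B") reports a switch to Basic
Cyrillic at offset 3 and a switch back to Basic Latin at offset 9.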
Is this special way \xc3\x65?