RE: [Koha-zebra] Re: Unimarc, marc21, Unicode, and MARC::File::XML
From: Tümer Garip
Subject: RE: [Koha-zebra] Re: Unimarc, marc21, Unicode, and MARC::File::XML
Date: Tue, 21 Mar 2006 23:38:20 +0200
I thought I had explained it, but here it is again:
I do not think the method you use is relevant here, but just try
this:
In the test/usmarc folder of the Zebra release version, change zebra.cfg
to read
recordType: grs.xml
In the tabs folder, change marc21.abs to read record.abs.
Use zebraidx to create the database with the single XML record I sent to
you.
Start zebrasrv on the required port.
Use yaz-client:
f @attr 1=1016 book
format xml
show
I see the XML record header saying
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
Further down you'll see UTF-8 characters with the correct hex values, such as
\XC5\X9F
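
As a quick sanity check of those two bytes, here is a minimal sketch in Python (not part of the original test, and the variable names are only illustrative). It shows that C5 9F is one character when read as UTF-8, but two separate characters when read as ISO-8859-1, which is why the header matters to whoever parses the record:

```python
# The two bytes seen in the yaz-client output.
raw = b"\xc5\x9f"

# Read as UTF-8, the pair forms a single character: U+015F (s with cedilla).
as_utf8 = raw.decode("utf-8")

# Read as ISO-8859-1, every byte is a character of its own.
as_latin1 = raw.decode("iso-8859-1")

print(len(as_utf8), [hex(ord(c)) for c in as_utf8])
print(len(as_latin1), [hex(ord(c)) for c in as_latin1])
```

A client that trusts the ISO-8859-1 header will therefore take the second reading and corrupt the text.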
Now stop the server.
Add the line encoding:utf-8 to your zebra.cfg.
Restart the server.
Do the same search and you get
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
Conclusion:
The database does keep the data in UTF-8, as expected.
The server does not know about the database character set or the encoding
of the XML record that was parsed in. Unless the encoding is specifically
set to UTF-8 in zebra.cfg, the server goes ahead and changes the header,
or in fact produces a header of its own, saying ISO-8859-1 while giving
out UTF-8 characters.
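
For anyone who has to consume such records before the server configuration is fixed, the mismatch can be detected on the client side. The following is only a sketch in Python under stated assumptions: the function names and the sample record are made up for illustration, and nothing here is part of Zebra, YAZ or Koha. It treats a record as mislabelled when the header names something other than UTF-8 but the bytes contain non-ASCII data that decodes cleanly as UTF-8.

```python
import re

# Captures the encoding name inside a leading XML declaration.
DECL_RE = re.compile(rb'^(<\?xml[^>]*?encoding=["\'])([A-Za-z0-9._-]+)(["\'])')


def declared_encoding(record: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    m = DECL_RE.match(record)
    return m.group(2).decode("ascii") if m else None


def is_mislabelled_utf8(record: bytes) -> bool:
    """Heuristic: header is not UTF-8, yet the bytes are valid non-ASCII UTF-8."""
    enc = declared_encoding(record)
    if enc is None or enc.lower() in ("utf-8", "utf8"):
        return False
    if all(b < 0x80 for b in record):
        return False  # pure ASCII reads the same under either label
    try:
        record.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8, so the header may be right
    return True


def relabel_as_utf8(record: bytes) -> bytes:
    """Rewrite only the declared encoding; the record data is left untouched."""
    return DECL_RE.sub(rb"\g<1>UTF-8\g<3>", record, count=1)


# Made-up sample record with the header/data mismatch described above.
sample = (b'<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>\n'
          b"<record><title>Ya\xc5\x9f</title></record>")

fixed = relabel_as_utf8(sample) if is_mislabelled_utf8(sample) else sample
print(declared_encoding(sample), "->", declared_encoding(fixed))
```

This is a heuristic only: genuine ISO-8859-1 text can occasionally also be valid UTF-8, so setting the encoding in zebra.cfg remains the real fix.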
I did not ask for any help on this, thanks. I am just clearing up some
issues with Paul's problem.
Tumer
-----Original Message-----
From: Adam Dickmeiss [mailto:address@hidden]
Sent: Tuesday, March 21, 2006 9:58 PM
To: Tümer Garip
Cc: address@hidden
Subject: Re: [Koha-zebra] Re: Unimarc, marc21, Unicode, and
MARC::File::XML
Tümer Garip wrote:
> Hi Adam,
> You seem a bit offended. That was not my intention; frustration
> sometimes makes me use harsh words, and translating them into English
> may make them too harsh.
>
> I do not need to send you any config+examples cause I tested this with
> your default config files. I am attaching an xml record in utf-8
If you're to receive help from me you need to tell me which zebra.cfg
you're using. And show me the record + the way you indexed it (zebraidx
update ?)
>
> Briefly, I had the default configuration files and built zebra with XML
> records. When I noticed the problem I used yaz-client to see what was
> going on. In my log I could see that the data going into zebra was
> encoded in utf-8, while yaz-client was returning XML with headers saying
> iso-8859-1, even though I could actually see the utf-8 characters, as
> they show as hex in yaz-client.
I also need to know what you see, and what you'd expect to see.
/ Adam
> I have retried this procedure just now and it behaves the same. Just by
> adding encoding:UTF-8 to zebra.cfg and restarting the server you get
> the correct header and correct data. Please note that the server has to
> be restarted, but the zebra database does not have to be rebuilt.
>
> Thanks
> Tumer
>
> -----Original Message-----
> From: Adam Dickmeiss [mailto:address@hidden]
> Sent: Tuesday, March 21, 2006 9:00 PM
> To: Tümer Garip
> Cc: address@hidden; address@hidden
> Subject: Re: [Koha-zebra] Re: Unimarc, marc21, Unicode, and
> MARC::File::XML
>
>
> Tümer Garip wrote:
>
>>Hi,
>>
>>This problem, if I understood it correctly, has nothing to do with
>>mysql or perl. It has to do with ZEBRA, unless it has to do with UNIMARC,
>>which I am not familiar with. As you know (Paul), I have a utf-8
>>version working.
>>
>>I had the same problem with records coming from zebra and found out
>>that it does not do what it is supposed to do unless you explicitly
>>set it to utf-8. You have to explicitly put "encoding utf-8" in all
>>your zebra config files, especially zebra.cfg and your .abs file.
>>Otherwise, although the documentation says that the zebra character
>>encoding is automatically set by the XML encoding, it IS NOT.
>
> I can't reproduce this (bug). Care to share a config+example that
> illustrates this (inserts an XML record from Perl in UTF-8)?
>
>
>>Perl sends XML to zebra with encoding utf-8 in the header and utf-8
>>data in it. Zebra saves all the data in utf-8 but returns XML
>>saying encoding iso8859-1 in the header, with utf-8 characters in the
>>data. No module can correct this as it is stupid.
>
> Just need to know when the stupidity starts:-)
>
> / Adam
>
>
>>I corrected the problem by adding encoding:UTF-8 in zebra.cfg,
>>record.abs, sort-string.chr
>>
>>Hope it solves yours,
>>
>>Tumer
>>
>>
>>
>>_______________________________________________
>>Koha-zebra mailing list
>>address@hidden
>>http://lists.nongnu.org/mailman/listinfo/koha-zebra
>>