koha-devel
Re: [Koha-devel] Building zebradb


From: Paul POULAIN
Subject: Re: [Koha-devel] Building zebradb
Date: Wed, 15 Mar 2006 18:18:54 +0100
User-agent: Mozilla Thunderbird 1.0.6-7.2.20060mdk (X11/20050322)

Tümer Garip wrote:
Hi,

Hello Tümer,

We have now put Zebra into production-level systems, so here is some
experience to share.
Building the Zebra database from single records is a veeeeery looong
process (100K records, 150K items).

Best method we found:

1- Change zebra.cfg file to include

iso2079.recordType:grs.marcxml.collection
recordType:grs.xml.collection

If I understand correctly, you now have two types of records in your DB (or two different representations of a record)?

2- Write a script (or hack export.pl) to export all the MARC records as one big
chunk to the correct directory with the extension .iso2079, then make a system
call to "zebraidx -g iso2079 -d <dbnamehere> update records -n".

Could you send us the code for export.pl?

This ensures that Zebra knows it's reading MARC records rather than XML
and builds 100K+ records at zooming speed.
Your ZOOM module always uses the grs.xml filter, while you can update or
reindex any big chunk of the database at any time as long as you have MARC
records.

Great, I think I understand.
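
The bulk-load step from point 2 can be sketched roughly as follows. This is only an illustration in Python, not the export.pl code asked about above: the function names, the file name export.iso2079 and the way records are obtained are all assumptions. The only parts taken from the message are the .iso2079 extension and the zebraidx command line.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def write_bulk_file(records: Iterable[bytes], directory: Path,
                    name: str = "export.iso2079") -> Path:
    """Concatenate raw ISO 2709 records into one file (hypothetical helper)."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / name
    with target.open("wb") as out:
        for raw in records:
            # Each ISO 2709 record already ends with its own record
            # terminator (0x1D), so no extra separator is written.
            out.write(raw)
    return target


def zebraidx_command(dbname: str, directory: Path) -> List[str]:
    """Build the command quoted in point 2 as an argument list."""
    return ["zebraidx", "-g", "iso2079", "-d", dbname,
            "update", str(directory), "-n"]


def reindex(records: Iterable[bytes], dbname: str, directory: Path) -> None:
    """Write the bulk file, then let zebraidx index the whole directory."""
    write_bulk_file(records, directory)
    subprocess.run(zebraidx_command(dbname, directory), check=True)
```

The point of the single file is that zebraidx is started once for the whole export rather than once per record, which is where the speed-up described above comes from.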

3- We are still using the old API, so we read the XML and use
MARC::Record->new_from_xml( $xmldata )
A note here: we did not have to upgrade MARC::Record or MARC::Charset
at all. Any marc created within KOHA is UTF8 and any marc imported into
KOHA (old marc_subfield_tables) was correctly decoded to utf8 with
char_decode of biblio.

Would it be possible to use this zebra.cfg to manage iso2709 through Perl-ZOOM? If so, we could avoid the marc => xml => zoom and zoom => xml => marc transformations.
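
For readers unfamiliar with the XML step in point 3, here is a rough Python sketch of what reading one MARCXML record into fields involves. It is not the MARC::Record code Koha uses. It assumes the record uses the MARC21 slim namespace, which may not match what a given Zebra setup returns, and parse_marcxml is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Assumption: records carry the MARC21 slim namespace.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

Subfields = List[Tuple[str, str]]


def parse_marcxml(xmldata: str) -> Dict[str, List[Subfields]]:
    """Map each datafield tag to the list of its occurrences,
    each occurrence being a list of (subfield code, value) pairs."""
    root = ET.fromstring(xmldata)
    # Accept either a bare <record> or a <collection> wrapping one.
    if root.tag == MARC_NS + "record":
        record = root
    else:
        record = root.find(MARC_NS + "record")
    if record is None:
        raise ValueError("no MARCXML <record> element found")

    fields: Dict[str, List[Subfields]] = {}
    for datafield in record.findall(MARC_NS + "datafield"):
        subfields = [(sf.get("code", ""), sf.text or "")
                     for sf in datafield.findall(MARC_NS + "subfield")]
        fields.setdefault(datafield.get("tag", ""), []).append(subfields)
    return fields
```

Serving ISO 2709 directly, as suggested above, would remove this parsing step on every search result and the matching serialization step on every update.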

4- We modified circ2.pm and the items table to add an item onloan field and
mapped it to the MARC holdings data. Now our OPAC search does not call MySQL
except for the branchname.

Could you send us/me the code too?

5- We average about 2000 updates per day (circulation + cataloguing). ZOOM
searches slow down during a commit operation, but that is acceptable
considering the speed we gain on searches.

6- Zebra behaves very well with searches but is very temperamental with
updates. A queue of updates sometimes crashes the zebraserver. When the
database crashes we cannot save anything, even though we are using shadow
files. I'll be reporting on this issue once we can isolate the problems.

You're definitely a gem too ;-)

--
Paul POULAIN and Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)



