[Koha-devel] Zebra Searching
From: Joshua Ferraro
Subject: [Koha-devel] Zebra Searching
Date: Mon Jun 13 13:59:21 2005
User-agent: Mutt/1.4.1i
Hi everyone,
In case you haven't been following the IRC logs, we've been discussing
Zebra as a potential search engine. From Index Data's website:
Zebra is a high-performance, general-purpose structured text indexing and
retrieval engine. It reads structured records in a variety of input formats
(eg. email, XML, MARC) and allows access to them through exact boolean search
expressions and relevance-ranked free-text queries.
Zebra supports large databases (more than ten gigabytes of data, tens of
millions of records). It supports incremental, safe database updates on live
systems. You can access data stored in Zebra using a variety of Index Data
tools (eg. YAZ and PHP/YAZ) as well as commercial and freeware Z39.50 clients
and toolkits.
http://indexdata.dk/zebra
I've set up a Zebra test site running on LibLime's server. It currently
has access to three Zebra datasets: Nelsonville's 150K records, LibLime's
5 million records (recently donated by sanspach), and Paul Poulain's 13K
records. (Paul is still working out some issues with indexing UNIMARC
records, so stay tuned for that one to work.)
http://liblime.com/zap/advanced.html
Note that search and retrieval are done via the Z39.50 protocol, using
the server that ships with Zebra. Both the index and the server can
be customized based on the kinds of searches you want to perform (the
above site is just a proof of concept) -- we'd have support for relevance
ranking, stemming, the whole gamut of searching technologies.
In all my tests, searches are returned in under a second.
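To make the Z39.50 side a little more concrete, here is a minimal sketch of
how a search form might be turned into a PQF (Prefix Query Format) query
string, the notation YAZ-based clients use to express Z39.50 queries. The
numbers are standard Bib-1 "use" attributes; the function names and the
simplified quoting rules are assumptions made for this illustration, not
anything taken from the test site.

```python
# Bib-1 "use" attribute numbers (Z39.50 Bib-1 attribute set).
BIB1_USE = {
    "title": 4,
    "isbn": 7,
    "subject": 21,
    "author": 1003,
    "any": 1016,
}


def pqf_term(field, term):
    """Return one PQF clause, e.g. @attr 1=4 "zebra" for a title search.

    Hypothetical helper: quoting is simplified, so terms containing a
    double quote or a backslash are rejected rather than escaped.
    """
    if '"' in term or "\\" in term:
        raise ValueError("unsupported character in search term")
    return '@attr 1=%d "%s"' % (BIB1_USE[field], term)


def pqf_and(clauses):
    """Join clauses with @and, which is a binary prefix operator in PQF."""
    clauses = list(clauses)
    if not clauses:
        raise ValueError("at least one clause is required")
    query = clauses[-1]
    # Nest from the right: @and A @and B C
    for clause in reversed(clauses[:-1]):
        query = "@and %s %s" % (clause, query)
    return query


query = pqf_and([pqf_term("title", "zebra"), pqf_term("author", "ferraro")])
print(query)
```

The resulting string would then be handed to whatever Z39.50 client
library is in use; that step is left out here because it depends on the
toolkit chosen.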
If we decide to work with Zebra, we will need to decide what to do with
non-MARC libraries. Should we develop an export utility that will allow
Zebra to index the records (in, say, XML format)? Should we use the Koha
tables to create a basic MARC record for use with Zebra? Should we leave
the Koha 1.x searching methods unchanged and only use Zebra for
MARC libraries? Also, what should we do with the existing marc_*_table
tables?
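As a rough illustration of the "basic MARC record" option, the sketch below
builds a minimal MARCXML record from a handful of bibliographic fields. The
input dictionary, the field mapping, and the placeholder leader are all
assumptions made for the example; a real export would read from the Koha
tables and would need a much fuller mapping.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical mapping from simple field names to MARC 21 tag/subfield.
FIELD_MAP = {
    "isbn": ("020", "a"),
    "author": ("100", "a"),
    "title": ("245", "a"),
}


def to_marcxml(biblio):
    """Serialize a dict of simple fields as one MARCXML <record>."""
    ET.register_namespace("", MARCXML_NS)
    record = ET.Element("{%s}record" % MARCXML_NS)
    leader = ET.SubElement(record, "{%s}leader" % MARCXML_NS)
    # Placeholder 24-character leader; lengths are not computed here.
    leader.text = "00000nam a2200000 a 4500"
    # Emit fields in tag order; skip anything missing or empty.
    for name, (tag, code) in sorted(FIELD_MAP.items(), key=lambda kv: kv[1][0]):
        value = biblio.get(name)
        if not value:
            continue
        # Indicators are left blank for simplicity in this sketch.
        field = ET.SubElement(
            record,
            "{%s}datafield" % MARCXML_NS,
            {"tag": tag, "ind1": " ", "ind2": " "},
        )
        sub = ET.SubElement(field, "{%s}subfield" % MARCXML_NS, {"code": code})
        sub.text = value
    return ET.tostring(record, encoding="unicode")


print(to_marcxml({"title": "Zebra Searching", "author": "Ferraro, Joshua"}))
```

The same function could serve the XML export option as well, since the
output is plain XML; which form Zebra would actually be configured to
index is one of the open questions above.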
So ... it's clearly time to schedule a "Koha 2.4 Searching Group Meeting" on
IRC. I'd like to pick a time when everyone can be represented. How
is Thursday, June 23 at 9:00 GMT? Here's the time in your area:
http://tinyurl.com/925c8
Please let me know on-list if you will not be able to attend and what
time you can attend.
Comments, suggestions, concerns?
--
Joshua Ferraro VENDOR SERVICES FOR OPEN-SOURCE SOFTWARE
President, Technology migration, training, maintenance, support
LibLime Koha ILS, Mambo Intranet, DiscrimiNet Filter
address@hidden | Full Demos at http://liblime.com | 1(888)KohaILS