koha-devel

[Koha-devel] Authorities indexing


From: Galen Charlton
Subject: [Koha-devel] Authorities indexing
Date: Tue, 29 Jan 2008 16:31:14 -0600

Hi,

I have been doing some work with the (Zebra) indexing definitions for
MARC21 authority records.  One of the current problems is that it is
impossible to effectively search on the complete heading if
subdivisions are present.  For example, a topical heading such as

150 $a Medicine $x Ability testing

is currently indexed as two separate phrases, "Medicine" and "Ability
testing".  A search for "medicine ability testing" doesn't work, and
searching on the two phrases separately (he,ext=medicine and
he,ext="ability testing") would also pick up an "ability testing --
medicine" heading if such existed in the authority file.
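
To make the problem concrete, here is a toy model of the two indexing
strategies.  It is only a sketch: the function names are invented for
illustration, and it assumes each subfield (or the whole heading) is
stored as a single lowercased phrase, which simplifies what Zebra
actually does.

```python
# Toy model only: these helpers are hypothetical and do not call Zebra.

def index_per_subfield(subfields):
    """Index every subfield value as its own phrase (current behaviour)."""
    return {value.lower() for _code, value in subfields}


def index_whole_heading(subfields):
    """Index the subfields, in record order, as one complete phrase."""
    return {" ".join(value.lower() for _code, value in subfields)}


def matches_all(index, phrases):
    """True if every searched phrase is present in the index."""
    return all(phrase in index for phrase in phrases)


medicine = [("a", "Medicine"), ("x", "Ability testing")]
reversed_heading = [("a", "Ability testing"), ("x", "Medicine")]
query = ["medicine", "ability testing"]

# Per-subfield phrases cannot tell the two headings apart.
print(matches_all(index_per_subfield(medicine), query))          # True
print(matches_all(index_per_subfield(reversed_heading), query))  # True

# A complete-heading phrase matches only the intended heading.
print("medicine ability testing" in index_whole_heading(medicine))          # True
print("medicine ability testing" in index_whole_heading(reversed_heading))  # False
```

The second pair of checks is the behaviour I am after: the heading is
searchable as a whole, and subfield order matters.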

I have found no way to fix this using Zebra's GRS-1 filter and its elm
and melm directives in records.abs.  Consequently, I have been working
with Zebra's DOM XML filter, which generates index entries by mapping
a MARC XML record to index terms via XSLT, and have been able to get
much better results.

Two patches containing the XSLT can be found at
http://manage-gmc.dev.kohalibrary.com/patches/

Since manually maintaining the whole indexing XSL would be rather
unwieldy, I've created a concise XML representation of the indexing
definitions, authority-koha-indexdefs.xml.  This in turn is mapped via
the XSL transform koha-indexdefs-to-zebra.xsl to produce the XSL to be
used by Zebra, authority-zebra-indexdefs.xsl.
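
The generation step can be sketched roughly as follows.  This is not
the code in the patches: the real transform is itself written in XSL,
and the <index_heading> element and its attributes below are a made-up
stand-in for the definitions syntax.  The z: namespace and the
<z:record>/<z:index> elements are what I understand the DOM filter to
expect, so treat those as assumptions too.

```python
import xml.etree.ElementTree as ET

# Hypothetical definitions syntax, invented for this sketch.
INDEX_DEFS = """
<index_defs>
  <index_heading tag="150" subfields="ax" target="Heading:p"/>
</index_defs>
"""

STYLESHEET_HEAD = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:z="http://indexdata.com/zebra-2.0">
  <xsl:template match="text()"/>
  <xsl:template match="marc:record">
    <z:record><xsl:apply-templates select="marc:datafield"/></z:record>
  </xsl:template>"""


def defs_to_xslt(defs_xml):
    """Expand each concise definition into one XSLT template."""
    templates = []
    for d in ET.fromstring(defs_xml).findall("index_heading"):
        tag = d.get("tag")
        target = d.get("target")
        # One test per subfield code, e.g. @code='a' or @code='x'.
        tests = " or ".join(f"@code='{c}'" for c in d.get("subfields"))
        # Emit the selected subfields, in record order, as a single
        # space-separated term under one index entry.
        templates.append(f"""
  <xsl:template match="marc:datafield[@tag='{tag}']">
    <z:index name="{target}">
      <xsl:for-each select="marc:subfield[{tests}]">
        <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
        <xsl:value-of select="."/>
      </xsl:for-each>
    </z:index>
  </xsl:template>""")
    body = "".join(templates)
    return f"{STYLESHEET_HEAD}{body}\n</xsl:stylesheet>"


xslt = defs_to_xslt(INDEX_DEFS)
ET.fromstring(xslt)  # raises ParseError if the output is not well-formed
print(xslt)
```

The point of the indirection is that one short definition line expands
into a much longer template, so only the concise file needs to be
edited by hand.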

I have not yet made a formal schema for the Koha indexing definitions
XML, but once its syntax has been extended to cover UNIMARC and
bibliographic records, this approach could let all Zebra indexing use
the DOM XML module.  In addition, it opens the possibility of creating
code that allows users to work with the Zebra indexing definitions
from the staff interface -- the definitions could be stored in the
database, then serialized to XML and transformed to the Zebra indexing
XSLT.

Comments requested.

Regards,

Galen
-- 
Galen Charlton
Koha Application Developer
LibLime
address@hidden
p: 1-888-564-2457 x709
