
Re: [Koha-devel] marc_word and searching


From: paul POULAIN
Subject: Re: [Koha-devel] marc_word and searching
Date: Wed May 26 07:40:30 2004
User-agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.6) Gecko/20040115

Stephen Hedges wrote:

At what point does marc_word become so big and clunky that it becomes a
liability instead of an asset?  NPL's marc-word file is full of 'junk'
entries like "(pa." (picked up when an ISBN number has "(pa.)" after it to
denote paperback) and other such MARC oddities.  Our stopword file should
ideally be expanded to catch all of this junk, but I haven't done that
yet.  Now we're talking about adding punctuation marks and single letters!
I agree with Joshua that this is what should be done if we're going to
depend on using marc_word and expect to get any meaningful search results.
My question is:  maybe it would be more efficient to just use
marc_subfield_table for these searches and forget about marc_word?

You're right, Stephen...
I have another idea that could be coded quickly: in the MARC framework, we could add a checkbox called "do NOT index this subfield". If checked, the subfield would not be stored in marc_word (but would still be stored in marc_subfield_table).
(This also needs a script to clean the DB, which should be quite easy:
foreach subfield in marc_subfield_structure {
   if checkbox checked {
      delete from marc_word where subfield = this one
   }
}
...)
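
A minimal sketch of that cleanup script, written in Python against an in-memory SQLite database so that it runs on its own. Everything about the schema here is an assumption made for illustration: the column names (tagfield, tagsubfield, tag, subfieldid) are simplified, and the noindex flag is a hypothetical name for the proposed checkbox. The real Koha tables may differ.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE marc_subfield_structure (
            tagfield    TEXT,
            tagsubfield TEXT,
            noindex     INTEGER DEFAULT 0  -- hypothetical "do NOT index" checkbox
        );
        CREATE TABLE marc_word (
            bibid      INTEGER,
            tag        TEXT,
            subfieldid TEXT,
            word       TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO marc_subfield_structure VALUES (?, ?, ?)",
        [("020", "a", 1), ("245", "a", 0)],  # ISBN flagged, title still indexed
    )
    conn.executemany(
        "INSERT INTO marc_word VALUES (?, ?, ?, ?)",
        [
            (1, "020", "a", "(pa."),
            (1, "020", "a", "0123456789"),
            (1, "245", "a", "koha"),
        ],
    )
    conn.commit()
    return conn


def purge_unindexed_words(conn):
    """Delete marc_word rows whose subfield is flagged noindex.

    Returns the number of rows deleted. This does the per-subfield loop
    from the pseudocode in a single DELETE statement.
    """
    cur = conn.execute(
        """
        DELETE FROM marc_word
        WHERE EXISTS (
            SELECT 1
            FROM marc_subfield_structure AS s
            WHERE s.noindex = 1
              AND s.tagfield = marc_word.tag
              AND s.tagsubfield = marc_word.subfieldid
        )
        """
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    demo = build_demo_db()
    deleted = purge_unindexed_words(demo)
    print("rows removed from marc_word:", deleted)
```

Nothing is deleted from marc_subfield_table, so the flagged subfields remain available for display and for direct searches there.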

--
Paul POULAIN
Independent free software consultant
French-language coordinator for Koha (free ILS, http://www.koha-fr.org)



