[Koha-devel] marc_word and searching


From: Joshua Ferraro
Subject: [Koha-devel] marc_word and searching
Date: Mon May 24 10:44:02 2004
User-agent: Mutt/1.4.1i

Paul et al,

I've been trying to figure out how best to solve our apostrophe (') and
comma (,) problem with the MARC searching, and I have a few comments about
the way the searches are currently done (using marc_word) and about the
problems with how marc_word stores data.

So here's a classic example of an author that fails currently:
o'brian, patrick

Right now the search separates the 'o', the apostrophe, the 'brian' and the
'patrick', and the resulting query looks like this:

select distinct m1.bibid
from biblio, biblioitems, marc_biblio,
     marc_word as m1, marc_word as m2, marc_word as m3, marc_word as m4
where biblio.biblionumber = marc_biblio.biblionumber
  and biblio.biblionumber = biblioitems.biblionumber
  and m1.bibid = marc_biblio.bibid
  and (m1.bibid = m2.bibid and m1.bibid = m3.bibid and m1.bibid = m4.bibid)
  and ((m1.word like 'o%' and m1.tag+m1.subfieldid in ('100a','110a','700a','710a'))
   and (m2.word like '\'%' and m2.tag+m2.subfieldid in ('100a','110a','700a','710a'))
   and (m3.word like 'brian%' and m3.tag+m3.subfieldid in ('100a','110a','700a','710a'))
   and (m4.word like 'patrick%' and m4.tag+m4.subfieldid in ('100a','110a','700a','710a')))
order by biblio.title

So there is at least one major problem with this query (which does not return
any results): marc_word does not store values as small as ' or o.  So of course
there are no results ...
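
To make the failure concrete, here is a toy model in Perl. It is not Koha's
actual indexing code: it assumes the indexer splits on non-alphanumeric
characters and skips one-character words, and the names index_words,
all_terms_match and $MIN_LEN are made up for illustration. Because the query
ANDs together one condition per search term, a single term that was never
indexed empties the whole result.

```perl
use strict;
use warnings;

# Toy stand-in for marc_word: word => { bibid => 1 }.
# ASSUMPTION: the indexer splits on non-alphanumerics and skips
# one-character words; the real cutoff in Koha may differ.
my $MIN_LEN = 2;

sub index_words {
    my ($index, $bibid, $text) = @_;
    for my $word (grep { length } split /[^a-z0-9]+/, lc $text) {
        next if length($word) < $MIN_LEN;
        $index->{$word}{$bibid} = 1;
    }
}

# Mimics the generated SQL: every term must prefix-match ("like 'term%'")
# some indexed word belonging to the same record.
sub all_terms_match {
    my ($index, $bibid, @terms) = @_;
    for my $term (@terms) {
        my $hit = grep { index($_, $term) == 0 && $index->{$_}{$bibid} }
                  keys %$index;
        return 0 unless $hit;
    }
    return 1;
}

my %index;
index_words(\%index, 1, "O'Brian, Patrick");

# The four terms from the query: 'o' and the apostrophe were never indexed.
print all_terms_match(\%index, 1, 'o', "'", 'brian', 'patrick')
    ? "found\n" : "no results\n";

# Without the unindexable terms the same record matches.
print all_terms_match(\%index, 1, 'brian', 'patrick')
    ? "found\n" : "no results\n";
```

Under those assumptions the first lookup prints "no results" and the second
prints "found", which is the behaviour described above.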

Even if I strip the ' and , out of the search string first, by adding the
following after line 117 in SearchMarc.pm:

$value->[$i] =~ s/'/ /g;
$value->[$i] =~ s/,/ /g;

which turns the search string into:

'o brian patrick'

the search still fails ('o' is too small for marc_word); and of course

$value->[$i] =~ s/'//g;
$value->[$i] =~ s/,//g;

resulting in:

'obrian patrick' 

fails too--the data simply isn't stored right for this kind of search.
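
For anyone who wants to reproduce the two rewritten search strings outside
of SearchMarc.pm, a minimal standalone sketch follows. The helper name
normalize is hypothetical, and the whitespace collapsing is an extra step
added here so the output matches the strings quoted above (replacing ", "
with spaces would otherwise leave a double space).

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the two substitutions tried above:
# replace every apostrophe and comma with $replacement.
sub normalize {
    my ($term, $replacement) = @_;
    (my $out = $term) =~ s/[',]/$replacement/g;
    $out =~ s/\s+/ /g;    # collapse the double space left behind by ", "
    return $out;
}

my $author = "o'brian, patrick";
print normalize($author, ' '), "\n";    # o brian patrick
print normalize($author, ''),  "\n";    # obrian patrick
```

Neither string helps on its own; whether it matches still depends on which
words marc_word actually stored.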

So I see two ways to fix this problem: 1) stop using marc_word for these
kinds of searches and use marc_subfield_table (which has the whole
'o'brian, patrick' in subfield_value), or 2) fix the way that marc_word
stores small values (it should store everything, including commas,
apostrophes, and single letters like 'a', 'o', etc.).
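
As a rough illustration of option 1, the sketch below builds a query against
marc_subfield_table with placeholders, so the apostrophe and comma never need
escaping or stripping. Only subfield_value is named in this message; the
bibid, tag and subfieldcode columns and the helper name build_subfield_query
are assumptions and would need checking against the real schema.

```perl
use strict;
use warnings;

# ASSUMPTION: marc_subfield_table has bibid, tag and subfieldcode columns
# next to subfield_value; only subfield_value is confirmed in this thread.
sub build_subfield_query {
    my ($value, @tag_subfields) = @_;    # e.g. '100a', '700a'
    my (@conds, @binds);
    for my $ts (@tag_subfields) {
        # Split '100a' into tag '100' and subfield code 'a'; skip bad input.
        my ($tag, $code) = $ts =~ /^(\d{3})(.)$/ or next;
        push @conds, '(tag = ? and subfieldcode = ?)';
        push @binds, $tag, $code;
    }
    die "no valid tag/subfield given\n" unless @conds;

    my $sql = 'select distinct bibid from marc_subfield_table'
            . ' where subfield_value like ?'
            . ' and (' . join(' or ', @conds) . ')';

    # Prefix match on the whole stored value, punctuation included.
    return ($sql, "$value%", @binds);
}

my ($sql, @binds) =
    build_subfield_query("o'brian, patrick", qw(100a 110a 700a 710a));
print "$sql\n";
print join(' | ', @binds), "\n";
```

With DBI the result would be passed along as $dbh->prepare($sql) followed by
$sth->execute(@binds). The trade-off is that a LIKE on subfield_value matches
the whole stored string from its start rather than individual words, so
'patrick o'brian' would not find a value stored as 'o'brian, patrick'.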

Any comments?  Further suggestions? 

Joshua



