koha-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Koha-devel] new features to test


From: paul POULAIN
Subject: Re: [Koha-devel] new features to test
Date: Mon Mar 22 06:40:05 2004
User-agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.4) Gecko/20030630

Stephen Hedges wrote:

Paul -

I've been testing the changes you made to search.marc/search.pl and
SearchMarc.pm and thought you would like to see some results.

yep, thanks.

First some questions.  The "query" that gets displayed as you are building
the search is confusing.  It always seems to start with "biblioitems.illus
contains ''."  Is that actually correct?  And does the query start by
using the koha fields? Is that where it's finding the tags to search on? (I should have taken a closer look at your code.)

a bug, probably. that's why it's in HEAD branch :-D

Also three general problems.  The marclist seems to be broken, since it is
not available in the pull-down menu (neither in the default template nor
the npl template). And MARCdetail.pl never seems to display any results. And all of the queries are followed in the log with this error:

DBD::mysql::st execute failed: Unknown column 'seealso' in 'field list' at
/usr/local/koha/intranet/modules//C4/Biblio.pm line 235.
[sounds like we need to updatedatabase]

yes ;-)

So now for some search results --

* Search on keyword "dog" (worked OK):
explain select distinct m1.bibid from
biblio,biblioitems,marc_biblio,marc_word as m1 where
biblio.biblionumber=marc_biblio.biblionumber and
biblio.biblionumber=biblioitems.biblionumber and
m1.bibid=marc_biblio.bibid and (m1.word  like 'dog%' and
m1.tag+m1.subfieldid in ('')) order by biblio.title;
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref
             | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m1          | range  | bibid,word           | word     |     255 | NULL
             | 2200 | Using where; Using temporary; Using filesort |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid
             |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 |
marc_biblio.biblionumber | 1 | Distinct |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 |
biblio.biblionumber | 12 | Using index; Distinct |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+

Nice result set : 2200 x 1 x 1 x12 rows to parse. Not too many.

* Search on author "Henry" (worked OK):

<snip>

* Search on author "Sue Henry" order by author (worked _very_ well!)

<snip>

* Search on author "Sue Henry" order by title (worked _very_ well!)

<snip>

* BUT NOTE -- Search on author "Henry, Sue" did NOT work:
select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as
m1,marc_word as m2 where biblio.biblionumber=marc_biblio.biblionumber and
biblio.biblionumber=biblioitems.biblionumber and
m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word  like
'henry,%' and m1.tag+m1.subfieldid in ('100a'))and (m2.word like 'sue%'
and m2.tag+m2.subfieldid in('100a'))) order by biblio.author
[no results returned]
* Search on title "I'll be home for Christmas" (failed):
select distinct m1.bibid from biblio,biblioitems,marc_biblio,marc_word as
m1,marc_word as m2,marc_word as m3 where
biblio.biblionumber=marc_biblio.biblionumber and
biblio.biblionumber=biblioitems.biblionumber and
m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid and m1.bibid=m3.bibid)
and ((m1.word  like 'I'll%' and m1.tag+m1.subfieldid in ('245a'))and
(m2.word like 'home%' and m2.tag+m2.subfieldid in('245a'))and (m3.word
like 'Christmas%' and m3.tag+m3.subfieldid in('245a'))) order by
biblio.title
DBD::mysql::st execute failed: You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the
right syntax to use near 'll%' and m1.tag+m1.subfieldid in ('245a'))and
(m2.word like 'ho
[we need to escape apostrophes]

we need to remove ' and , and other useless characters. enter a bugs.koha.org pls

* Search on barcode (worked very well):
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref
             | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m1          | range  | bibid,word           | word     |     255 | NULL
             |    1 | Using where; Using temporary; Using filesort |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid
             |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 |
marc_biblio.biblionumber | 1 | Distinct |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 |
biblio.biblionumber | 12 | Using index; Distinct |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+

wonderful :1 x 1 x 1 x 12 rows to parse in the DB !!! For a 1Gb DB if your test DB is a copy of your real one ?

* Search ("Q2") on keyword "England" and itemtype "AB" (failed--returned
no results);
explain select distinct m1.bibid from
biblio,biblioitems,marc_biblio,marc_word as m1,marc_word as m2 where
biblio.biblionumber=marc_biblio.biblionumber and
biblio.biblionumber=biblioitems.biblionumber and
m1.bibid=marc_biblio.bibid and (m1.bibid=m2.bibid) and ((m1.word  like
'england%' and m1.tag+m1.subfieldid in (''))and (m2.word like 'AB%' and
m2.tag+m2.subfieldid in('item'))) order by biblio.title;
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| table       | type   | possible_keys        | key      | key_len | ref
             | rows | Extra                                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+
| m1          | range  | bibid,word           | word     |     255 | NULL
             | 5377 | Using where; Using temporary; Using filesort |
| marc_biblio | eq_ref | PRIMARY,biblionumber | PRIMARY  |       8 | m1.bibid
             |    1 | Using where; Distinct                        |
| biblio      | eq_ref | PRIMARY,blbnoidx     | PRIMARY  |       4 |
marc_biblio.biblionumber | 1 | Distinct |
| biblioitems | ref    | bibnoidx             | bibnoidx |       4 |
biblio.biblionumber | 12 | Using index; Distinct |
| m2          | ref    | bibid,word           | bibid    |       8 | m1.bibid
             |   20 | Using where; Distinct                        |
+-------------+--------+----------------------+----------+---------+--------------------------+------+----------------------------------------------+

mmm... this one is important : the marc_word does NOT store words <= 2 letters.

sub MARCaddword {
# split a subfield string and adds it into the word table.
# removes stopwords
my ($dbh,$bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$sentence) address@hidden;
   $sentence =~ s/(\.|\?|\:|\!|\'|,|\-|\"|\(|\)|\[|\]|\{|\})/ /g;
   my @words = split / /,$sentence;
   my $stopwords= C4::Context->stopwords;
my $sth=$dbh->prepare("insert into marc_word (bibid, tag, tagorder, subfieldid, subfieldorder, word, sndx_word)
           values (?,?,?,?,?,?,soundex(?))");
   foreach my $word (@words) {
# we record only words longer than 2 car and not in stopwords hash
   if (length($word)>2 and !($stopwords->{uc($word)})) {
$sth->execute($bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$word,$word);
       if ($sth->err()) {
warn "ERROR ==> insert into marc_word (bibid, tag, tagorder, subfieldid, subfieldorder, word, sndx_word) values ($bibid,$tag,$tagorder,$subfieldid,$subfieldorder,$word,soundex($word))\n";
       }
   }
   }
}

What should we decide here ?
* also store <=2 letters word.
* say that itemtype & other fields must be >2 chars ?

Hope this helps --- Stephen

yes, thanks a lot.

--
Paul POULAIN
Consultant indépendant en logiciels libres
responsable francophone de koha (SIGB libre http://www.koha-fr.org)





reply via email to

[Prev in Thread] Current Thread [Next in Thread]