Re: [Koha-zebra] Koha Zebra Searching Report (from NPL)
From: Joshua Ferraro
Subject: Re: [Koha-zebra] Koha Zebra Searching Report (from NPL)
Date: Wed, 22 Mar 2006 17:54:20 -0800
User-agent: Mutt/1.4.1i
On Wed, Mar 22, 2006 at 08:28:26PM -0500, Sebastian Hammer wrote:
> Can't do XOR today. I suppose it would be a possible new feature, but
> I've frankly never heard of it in an ILS.. can a XOR b be mapped to
>
> (a OR b) NOT (a AND b) ? or am I just showing my fading math skills to
> ill effect, here?
Yep, that's the correct mapping. Voyager's where NPL originally
saw the XOR function.
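
As a rough illustration of that mapping, here is a minimal Python sketch. It assumes the query is sent as PQF, where `@not` means "and-not"; the helper names `xor_to_pqf` and `xor_via_mapping` are made up for this example and are not part of Koha or Zebra.

```python
def xor_to_pqf(a: str, b: str) -> str:
    """Rewrite 'a XOR b' as (a OR b) NOT (a AND b) in PQF prefix notation."""
    return f"@not @or {a} {b} @and {a} {b}"


def xor_via_mapping(a: bool, b: bool) -> bool:
    """Evaluate the same rewrite on plain booleans to check the identity."""
    return (a or b) and not (a and b)


if __name__ == "__main__":
    print(xor_to_pqf("horses", "psychology"))
    # The rewrite agrees with exclusive-or on all four input combinations.
    for a in (False, True):
        for b in (False, True):
            assert xor_via_mapping(a, b) == (a != b)
```

The truth-table loop is only there to confirm that the rewrite and XOR agree; the PQF string is what would actually go to the server.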
> Why do you see yourself limited to Bib-1? Within Koha, you can do
> whatever you want -- specifically extend Bib-1 into the 8000-range
> (IIRC) for local USE attributes or define a private set.
Right, I was just hoping there was some way to map it to bib-1 as
I assume that would be useful in cross-domain searching. If not we
can certainly do a locally defined attribute or set.
> >SPELLCHECKING
> It isn't soundex, but it will behave somewhat the same in many cases.
> Try searching with truncation=Regexp-2 (103). This enables
> error-tolerant searching. By default, one error (insert/delete/replace)
> per term will still lead to a match. More at
> http://www.indexdata.com/zebra/doc/protocol-support.tkl#search
Neat, we'll look into it.
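
To make "one error (insert/delete/replace) per term" concrete, here is a small Python sketch. It is not how Zebra implements Regexp-2 matching, only an illustration of the tolerance described above. `fuzzy_pqf` and `within_one_edit` are hypothetical helper names; the only facts taken as given are that Bib-1 attribute type 5 is truncation and that 103 is Regexp-2.

```python
def fuzzy_pqf(term: str) -> str:
    """Build a PQF term with truncation (Bib-1 type 5) set to Regexp-2 (103)."""
    return f'@attr 5=103 "{term}"'


def within_one_edit(a: str, b: str) -> bool:
    """Return True if a and b differ by at most one insert, delete, or replace."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = j = 0
    edits = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
            continue
        edits += 1
        if edits > 1:
            return False
        if len(a) == len(b):
            i += 1  # replacement: advance in both strings
        j += 1      # insertion in the longer string: advance b only
    # Any character left over at the end of b counts as one more edit.
    return edits + (len(b) - j) <= 1
```

Under this definition "joice" and "joyc" would both still match "joyce", while a transposition such as "jyoce" counts as two edits and would not.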
> >TITLE SEARCHING
> >
> This would, I believe, require new development. It's possible that one
> of the experimental ranking algorithms that are included might provide
> better results for these people, but I *think* that boosting the score
> for one field in a ranked keyword search would require an extension to
> the index structure.
I've looked high and low for documentation on the ranking algorithms in
Zebra but haven't found much more than a few sentences in the official
docs and some list messages ...
> >AUTHOR SEARCHING
> >
> >Again, the current relevance ranking doesn't quite cut it. A good
> >example is a relevance ranked author search on "James Joyce". Some
> >records sneak into high relevance because they have multiple authors
> >with names like "James Henry" and "Paul Joyce" (take "Bob the Builder"
> >in the NPL database as an example).
> >
> It might be worth checking whether one of the custom ranking algos did
> better on this. You can look in the NEWS file for instructions on how to
> enable them.
Will do.
> >relevance ranking
> >should account for proximity and use that as the highest ranking
> >consideration to ensure that a search on "James Joyce" returns all the
> >books by "James Joyce" first. Also, they requested that the default
> >ranking secondarily sort the items by date as well because they often
> >are asked to find the 'latest' book by so and so. We concluded that
> >the copyright date stored in the 008 is probably the only date
> >normalized enough to use for sorting though I'm not sure if zebra can
> >use that for sorting.
> >
> >
> It could with the XSLT index rules of Zebra 1.4.
Cool, and are there docs on that somewhere? :-)
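
For reference, the ordering NPL asked for (same-author matches first, then newest first by the 008 date) can be sketched in Python as below. This is a client-side illustration only, not Zebra ranking or the XSLT index rules. The record layout (`authors`, `f008`) and all function names are assumptions made for the example; the one MARC fact relied on is that Date 1 sits in 008 positions 07-10.

```python
from typing import Dict, List


def date1_from_008(field_008: str) -> int:
    """Return Date 1 (008 positions 07-10) as an int, or 0 if not numeric."""
    date1 = field_008[7:11]
    return int(date1) if date1.isdigit() else 0


def has_exact_author(authors: List[str], query: str) -> bool:
    """True when all query words occur together in a single author entry."""
    wanted = set(query.lower().split())
    return any(
        wanted <= set(name.lower().replace(",", " ").split())
        for name in authors
    )


def rank(records: List[Dict], query: str) -> List[Dict]:
    """Sort same-author matches first, then newest Date 1 first."""
    return sorted(
        records,
        key=lambda r: (
            not has_exact_author(r["authors"], query),  # False sorts first
            -date1_from_008(r["f008"]),                 # newer dates first
        ),
    )
```

With this ordering, a record listing "Henry, James" and "Joyce, Paul" as separate authors falls below every record that has "Joyce, James" as a single author entry, whatever its date.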
> >SUBJECT SEARCHING
> >
> >They seemed pleased with the way subject searching was working, it
> >will correctly find things like "horses--psychology" where the first
> >term is in 650$a and the second in $x. However, it seems not to
> >rank things based on proximity within a tag -- meaning that a search
> >on horses--psychology will pull up records containing:
> >
> >650$a horses
> >$x pets
> >
> >650$a humans
> >$x psychology
> >
> >and records with the actual 'horses--psychology' (650$a$x) subject
> >heading aren't given any favor in the ranking (I misplaced my actual
> >example and the one above is one I invented).
> >
> Same thing. I don't know how hard it would be to add a score for
> proximity.. that data is at least in the index structure, but I've no
> idea how hard it would be to fit into the code. We can ask the Zebra
> wranglers what it would entail if you're interested.
Yes, please do, we're very interested in that particular one.
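
To pin down what we mean by proximity within a tag, here is a rough Python sketch of the scoring we have in mind. It is purely illustrative and says nothing about how Zebra's index or ranking code works. Each 650 field is modelled as a list of (subfield code, value) pairs, and `subject_score` is a hypothetical name.

```python
from typing import List, Tuple

Field = List[Tuple[str, str]]  # one 650 field as (subfield code, value) pairs


def subject_score(fields: List[Field], terms: List[str]) -> int:
    """Score 2 if all terms sit in one 650 field, 1 if spread across fields, else 0."""
    wanted = {t.lower() for t in terms}
    per_field = [{value.lower() for _, value in field} for field in fields]
    if any(wanted <= values for values in per_field):
        return 2  # e.g. 650 $a horses $x psychology
    all_values = set().union(*per_field)
    return 1 if wanted <= all_values else 0
```

A record with the real "horses--psychology" heading scores 2, while the invented record above (horses/pets plus humans/psychology) scores only 1, so it would sort lower.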
> >SUBJECT HEADING SEARCH
> >
> >NPL would like to see a demonstration of a 'Subject Heading' search
> >using authorities generated from the data to compile a list of
> >authoritative headings (which would be compiled from multiple fields
> >within a given subject tag such as $650$a$v$x, etc.). So I think
> >to do this right we'd need to look at putting our authority records
> >in Zebra as well.
> >
> Hmm. Not sure I fully grok the requirement here.. you seem to suggest
> both constructing a specific index key based on a concatenation of
> multiple fields (easy in the XSLT indexing rules of 1.4, not compatible
> with the 'melm' directive).
I'm unclear about the differences between 'elm' and 'melm'. The docs
seem to indicate that they are the same...
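
On the concatenated key for the subject heading list mentioned above, this is roughly the shape of what we are after, sketched in Python rather than in any Zebra indexing syntax. The function names, the "--" separator, and the inclusion of $y and $z alongside $a$v$x are assumptions for illustration.

```python
from typing import Iterable, List, Tuple

Subfield = Tuple[str, str]  # (subfield code, value)


def heading_key(subfields: Iterable[Subfield],
                codes: Tuple[str, ...] = ("a", "v", "x", "y", "z"),
                sep: str = "--") -> str:
    """Join the selected subfields of one 650 field, in record order."""
    return sep.join(value for code, value in subfields if code in codes)


def heading_list(fields: Iterable[List[Subfield]]) -> List[str]:
    """Collect the distinct heading keys from many 650 fields, sorted."""
    return sorted({heading_key(f) for f in fields})
```

Running `heading_list` over every 650 in the database would give the browsable list of headings that NPL wants to see demonstrated.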
Thanks!
--
Joshua Ferraro VENDOR SERVICES FOR OPEN-SOURCE SOFTWARE
President, Technology migration, training, maintenance, support
LibLime Featuring Koha Open-Source ILS
address@hidden |Full Demos at http://liblime.com/koha |1(888)KohaILS