sdx-developers

Re: [sdx-developers] QueryParser... encore


From: Pierrick Brihaye
Subject: Re: [sdx-developers] QueryParser... encore
Date: Fri, 26 Sep 2003 09:23:57 +0200
User-agent: Mozilla/5.0 (Windows; U; Win98; fr-FR; rv:1.0.2) Gecko/20030208 Netscape/7.02

Hi,

Pierrick Brihaye wrote:

If you agree, I'll do the commit.

???

I still have to handle the complex case where we would get a mix of tokens with
different positions. In short, your PhraseQuerys should still work (the code is
the same), but mine are nowhere near doing so :-)

There we go: it works. See the attached file (sorry: it's hard to produce a patch without a reference CVS version).

The implementation is rather heavy:
1) it builds PhraseQuerys even when there is only one token
2) I didn't manage to take advantage of Query.clone (I strongly suspect the Cloneable implementation is purely nominal).

... but it runs.
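
For what it's worth, here is roughly how the attached parser can be exercised (a minimal sketch: the class name, the field name and the query string are only placeholders; DefaultAnalyzer is the SDX analyzer already imported by the grammar):

    import org.apache.lucene.search.Query;
    import fr.gouv.culture.sdx.search.lucene.analysis.DefaultAnalyzer;
    import fr.gouv.culture.sdx.search.lucene.queryparser.DefaultQueryParser;

    public class QueryParserSmokeTest {
        public static void main(String[] args) throws Exception {
            // Parse a query string against a default field with the SDX analyzer;
            // with several analyzed tokens the result should now be built from PhraseQuerys.
            Query q = DefaultQueryParser.parse("service regional de l'inventaire",
                                               "content", new DefaultAnalyzer());
            System.out.println(q.toString("content"));
        }
    }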

Apart from the getStringQuery issue and the acronym one ;-), I therefore consider the QueryParser problem settled...

Cheers,

--
Pierrick Brihaye, IT specialist
Service régional de l'Inventaire
DRAC Bretagne
mailto:address@hidden
/* ====================================================================
 * The Apache Software License, Version 1.1
 *
 * Copyright (c) 2001, 2002, 2003 The Apache Software Foundation.  All
 * rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * 3. The end-user documentation included with the redistribution,
 *    if any, must include the following acknowledgment:
 *       "This product includes software developed by the
 *        Apache Software Foundation (http://www.apache.org/)."
 *    Alternately, this acknowledgment may appear in the software itself,
 *    if and wherever such third-party acknowledgments normally appear.
 *
 * 4. The names "Apache" and "Apache Software Foundation" and
 *    "Apache Lucene" must not be used to endorse or promote products
 *    derived from this software without prior written permission. For
 *    written permission, please contact address@hidden
 *
 * 5. Products derived from this software may not be called "Apache",
 *    "Apache Lucene", nor may "Apache" appear in their name, without
 *    prior written permission of the Apache Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
 * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
 * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
 * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 */


options {
  STATIC=false;
  JAVA_UNICODE_ESCAPE=true;
  USER_CHAR_STREAM=true;
}

PARSER_BEGIN(DefaultQueryParser)

package fr.gouv.culture.sdx.search.lucene.queryparser;

import java.util.Vector;
import java.io.*;
import java.text.*;
import java.util.*;
import org.apache.lucene.index.Term;
import org.apache.lucene.analysis.*;
import org.apache.lucene.document.*;
import org.apache.lucene.search.*;
import fr.gouv.culture.sdx.search.lucene.DateField;
import fr.gouv.culture.sdx.search.lucene.analysis.Analyzer;
import fr.gouv.culture.sdx.search.lucene.analysis.DefaultAnalyzer;

/**
 * This class is generated by JavaCC.  The only method that clients should need
 * to call is <a href="#parse">parse()</a>.
 *
 * The syntax for query strings is as follows:
 * A Query is a series of clauses.
 * A clause may be prefixed by:
 * <ul>
 * <li> a plus (<code>+</code>) or a minus (<code>-</code>) sign, indicating
 * that the clause is required or prohibited respectively; or
 * <li> a term followed by a colon, indicating the field to be searched.
 * This enables one to construct queries which search multiple fields.
 * </ul>
 *
 * A clause may be either:
 * <ul>
 * <li> a term, indicating all the documents that contain this term; or
 * <li> a nested query, enclosed in parentheses.  Note that this may be used
 * with a <code>+</code>/<code>-</code> prefix to require any of a set of
 * terms.
 * </ul>
 *
 * Thus, in BNF, the query grammar is:
 * <pre>
 *   Query  ::= ( Clause )*
 *   Clause ::= ["+", "-"] [&lt;TERM&gt; ":"] ( &lt;TERM&gt; | "(" Query ")" )
 * </pre>
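 *
 * For example (the field names below are only illustrative placeholders):
 * <pre>
 *   sdx AND parser
 *   +title:lucene -type:draft
 *   "query parser"~4
 * </pre>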
 *
 * <p>
 * Examples of appropriately formatted queries can be found in the
 * <a href="http://jakarta.apache.org/lucene/src/test/org/apache/lucene/queryParser/TestDefaultQueryParser.java">test cases</a>.
 * </p>
 *
 * @author Brian Goetz
 * @author Peter Halacsy
 * @author Tatu Saloranta
 */

public class DefaultQueryParser implements QueryParser {

  private static final int CONJ_NONE   = 0;
  private static final int CONJ_AND    = 1;
  private static final int CONJ_OR     = 2;

  private static final int MOD_NONE    = 0;
  private static final int MOD_NOT     = 10;
  private static final int MOD_REQ     = 11;

  public static final int DEFAULT_OPERATOR_OR  = 0;
  public static final int DEFAULT_OPERATOR_AND = 1;

  /** The actual operator that parser uses to combine query terms */
  private int operator = DEFAULT_OPERATOR_OR;

  /**
   * Whether terms of wildcard and prefix queries are to be automatically
   * lower-cased or not.  Default is <code>true</code>.
   */
  boolean lowercaseWildcardTerms = true;

  Analyzer analyzer;
  String field;
  int phraseSlop = 0;


  /** Parses a query string, returning a {@link org.apache.lucene.search.Query}.
   *  @param query      the query string to be parsed.
   *  @param field      the default field for query terms.
   *  @param analyzer   used to find terms in the query text.
   *  @throws ParseException if the parsing fails
   */
  static public Query parse(String query, String field, Analyzer analyzer)
       throws ParseException {
    try {
      DefaultQueryParser parser = new DefaultQueryParser(field, analyzer);
      return parser.parse(query);
    }
    catch (TokenMgrError tme) {
      throw new ParseException(tme.getMessage());
    }
  }

   /**Constructs a query parser.*/
  public DefaultQueryParser(){
    this(new FastCharStream(new StringReader("")));
  }

  /** Constructs a query parser.
   *  @param f  the default field for query terms.
   *  @param a   used to find terms in the query text.
   */
  public DefaultQueryParser(String f, Analyzer a) {
    this(new FastCharStream(new StringReader("")));
    analyzer = a;
    field = f;
  }

  /**Sets the fields of the query parser
  *  @param f   the default field for query terms.
  *  @param a   used to find terms in the query text.
  */
  public void setUp(String f, Analyzer a){
    analyzer = a;
    field = f;
  }

  /**Sets the fields of the query parser
  *  @param a           used to find terms in the query text.
  *  @param phraseSlop  the slop
  *  @param operator    the operator
  */
  public void setUp(Analyzer a, int phraseSlop, int operator){
    analyzer = a;
    setPhraseSlop(phraseSlop);
    setOperator(operator);
  }

  /** Sets the fields of the query parser
   *  @param f  the default field for query terms.
   *  @param a   used to find terms in the query text.
   *  @param phraseSlop the slop
   *  @param operator   the operator
   */
  public void setUp(String f, Analyzer a, int phraseSlop, int operator){
    field = f;
    setUp(a, phraseSlop, operator);
  }

  /** Parses a query string, returning a
   * <a href="lucene.search.Query.html">Query</a>.
   *  @param query      the query string to be parsed.
   *  @throws ParseException if the parsing fails
   *  @throws TokenMgrError if the parsing fails
   */
  public Query parse(String query) throws ParseException, TokenMgrError {
    ReInit(new FastCharStream(new StringReader(query)));
    return Query(field);
  }

  /**
   * Sets the default slop for phrases.  If zero, then exact phrase matches
   * are required.  Default value is zero.
   */
  public void setPhraseSlop(int phraseSlop) {
    this.phraseSlop = phraseSlop;
  }

  /**
   * Gets the default slop for phrases.
   */
  public int getPhraseSlop() {
    return phraseSlop;
  }

  /**
   * Sets the boolean operator of the DefaultQueryParser.
   * In classic mode (<code>DEFAULT_OPERATOR_OR</code>) terms without any modifiers
   * are considered optional: for example <code>capital of Hungary</code> is equal to
   * <code>capital OR of OR Hungary</code>.<br/>
   * In <code>DEFAULT_OPERATOR_AND</code> mode terms are considered to be in conjunction: the
   * above-mentioned query is parsed as <code>capital AND of AND Hungary</code>.
   */
  public void setOperator(int operator) {
    this.operator = operator;
  }

  public int getOperator() {
    return operator;
  }

  public void setLowercaseWildcardTerms(boolean lowercaseWildcardTerms) {
    this.lowercaseWildcardTerms = lowercaseWildcardTerms;
  }

  public boolean getLowercaseWildcardTerms() {
    return lowercaseWildcardTerms;
  }

  protected void addClause(Vector clauses, int conj, int mods, Query q) {
    boolean required, prohibited;

    // If this term is introduced by AND, make the preceding term required,
    // unless it's already prohibited
    if (conj == CONJ_AND) {
      BooleanClause c = (BooleanClause) clauses.elementAt(clauses.size()-1);
      if (!c.prohibited)
        c.required = true;
    }

    if (operator == DEFAULT_OPERATOR_AND && conj == CONJ_OR) {
      // If this term is introduced by OR, make the preceding term optional,
      // unless it's prohibited (that means we leave -a OR b alone, but +a OR b becomes a OR b).
      // Note that if the input is a OR b, the first term is parsed as required;
      // without this modification a OR b would be parsed as +a OR b.
      BooleanClause c = (BooleanClause) clauses.elementAt(clauses.size()-1);
      if (!c.prohibited)
        c.required = false;
    }

    // We might have been passed a null query; the term might have been
    // filtered away by the analyzer.
    if (q == null)
      return;

    if (operator == DEFAULT_OPERATOR_OR) {
      // We set REQUIRED if we're introduced by AND or +; PROHIBITED if
      // introduced by NOT or -; make sure not to set both.
      prohibited = (mods == MOD_NOT);
      required = (mods == MOD_REQ);
      if (conj == CONJ_AND && !prohibited) {
        required = true;
      }
    } else {
      // We set PROHIBITED if we're introduced by NOT or -; We set REQUIRED
      // if not PROHIBITED and not introduced by OR
      prohibited = (mods == MOD_NOT);
      required   = (!prohibited && conj != CONJ_OR);
    }
    clauses.addElement(new BooleanClause(q, required, prohibited));
  }

  protected Query getFieldQuery(String field, Analyzer analyzer, String queryText) {
    // Use the analyzer to get all the tokens, and then build a TermQuery,
    // a PhraseQuery, or nothing, based on the term count.
    TokenStream source = analyzer.tokenStream(field, new StringReader(queryText));
    Vector v = new Vector();
    org.apache.lucene.analysis.Token t;

    while (true) {
      try {
        t = source.next();
      }
      catch (IOException e) {
        t = null;
      }
      if (t == null)
        break;
      v.addElement(t);
    }
    if (v.size() == 0)
      return null;
    else if (v.size() == 1) {
      t = (org.apache.lucene.analysis.Token) v.elementAt(0);
      return new TermQuery(new Term(field, t.termText()));
    }
    else {
      BooleanQuery bq = new BooleanQuery();
      Vector queriesSoFar = new Vector();
      Vector newQueries = new Vector();
      PhraseQuery currentQuery;
      int currentPosition = 0;

      for (int i = 0; i < v.size(); i++) {
        t = (org.apache.lucene.analysis.Token) v.elementAt(i);
        // A position increment of 1 starts a new position: the queries built
        // so far become the prefixes to extend with the tokens of this position.
        if (t.getPositionIncrement() == 1) {
          queriesSoFar.removeAllElements();
          queriesSoFar.addAll(newQueries);
          newQueries.removeAllElements();
          currentPosition++;
          System.out.println("New position : " + currentPosition);
        }
        currentQuery = new PhraseQuery();
        currentQuery.setSlop(phraseSlop);
        if (currentPosition == 1) {
          // Create the first queries.
          currentQuery.add(new Term(field, t.termText()));
          System.out.println("Added term at first position : " + t.termText());
        } else {
          // Re-use the previous queries and append the current term.
          for (int j = 0; j < queriesSoFar.size(); j++) {
            PhraseQuery previousQuery = (PhraseQuery) queriesSoFar.elementAt(j);
            Term[] terms = previousQuery.getTerms();
            for (int k = 0; k < terms.length; k++) {
              currentQuery.add(terms[k]);
            }
            System.out.println("Added term : " + t.termText() + " to query : " + currentQuery.toString());
            currentQuery.add(new Term(field, t.termText()));
          }
        }
        newQueries.add(currentQuery);
      }
      for (int l = 0; l < newQueries.size(); l++) {
        currentQuery = (PhraseQuery) newQueries.elementAt(l);
        bq.add(currentQuery, false, false);
      }
      System.out.println("Final query : " + bq.toString());
      return bq;
    }
  }


  private Query getStringQuery(String field, String queryText) {
    // check for nulls etc.
    return new TermQuery(new Term(field, queryText));
  }

  private Query getRangeQuery(String field,
                              Analyzer analyzer,
                              String part1,
                              String part2,
                              boolean inclusive)
  {
    boolean isDate = false, isNumber = false;

    try {
      /*DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT);
      df.setLenient(true);
      Date d1 = df.parse(part1);
      Date d2 = df.parse(part2);
      part1 = DateField.dateToString(d1);
      part2 = DateField.dateToString(d2);
      isDate = true;*/
       //DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT);
      //df.setLenient(true);
      //Date d1 = df.parse(part1);
      //Date d2 = df.parse(part2);
      //since we use our own date format when we store date fields, we should do the same upon searching -rbp
      Date d1 = fr.gouv.culture.sdx.utils.Date.parseDate(part1);
      Date d2 = fr.gouv.culture.sdx.utils.Date.parseDate(part2);
      //using sdx date field support-rbp
      part1 = fr.gouv.culture.sdx.search.lucene.DateField.dateToString(d1);
      part2 = fr.gouv.culture.sdx.search.lucene.DateField.dateToString(d2);
      isDate = true;
    }
    catch (Exception e) { }

    if (!isDate) {
      // @@@ Add number support
    }

    return new RangeQuery(new Term(field, part1),
                          new Term(field, part2),
                          inclusive);
  }

  /**
   * Factory method for generating query, given a set of clauses.
   * By default creates a boolean query composed of clauses passed in.
   *
   * Can be overridden by extending classes, to modify query being
   * returned.
   *
   * @param clauses Vector that contains {@link BooleanClause} instances
   *    to join.
   *
   * @return Resulting {@link Query} object.
   */
  protected Query getBooleanQuery(Vector clauses)
  {
    BooleanQuery query = new BooleanQuery();
    for (int i = 0; i < clauses.size(); i++) {
        query.add((BooleanClause)clauses.elementAt(i));
    }
    return query;
  }

  /**
   * Factory method for generating a query. Called when parser
   * parses an input term token that contains one or more wildcard
   * characters (? and *), but is not a prefix term token (one
   * that has just a single * character at the end)
   *<p>
   * Depending on settings, prefix term may be lower-cased
   * automatically. It will not go through the default Analyzer,
   * however, since normal Analyzers are unlikely to work properly
   * with wildcard templates.
   *<p>
   * Can be overridden by extending classes, to provide custom handling for
   * wildcard queries, which may be necessary due to missing analyzer calls.
   *
   * @param field Name of the field query will use.
   * @param termStr Term token that contains one or more wild card
   *   characters (? or *), but is not simple prefix term
   *
   * @return Resulting {@link Query} built for the term
   */
  protected Query getWildcardQuery(String field, String termStr)
  {
    if (lowercaseWildcardTerms) {
        termStr = termStr.toLowerCase();
    }
    Term t = new Term(field, termStr);
    return new WildcardQuery(t);
  }
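
  // A minimal sketch of the extension point described in the javadoc above (the
  // subclass and its policy are hypothetical): an extending parser can swap in
  // its own wildcard handling, e.g. to drop clauses that start with a wildcard.
  //
  //   public static class NoLeadingWildcardQueryParser extends DefaultQueryParser {
  //     protected Query getWildcardQuery(String field, String termStr) {
  //       if (termStr.startsWith("*") || termStr.startsWith("?"))
  //         return null; // addClause() skips null queries, so the clause is dropped
  //       return super.getWildcardQuery(field, termStr);
  //     }
  //   }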

  /**
   * Factory method for generating a query (similar to
   * ({@link #getWildcardQuery}). Called when parser parses an input term
   * token that uses prefix notation; that is, contains a single '*' wildcard
   * character as its last character. Since this is a special case
   * of generic wildcard term, and such a query can be optimized easily,
   * this usually results in a different query object.
   *<p>
   * Depending on settings, a prefix term may be lower-cased
   * automatically. It will not go through the default Analyzer,
   * however, since normal Analyzers are unlikely to work properly
   * with wildcard templates.
   *<p>
   * Can be overridden by extending classes, to provide custom handling for
   * wild card queries, which may be necessary due to missing analyzer calls.
   *
   * @param field Name of the field query will use.
   * @param termStr Term token to use for building term for the query
   *    (<b>without</b> trailing '*' character!)
   *
   * @return Resulting {@link Query} built for the term
   */
  protected Query getPrefixQuery(String field, String termStr)
  {
    if (lowercaseWildcardTerms) {
        termStr = termStr.toLowerCase();
    }
    Term t = new Term(field, termStr);
    return new PrefixQuery(t);
  }

  /**
   * Factory method for generating a query (similar to
   * ({@link #getWildcardQuery}). Called when parser parses
   * an input term token that has the fuzzy suffix (~) appended.
   *
   * @param field Name of the field query will use.
   * @param termStr Term token to use for building term for the query
   *
   * @return Resulting {@link Query} built for the term
   */
  protected Query getFuzzyQuery(String field, String termStr)
  {
    Term t = new Term(field, termStr);
    return new FuzzyQuery(t);
  }

  public static void main(String[] args) throws Exception {
    DefaultQueryParser qp = new DefaultQueryParser("field",
                           new DefaultAnalyzer());
    Query q = qp.parse(args[0]);
    System.out.println(q.toString("field"));
  }
}

PARSER_END(DefaultQueryParser)

/* ***************** */
/* Token Definitions */
/* ***************** */

<*> TOKEN : {
  <#_NUM_CHAR:   ["0"-"9"] >
| <#_ESCAPED_CHAR: "\\" [ "\\", "+", "-", "!", "(", ")", ":", "^",
                          "[", "]", "\"", "{", "}", "~", "*", "?" ] >
| <#_TERM_START_CHAR: ( ~[ " ", "\t", "+", "-", "!", "(", ")", ":", "^",
                           "[", "]", "\"", "{", "}", "~", "*", "?" ]
                       | <_ESCAPED_CHAR> ) >
| <#_TERM_CHAR: ( <_TERM_START_CHAR> | <_ESCAPED_CHAR> ) >
| <#_WHITESPACE: ( " " | "\t" ) >
}

<DEFAULT, RangeIn, RangeEx> SKIP : {
  <<_WHITESPACE>>
}

// OG: to support prefix queries:
// http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12137
// Change from:
// | <WILDTERM:  <_TERM_START_CHAR>
//              (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
// To:
//
// | <WILDTERM:  (<_TERM_CHAR> | ( [ "*", "?" ] ))* >

<DEFAULT> TOKEN : {
  <AND:       ("AND" | "&&") >
| <OR:        ("OR" | "||") >
| <NOT:       ("NOT" | "!") >
| <PLUS:      "+" >
| <MINUS:     "-" >
| <LPAREN:    "(" >
| <RPAREN:    ")" >
| <COLON:     ":" >
| <CARAT:     "^" > : Boost
| <QUOTED:     "\"" (~["\""])+ "\"">
| <STRING:     "|" (~["|"])+ "|">
| <TERM:      <_TERM_START_CHAR> (<_TERM_CHAR>)*  >
| <FUZZY:     "~" >
| <SLOP:      "~" (<_NUM_CHAR>)+ >
| <PREFIXTERM:  <_TERM_START_CHAR> (<_TERM_CHAR>)* "*" >
| <WILDTERM:  (<_TERM_CHAR> | ( [ "*", "?" ] ))* >
| <RANGEIN_START: "[" > : RangeIn
| <RANGEEX_START: "{" > : RangeEx
}

<Boost> TOKEN : {
<NUMBER:    (<_NUM_CHAR>)+ ( "." (<_NUM_CHAR>)+ )? > : DEFAULT
}

<RangeIn> TOKEN : {
<RANGEIN_TO: "TO">
| <RANGEIN_END: "]"> : DEFAULT
| <RANGEIN_QUOTED: "\"" (~["\""])+ "\"">
| <RANGEIN_GOOP: (~[ " ", "]" ])+ >
}

<RangeEx> TOKEN : {
<RANGEEX_TO: "TO">
| <RANGEEX_END: "}"> : DEFAULT
| <RANGEEX_QUOTED: "\"" (~["\""])+ "\"">
| <RANGEEX_GOOP: (~[ " ", "}" ])+ >
}

// *   Query  ::= ( Clause )*
// *   Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" )

int Conjunction() : {
  int ret = CONJ_NONE;
}
{
  [
    <AND> { ret = CONJ_AND; }
    | <OR>  { ret = CONJ_OR; }
  ]
  { return ret; }
}

int Modifiers() : {
  int ret = MOD_NONE;
}
{
  [
     <PLUS> { ret = MOD_REQ; }
     | <MINUS> { ret = MOD_NOT; }
     | <NOT> { ret = MOD_NOT; }
  ]
  { return ret; }
}

Query Query(String field) :
{
  Vector clauses = new Vector();
  Query q, firstQuery=null;
  int conj, mods;
}
{
  mods=Modifiers() q=Clause(field)
  {
    addClause(clauses, CONJ_NONE, mods, q);
    if (mods == MOD_NONE)
        firstQuery=q;
  }
  (
    conj=Conjunction() mods=Modifiers() q=Clause(field)
    { addClause(clauses, conj, mods, q); }
  )*
    {
      if (clauses.size() == 1 && firstQuery != null)
        return firstQuery;
      else {
        return getBooleanQuery(clauses);
      }
    }
}

Query Clause(String field) : {
  Query q;
  Token fieldToken=null, boost=null;
}
{
  [
    LOOKAHEAD(2)
    fieldToken=<TERM> <COLON> { field = fieldToken.image; }
  ]

  (
   q=Term(field)
   | <LPAREN> q=Query(field) <RPAREN> (<CARAT> boost=<NUMBER>)?

  )
    {
      if (boost != null) {
        float f = (float)1.0;
        try {
          f = Float.valueOf(boost.image).floatValue();
          q.setBoost(f);
        } catch (Exception ignored) { }
      }
      return q;
    }
}


Query Term(String field) : {
  Token term, boost=null, slop=null, goop1, goop2;
  boolean prefix = false;
  boolean wildcard = false;
  boolean fuzzy = false;
  boolean rangein = false;
  Query q;
}
{
  (
     (
       term=<TERM>
       | term=<PREFIXTERM> { prefix=true; }
       | term=<WILDTERM> { wildcard=true; }
       | term=<NUMBER>
     )
     [ <FUZZY> { fuzzy=true; } ]
     [ <CARAT> boost=<NUMBER> [ <FUZZY> { fuzzy=true; } ] ]
     {
       if (wildcard) {
         q = getWildcardQuery(field, term.image);
       } else if (prefix) {
         q = getPrefixQuery(field, term.image.substring
                            (0, term.image.length()-1));
       } else if (fuzzy) {
         q = getFuzzyQuery(field, term.image);
       } else {
         q = getFieldQuery(field, analyzer, term.image);
       }
     }
     | ( <RANGEIN_START> ( goop1=<RANGEIN_GOOP>|goop1=<RANGEIN_QUOTED> )
         [ <RANGEIN_TO> ] ( goop2=<RANGEIN_GOOP>|goop2=<RANGEIN_QUOTED> )
         <RANGEIN_END> )
       [ <CARAT> boost=<NUMBER> ]
        {
          if (goop1.kind == RANGEIN_QUOTED)
            goop1.image = goop1.image.substring(1, goop1.image.length()-1);
          if (goop2.kind == RANGEIN_QUOTED)
            goop2.image = goop2.image.substring(1, goop2.image.length()-1);

          q = getRangeQuery(field, analyzer, goop1.image, goop2.image, true);
        }
     | ( <RANGEEX_START> ( goop1=<RANGEEX_GOOP>|goop1=<RANGEEX_QUOTED> )
         [ <RANGEEX_TO> ] ( goop2=<RANGEEX_GOOP>|goop2=<RANGEEX_QUOTED> )
         <RANGEEX_END> )
       [ <CARAT> boost=<NUMBER> ]
        {
          if (goop1.kind == RANGEEX_QUOTED)
            goop1.image = goop1.image.substring(1, goop1.image.length()-1);
          if (goop2.kind == RANGEEX_QUOTED)
            goop2.image = goop2.image.substring(1, goop2.image.length()-1);

          q = getRangeQuery(field, analyzer, goop1.image, goop2.image, false);
        }
     | term=<QUOTED>
       [ slop=<SLOP> ]
       [ <CARAT> boost=<NUMBER> ]
       {
         q = getFieldQuery(field, analyzer,
                           term.image.substring(1, term.image.length()-1));
         if (slop != null && q instanceof PhraseQuery) {
           try {
             int s = Float.valueOf(slop.image.substring(1)).intValue();
             ((PhraseQuery) q).setSlop(s);
           }
           catch (Exception ignored) { }
         }
       }
       | term=<STRING>
       [ slop=<SLOP> ]
       [ <CARAT> boost=<NUMBER> ]
       {
         q = getStringQuery(field, term.image.substring(1, term.image.length()-1));
         if (slop != null && q instanceof PhraseQuery) {
           try {
             int s = Float.valueOf(slop.image.substring(1)).intValue();
             ((PhraseQuery) q).setSlop(s);
           }
           catch (Exception ignored) { }
         }
       }
  )
  {
    if (boost != null) {
      float f = (float) 1.0;
      try {
        f = Float.valueOf(boost.image).floatValue();
      }
      catch (Exception ignored) {
          /* Should this be handled somehow? (defaults to "no boost", if
           * boost number is invalid)
           */
      }

      // avoid boosting null queries, such as those caused by stop words
      if (q != null) {
        q.setBoost(f);
      }
    }
    return q;
  }
}

