Re: [Aramorph-users] Arabic indexing and searching problem


From: Pierrick Brihaye
Subject: Re: [Aramorph-users] Arabic indexing and searching problem
Date: Fri, 22 Apr 2005 08:01:52 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; fr-FR; rv:1.7) Gecko/20040608

Hi,

Manjeet Chaudhary wrote:

I have used the Arabic analyzer designed by Mr Pierrick Brihaye. But I am facing
some problems at the time of search.

Well... you want to use Lucene for indexing, don't you?

1.   I am using Arabic data in UTF-8 format for indexing.
2.   I think indexing is working properly.

You can check this easily with Luke, a small and useful utility for inspecting a Lucene index: http://www.getopt.org/luke/

3.   After indexing I am searching the data using Unicode.

Of course. Java works natively with Unicode strings.
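
To illustrate the point, here is a minimal sketch (the class and method names are my own, not part of Lucene or AraMorph) showing that text read from UTF-8 bytes with an explicit charset ends up as the same Java string as one written with \u escapes. It assumes the indexed data really is UTF-8:

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Decode raw bytes as UTF-8 into a Java (UTF-16) string.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+0642 (QAF) followed by U+0644 (LAM).
        byte[] utf8 = {(byte) 0xD9, (byte) 0x82, (byte) 0xD9, (byte) 0x84};
        String fromBytes = decodeUtf8(utf8);
        String literal = "\u0642\u0644";
        // Both strings hold the same two code points.
        System.out.println(fromBytes.equals(literal));
    }
}
```

So the encoding of the source file does not matter at search time, as long as it is decoded with the right charset when it is read.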

4.   But the searcher is unable to find data.

That may be expected, since you are searching for the following string:

      File indexDir = new File("index");
      String q ="\u0661\u0630\u0642\u0644";

Which is, character by character:
ARABIC-INDIC DIGIT ONE
ARABIC LETTER THAL
ARABIC LETTER QAF
ARABIC LETTER LAM

This "word" starts with a digit followed by three letters, so it is unlikely to appear in your indexed text, is it?
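
If you want to check a query string like this yourself, a small sketch along the following lines may help. The class and method names are hypothetical; it relies only on the standard Character.getName method, so it assumes Java 8 or later:

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointNames {
    // Return the Unicode character name of each code point in the string.
    static List<String> names(String s) {
        List<String> out = new ArrayList<>();
        s.codePoints().forEach(cp -> out.add(Character.getName(cp)));
        return out;
    }

    public static void main(String[] args) {
        // The query string from the message above.
        String q = "\u0661\u0630\u0642\u0644";
        for (String name : names(q)) {
            System.out.println(name);
        }
    }
}
```

Running it on the query above lists the same four character names, which makes it easy to see that the first character is a digit and not a letter.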

Cheers,

p.b.





