bug-findutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: signature method in locate


From: Ondrej Bilka
Subject: Re: signature method in locate
Date: Tue, 12 May 2009 08:51:44 +0200
User-agent: Mutt/1.5.18 (2008-05-17)

On Tue, May 12, 2009 at 12:08:21AM +0200, James Youngman wrote:
> On Mon, May 11, 2009 at 10:42 AM, Ondrej Bilka <address@hidden> wrote:
> > On Mon, May 11, 2009 at 10:39:11AM +0200, James Youngman wrote:
> >> On Mon, May 11, 2009 at 8:14 AM, Ondrej Bilka <address@hidden> wrote:
> >> > Hello,
> >> > Was signature method discused here? search gives only pgp.
> >> > Its that you give to each letter bit create signature.
> >> > When searching you first create signature from largest continuous part 
> >> > not containing /. Then you can compare signature of each directory and 
> >> > filename.
> >> > There is risk it will be slowdown because you need to read more.
> >>
> >> I'm sorry, I did not understand what you were trying to express.
> >> Would you try again with an example?
> > Sure. Say you have files aaa ,baba, abcd and aaccee. Signatures are
> > 10000, 11000, 11110 and 10101
> 
> I don't understand the mapping you are implying here.
> 
> > If we want find *ba*, it coresponds to 11000
> 
> But above, you indicated that 11000 represented baba.   I don't
> understand what mapping or calculation you are trying to describe.
If a is present first bit is 1, if b present second bit is 1, z 26... and you 
or it. Then you can tell what letters are present. 
> Are you suggesting some kind of Bloom filter?
Yes.
But I would use this signature function
int makesignature(char *s){
  int sig=0;
  char c1,c2;
  c1=*s++;
  c2=*s++;
  while (*s){
    sig |= 1<<((121*c1+11*c2 +*s)%31);
    c1=c2;
    c2=*s++;
  }
  return sig;
}
int canmatch(sigstr,sigpat){return sigstr&sigpat==sigpat;}





reply via email to

[Prev in Thread] Current Thread [Next in Thread]