Re: [patch #6899] Speed-up for searching in multibyte and ignore-icase.
From: Aharon Robbins
Subject: Re: [patch #6899] Speed-up for searching in multibyte and ignore-icase.
Date: Thu, 11 Mar 2010 07:28:13 +0200
Thanks for the explanations of the dfa vs. regex.
> > I think I have an obligation at this point to mention:
> >
> > http://swtch.com/~rsc/regexp/
> >
> > In particular, there is code there for an "Efficient (non-backtracking)
> > NFA implementation with submatch tracking. Accepts UTF-8 and
> > wide-character Unicode input. Traditional egrep syntax only. Written by
> > Rob Pike."
> >
> > Perhaps this can serve as the basis for a new unified matcher?
>
> I think this is very much related to the algorithms already in use by
> regex. A unified matcher will always be slower than DFA.
I understood that regex did backtracking; the algorithms described in
the papers by RSC are not used by the GNU library. It's worth a
careful review. But this is a long-term issue in any case.
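
To make the distinction concrete, here is a minimal sketch of the
non-backtracking idea: simulate the NFA by carrying the set of live
states forward one input character at a time. It is an illustration
only, written in Python with a toy syntax (literals, '.', and postfix
'*'); the function names are made up for this example, and it is not
taken from the code at the URL above or from the GNU sources.

```python
def parse(pattern):
    """Split a pattern into (char, starred) items.

    Toy syntax (an assumption for this sketch): literals, '.', postfix '*'.
    """
    items = []
    i = 0
    while i < len(pattern):
        starred = i + 1 < len(pattern) and pattern[i + 1] == "*"
        items.append((pattern[i], starred))
        i += 2 if starred else 1
    return items


def closure(items, states):
    """Add every state reachable by skipping starred items (epsilon moves)."""
    result = set()
    for s in states:
        result.add(s)
        while s < len(items) and items[s][1]:
            s += 1
            result.add(s)
    return result


def nfa_match(pattern, text):
    """Whole-string match by tracking the set of live states; no backtracking."""
    items = parse(pattern)
    current = closure(items, {0})
    for ch in text:
        nxt = set()
        for s in current:
            if s < len(items) and items[s][0] in (ch, "."):
                # A starred item loops on itself; a plain item advances.
                nxt.add(s if items[s][1] else s + 1)
        current = closure(items, nxt)
        if not current:
            return False  # no live states left, so no match is possible
    return len(items) in current


if __name__ == "__main__":
    print(nfa_match("a*b", "aab"))
    # A case that is painful for a naive backtracking matcher:
    print(nfa_match("a*" * 25 + "b", "a" * 25))
```

The work per input character depends only on the size of the pattern,
not on how many ways the text could be divided among the starred items,
which is the property a backtracking matcher lacks.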
Thanks,
Arnold