[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Scanner takes a long time to build.
From: |
Hans Aberg |
Subject: |
Re: Scanner takes a long time to build. |
Date: |
Fri, 26 Jul 2002 11:59:14 +0200 |
At 17:50 -0400 2002/07/25, Scott A Crosby wrote:
>On Thu, 25 Jul 2002, John Millaway wrote:
>
>> > Paragraph\ .a.{0,10}2.{0,10}C.\ of\ S.\ 1618
>>
>> That last regex is enough to blow up the DFA. It's not the number of regular
>> expressions killing your flex file, it's the patterns. Have a look:
>I suspected it was the non-determinism and backing up of '.',
>Thanks for your help, though I've concluded that its not practical to use
>flex to speed up spamassassin.
It is in the nature of the NFA -> DFA conversion algorithm that some
regular expressions may blow up the DFA exponentially, as mentioned in Aho,
et al., fig 3.32 (end of sec.3.7). Of course, one can use the NFA instead,
which will be slow.
But loc.cit. also says that one can compute the DFA transitions on the fly,
caching them. This will combine the NFA space complexity with the DFA time
complexity, it says.
Hans Aberg
Re: Scanner takes a long time to build., Scott A Crosby, 2002/07/27
Re: Scanner takes a long time to build., Vern Paxson, 2002/07/27