bug-grep

Re: sync dfaexec from gawk: fixes real bugs in grep


From: Paolo Bonzini
Subject: Re: sync dfaexec from gawk: fixes real bugs in grep
Date: Fri, 12 Mar 2010 12:17:32 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7) Gecko/20100120 Fedora/3.0.1-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.1


+  *end = eol;

As a follow-up I'd make this *end++ = eol, so that the code below either simplifies (fewer "+ 1") or becomes more idiomatic (e.g. comparing "< end" and ">= end" instead of <= and >).
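
To illustrate the half-open convention suggested here, the following is a minimal sketch and not the dfaexec code. The helper count_eols is hypothetical, and it assumes the caller guarantees that the byte at *end is writable. After "*end++ = eol", end points one past the sentinel, so a plain "p >= end" test detects that the scan stopped on the sentinel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from grep or gawk): count the EOL bytes in
   [begin, end).  Assumes *end is writable so a sentinel can live there. */
static size_t
count_eols (char *begin, char *end, char eol)
{
  assert (begin <= end);

  char saved = *end;      /* byte that the sentinel overwrites */
  char *p = begin;
  size_t n = 0;

  *end++ = eol;           /* end is now one past the sentinel */

  for (;;)
    {
      /* No bounds check needed: the sentinel always stops the scan. */
      while (*p++ != eol)
        continue;
      /* p is one past the EOL byte that was just found. */
      if (p >= end)
        break;            /* that byte was the sentinel, not real input */
      n++;
    }

  end[-1] = saved;        /* restore the caller's byte */
  return n;
}
```

With the sentinel inside the half-open range, the loop needs no "+ 1" adjustments. Whether the final restore is needed depends on whether the caller treats that byte as scratch space.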

-      MALLOC(mblen_buf, unsigned char, end - (unsigned char const *)begin + 2);
-      MALLOC(inputwcs, wchar_t, end - (unsigned char const *)begin + 2);
+      MALLOC(mblen_buf, unsigned char, end - begin + 2);
+      MALLOC(inputwcs, wchar_t, end - begin + 2);

Should this become + 3?

+           if ((char *) p > end)
+             break;

So the sentinel doesn't work when doing MB matching?

+      while ((t = trans[s]) != 0) { /* hand-optimized loop */

Would you change this comment to "hand-unrolled loop"?

+      /* If the previous character was a newline, count it. */
+      if (count && (char *) p <= end && p[-1] == eol)
+       ++*count;

This is probably present in gawk too, but why p <= end? That would always be true due to the sentinel; wouldn't the right test be p > begin?

Or, what happens if you grep in a file that starts with '\n'?
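
The concern about p[-1] can be shown with a small sketch. The helper prev_was_eol is hypothetical and is not part of dfaexec. It only shows that a "p > begin" guard keeps the look-behind inside the buffer when p still points at the first byte. Whether this is the right test for dfaexec is the open question above.

```c
#include <assert.h>

/* Hypothetical helper (not from grep or gawk): report whether the byte
   before P is an EOL, without ever reading before BEGIN. */
static int
prev_was_eol (const char *begin, const char *p, char eol)
{
  assert (begin <= p);
  /* The guard short-circuits, so p[-1] is read only when it exists. */
  return p > begin && p[-1] == eol;
}
```

For a buffer that starts with '\n', the guard returns 0 at the first byte and 1 once p has moved past the newline.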

* src/search.c: Adjust to new dfaexec API.
Now, dfaexec returns a pointer, not an integer,
and the third parameter is END, not buffer size.

I suppose you'd squash 1/3 and 2/3 for bisectability? Or could the search.c part be applied already to 1/3?

+         *end = saved_end;                             \

Do we need to restore this or can we treat it as a scratch area?

Paolo