Re: "grep -o" skips some matching patterns
From: Roberto Gordo Saez
Subject: Re: "grep -o" skips some matching patterns
Date: Fri, 2 Jan 2004 10:05:37 +0100
User-agent: Mutt/1.5.4i
On Thu, Jan 01, 2004 at 01:04:26PM +0100, Stepan Kasal wrote:
> Hello,
> thank very much to Michael for the explanation, which helped.
>
> I'd like to add a theoretical comment which won't help at all ;-)
>
> On Tue, Dec 30, 2003 at 04:24:23PM +0100, Roberto Gordo Saez wrote:
> > $ echo abc0111def | grep -o "[01]*"
>
> This command should in fact print seven lines, three empty ones,
> then the non-empty match and another three empty ones.
> The reason can be explained by the following (the sed command performs
> global substitution of your regex by "(...)"):
>
> $ echo abc0111def|sed 's/[01]*/(&)/g'
> ()a()b()c(0111)d()e()f()
>
> So you have actually triggered a bug in ``grep -o''.
>
> I hope to help with fixing it...
>
> Stepan Kasal
So it is true, it has a bug :-)
grep stops at the first empty match, which is why only the non-empty
matches at the beginning of the line are printed. It is easy to see why
(from grep.c):
| if (only_matching)
|   {
|     size_t match_size;
|     size_t match_offset;
|     while ((match_offset = (*execute) (beg, lim - beg, &match_size, 1))
|            != (size_t) -1)
|       {
|         char const *b = beg + match_offset;
|         if (b == lim)
|           break;
|         if (match_size == 0)
|           break;
            ^^^^^^
Here, match_size is zero. As a test, doing { beg = b + 1; continue; }
instead of "break" will print all non-empty matches. (That is not a fix,
it is only a test; I know nothing about the grep source code.)
|         if(color_option)
|           printf("\33[%sm", grep_color);
|         fwrite(b, sizeof (char), match_size, stdout);
|         if(color_option)
|           fputs("\33[00m", stdout);
|         fputs("\n", stdout);
|         beg = b + match_size;
|       }
--
Roberto Gordo Saez - Free Software Engineer
Linalco "Especialistas en Linux y Software Libre"
http://www.linalco.com/ Tel: +34-914561700