bug-gnu-utils

Re: gawk: {} repetition in patterns doesn't work?


From: tt
Subject: Re: gawk: {} repetition in patterns doesn't work?
Date: 23 Mar 2001 12:06:21 +0200

address@hidden (Aharon Robbins) writes:

> Turning interval expressions on breaks
> old programs, period. 

And leaving them off breaks POSIX-conforming programs
and causes extra work in porting them to GNU systems.
Admittedly I didn't read the manual far enough, but it begins
by claiming POSIX compliance without qualification.

> There are programs in the A,K,&W awk book that
> don't work with interval expressions enabled.

Are there examples that would still break with an egrep-style solution,
i.e. defaulting to the old behaviour for invalid interval specifications?

> Right now,
> you must use one of --posix, --re-interval or setting POSIXLY_CORRECT
> in the environment to get interval expressions to work.  I think having
> /a{2}/  be an interval expression but  /{.*}/ be literal is confusing.

As Paul Eggert pointed out, there are already examples of similar
behaviour, like /*/. My preference would be POSIX-conformant
behaviour by default, with fallback to the old behaviour for invalid
intervals, possibly with a flag to disable interval expressions
completely, should there be cases where even that breaks something.
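
The fallback rule described above can be sketched roughly as follows.
This is only an illustration in Python, not gawk's or egrep's actual
code: the helper name lenient_intervals is hypothetical, and the sketch
assumes that only {n}, {n,} and {n,m} (with n <= m) count as valid
intervals. It does not handle every corner case (for instance an
interval right after "(" or "|").

```python
import re

# Assumed definition of a well-formed interval: {n}, {n,} or {n,m}.
_INTERVAL = re.compile(r"\{(\d+)(,(\d*))?\}")


def lenient_intervals(pattern: str) -> str:
    """Rewrite `pattern` so that only well-formed intervals repeat.

    Any other '{' is escaped and therefore matches a literal brace.
    (Hypothetical helper, for illustration only.)
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Keep an already-escaped character untouched.
            out.append(pattern[i:i + 2])
            i += 2
        elif ch == "{":
            m = _INTERVAL.match(pattern, i)
            valid = bool(m) and bool(out)  # needs something to repeat
            if valid and m.group(3):
                # Reject {n,m} where the upper bound is below the lower.
                valid = int(m.group(1)) <= int(m.group(3))
            if valid:
                out.append(m.group(0))  # keep as a real interval
                i = m.end()
            else:
                out.append(r"\{")  # fall back to a literal brace
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for pat, text in [("a{2}", "aa"), ("{.*}", "{abc}"), ("a{,3}", "a{,3}")]:
        matched = re.fullmatch(lenient_intervals(pat), text) is not None
        print(pat, text, matched)
```

With such a rule /a{2}/ acts as an interval expression while /{.*}/
keeps its old literal meaning, which is the combination under
discussion here.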
 
> Gawk's current behavior has been in place for many years and generates
> little or no complaint.  My experience is that people don't use interval
> expressions much with gawk; the --re-interval option was in place,
> but DIDN'T WORK for at least a year or two until somebody noticed.

I suspect people have a tendency to avoid features that must be
enabled with long non-standard options. If intervals worked
by default, they'd probably be used much more.

> POSIX Compliance is a goal to meet when it's reasonable, but it's not
> an overriding requirement, and I think the current behavior strikes
> a reasonable balance. 

Reasonable, perhaps, but not ideal, IMHO.

> If you want full POSIX compliance, put
> POSIXLY_CORRECT in your environment, or set up a shell file, function
> or alias that adds --re-interval.

I can do that, sure. I just hate it when I have to document that
"this works on POSIX-compliant systems, but on a GNU system
you must do ..."

> As Paul points out, it's not something you can rely on for portability
> to other awks, in any case, even if gawk isn't part of the picture
> at all.

Of course. Same goes for everything POSIX standardized: old systems
can't be expected to conform. Nonetheless I'd like to see convergence
towards similar behaviour, which is the whole point of standards in
the first place.

Sometimes standards are broken in the sense that conforming would
break too many old things, or it may be just too difficult to do,
but if conformance can be achieved without breaking much at
a reasonable amount of work, I think it would be worthwhile.
I haven't looked into the gawk code, though, so I can't really judge
whether that would be the case here.

-- 
Tapani Tarvainen


