Re: [bug-gawk] FPAT bug?


From: Manuel Collado
Subject: Re: [bug-gawk] FPAT bug?
Date: Sun, 2 Apr 2017 18:08:58 +0200

On 02/04/2017 at 15:55, Andrew J. Schorr wrote:
> On Sat, Apr 01, 2017 at 04:27:46PM -0500, Ed Morton wrote:
>> Might be useful until a fix is available. Any better ideas for now?
>
> I have no better ideas, although I can't imagine that such a solution would
> please anyone. Here's a crazy question: do you ever use FPAT for anything other
> than CSV parsing? If gawk had a proper CSV parser library, would we need FPAT
> at all? I don't think I have ever used FPAT (my personal CSV parsing library
> takes a brute force string scanning approach), and it seems from the observed
> bugginess that FPAT is not often used...

Any FPAT regex that matches null strings is inherently ambiguous. It could provide any number of null fields at any position. So FPAT must be extended with some implicit restrictions on field separators.
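
To make the ambiguity concrete, here is a minimal sketch in Python (not gawk, and not how gawk implements FPAT). It uses Python's `re` module as a stand-in regex engine, and the function name `fields_at_every_position` is hypothetical. It accepts a field at every position where the pattern matches, null matches included, which is one literal reading of "fields are whatever the regex matches":

```python
import re

def fields_at_every_position(pattern, record):
    """Accept a field wherever the pattern matches, null matches included.

    Illustration only: this is an assumed, naive reading of content-based
    splitting, not gawk's actual FPAT algorithm.
    """
    fields = []
    pos = 0
    while pos <= len(record):
        m = pattern.match(record, pos)
        if m is None:
            pos += 1          # no match here: treat the character as separator
            continue
        fields.append(m.group())
        # After a null match, step one character forward to avoid looping forever.
        pos = m.end() if m.end() > pos else pos + 1
    return fields

FIELD = re.compile(r"[^,]*")   # the usual "anything but a comma" field pattern

print(fields_at_every_position(FIELD, "a,b"))    # ['a', '', 'b', '']
print(fields_at_every_position(FIELD, "a,,b"))   # ['a', '', '', 'b', '']
```

Under this reading "a,b" yields four fields instead of two, because the pattern also matches the null string in front of the comma and at the end of the record. Nothing in the regex itself says which of those null matches should count as fields.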

I assume that the most intuitive approach is to allow null field separators only between contiguous non-null fields. I assume that the FPAT and patsplit() implementations try to follow this policy, but may fail in some cases.
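
One possible reading of that policy can be sketched as follows, again in Python rather than gawk. The function name `split_with_restricted_nulls` and the scanning strategy are assumptions made for illustration and do not describe the gawk source. A null match is rejected when it starts exactly where the previous field ended, since accepting it would imply a null separator next to a null field:

```python
import re

def split_with_restricted_nulls(pattern, record):
    """Hypothetical policy: a null separator may appear only between
    two non-null fields. Sketch only, not gawk's implementation."""
    fields = []
    pos = 0
    just_ended_field = False   # True when a field ended exactly at pos
    while pos <= len(record):
        m = pattern.match(record, pos)
        is_null = m is not None and m.end() == pos
        if m is not None and not (is_null and just_ended_field):
            fields.append(m.group())
            pos = m.end()
            just_ended_field = True
        else:
            pos += 1           # consume one separator character
            just_ended_field = False
    return fields

FIELD = re.compile(r"[^,]*")

print(split_with_restricted_nulls(FIELD, "a,b"))    # ['a', 'b']
print(split_with_restricted_nulls(FIELD, "a,,b"))   # ['a', '', 'b']
print(split_with_restricted_nulls(FIELD, "a,"))     # ['a', '']
# Contiguous non-null fields separated by a null separator:
print(split_with_restricted_nulls(re.compile(r"[A-Z]?[a-z]*"), "FooBar"))  # ['Foo', 'Bar']
```

With this restriction the comma-separated cases give the fields one would expect from a CSV-like record. Note that the sketch returns a single null field for an empty record; that is an arbitrary choice of this code, not something the policy itself decides.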

But even the above is not enough. What about a null record? Should it provide a single null field or no fields at all?

So, what is the precise unambiguous specification of the FPAT split feature, if it allows null fields?

Hope this helps.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
