bug-gawk

Re: [bug-gawk] FPAT bug?


From: Manuel Collado
Subject: Re: [bug-gawk] FPAT bug?
Date: Wed, 5 Apr 2017 23:04:47 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 03/04/2017 at 8:29, address@hidden wrote:
[...]
I will get around to investigating the issue with FPAT.

Perhaps part of the problem with the FPAT feature lies in how the parsing algorithm is structured. Thinking about it, I believe there is a simpler way to structure the code.

Attached is what could serve as an awk reference implementation of the FPAT parser. It differs from the current implementation in the following ways:

- Each iteration of the parsing loop identifies exactly one
      (separator[n-1], field[n])
  pair. The last iteration may deliver just a final separator.
- Each (separator, field) pair must consume at least one input
  character, except the first pair (separator[0], field[1]),
  whose members may both be null.
- So a null field is only valid as the first field or after
  a non-null separator.

This avoids the need for flags that carry state over from the previous loop iteration.
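
The loop described above can be sketched roughly as follows. This is not the attached refpatsplit.awk, only a minimal Python illustration of the three rules; the function name fpat_split and its return shape are hypothetical. Note also that Python's re module is not POSIX leftmost-longest, so patterns with alternation may match differently than they would in gawk.

```python
import re

def fpat_split(record, fpat):
    """Split record into fields matching fpat and the separators between them.

    Returns (fields, seps), where seps[0] precedes the first field and
    seps[-1] is the final separator, so len(seps) == len(fields) + 1.
    """
    pattern = re.compile(fpat)
    fields, seps = [], []
    pos = 0
    while True:
        first_pair = not fields
        m = pattern.search(record, pos)
        # A null field right at pos would make the (separator, field) pair
        # consume nothing. That is only allowed for the first pair, so
        # otherwise look again one character further on, which forces a
        # non-null separator.
        if m is not None and not first_pair and m.start() == pos and m.end() == pos:
            m = pattern.search(record, pos + 1) if pos < len(record) else None
        if m is None:
            # Last iteration: only a final separator remains.
            seps.append(record[pos:])
            break
        seps.append(record[pos:m.start()])  # separator[n-1]
        fields.append(m.group(0))           # field[n]
        pos = m.end()
    return fields, seps

fields, seps = fpat_split("a,,b", "[^,]*")
print(fields)  # ['a', '', 'b']
print(seps)    # ['', ',', ',', '']
```

No flag survives from one iteration to the next: whether a null field is acceptable depends only on whether this is the first pair and on where the match starts.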

The attached code has been tested with the (adapted) set of fpat*.awk tests, and some more.

Please advise whether it is worth the pain of restructuring the current code, or whether there is a simpler way to fix the FPAT bug.

Hope this helps. Regards.
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

Attachment: refpatsplit.awk
Description: Text document