help-gawk

Re: Any way to set "--csv" in BEGIN?


From: Ed Morton
Subject: Re: Any way to set "--csv" in BEGIN?
Date: Fri, 10 Nov 2023 10:11:44 -0600
User-agent: Mozilla Thunderbird

Thanks for the quick response Andy.

It's not a huge deal for me personally, as I don't expect to ever NEED that functionality. I'd normally write a shell script to call any non-trivial awk script, so I can always add `--csv` from the shell when appropriate. If I wanted to switch to CSV mode mid-processing, e.g. to parse a field that itself contains CSV, then FPAT/patsplit() (or FS/split() for the opposite case) are almost certainly good enough. And I could always pipe+getline to a new script in a subshell if absolutely necessary.

It just seemed like something there would be a way to do (I had tried and failed setting PROCINFO["CSV"]=1) but I couldn't figure out how so thanks for confirming there isn't!

    Ed.

On 11/10/2023 8:47 AM, Andrew J. Schorr wrote:
Hi Ed,

I don't see any way to do that currently. The choice of CSV parsing impacts the
guts of the record parsing code (because newline characters could be inside a
quoted CSV field), and I don't know whether it would be feasible to change this
on the fly. But I can see how it might be desirable to manipulate this from
BEGIN or based on which file is being parsed. Currently, when invoked with
--csv, gawk sets PROCINFO["CSV"] to 1. One could imagine having a hook in the
PROCINFO array such that if user code manipulates that value, it would change
the parsing mode (by pointing PROCINFO_node->array_funcs to special array
methods that check for updates to "CSV", similarly to how ENVIRON_node is
configured to use the env_array_func methods to impact the environment). I
don't know how easy it would be to get this to work properly though.
If somebody changed this mid-stream, it might hork things.
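
For the per-file case, one possible workaround that does not involve `--csv` at all is to pick the field-splitting mode in a BEGINFILE rule. The sketch below is only an approximation of CSV parsing: the function name `count_fields` and the `.csv` filename test are assumptions made for illustration, and the FPAT pattern does not handle empty fields, escaped quotes, or newlines inside quoted fields the way `--csv` does:

```shell
# Hypothetical demo (name invented for this sketch): choose the
# field-splitting mode per input file in a gawk BEGINFILE rule.
count_fields() {
    gawk '
    BEGINFILE {
        if (FILENAME ~ /\.csv$/)
            FPAT = "([^,]+)|(\"[^\"]+\")"   # assigning FPAT switches to by-content splitting
        else
            FS = " "                         # assigning FS switches back to whitespace splitting
    }
    { print FILENAME ": " NF " fields" }
    ' "$@"
}
```

Calling `count_fields data.csv notes.txt` (both filenames hypothetical) would then split the first file on the CSV-like pattern and the second on whitespace.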

Regards,
Andy

On Fri, Nov 10, 2023 at 08:16:18AM -0600, Ed Morton wrote:
I'm loving the new "--csv" option for CSV parsing in gawk 5.3,
thanks, but I don't see anything in the documentation about
specifying the input is CSV from inside an awk script so, in case
I'm just not seeing it - is there any variable we can set or any
other way to specify the equivalent of "--csv" from inside an awk
script, e.g. in a BEGIN section, like we can set RS, FS, FPAT,
and/or FIELDWIDTHS for the existing types of input parsing?

     Ed.

