Re: [bug-gawk] Tentative CSV extension - please advise
From: Hermann Peifer
Subject: Re: [bug-gawk] Tentative CSV extension - please advise
Date: Tue, 15 Mar 2016 16:34:28 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
On 2016-03-15 15:54, Miriam English wrote:
>
> address@hidden wrote:
>
>> Yes. That's getting too messy. Instead such an extension should
>> simply define a csvsplit() function.
>>
>> The big problem is embedded newlines in the fields. Sigh. If not for
>> that we could deal with this issue much more easily...
>
> If the fields are preprocessed by turning
> "the quick
> brown fox"
> into
> "the quick\nbrown fox"
> then it becomes much easier. Then after fields are extracted the escaped
> characters can be returned to their literals.
>
> It would be slightly simplified using something similar to what sed
> recently has done in adding the -z option, where the input "line" is
> delimited by a zero byte instead of a newline. I often use that to
> process entire files as a single line, treating newlines as just another
> character.
>
> It reduces the problem to working out which newlines end a csv record
> and shouldn't be escaped.
>
>
I assume people know http://lorance.freeshell.org/csv/csv.txt, which
works fine, as far as I can tell from occasional usage.
The bigger problem, as I remember it, is quotes inside quotes.
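
To make the preprocessing idea quoted above concrete, here is a rough sketch, in Python rather than awk just to keep it short. It assumes quotes inside quoted fields are written as doubled quotes (""), so a newline is embedded exactly when an odd number of quote characters has been seen so far. Files that escape quotes as \" would break this assumption. The function names are made up for illustration and are not taken from csv.txt or from any gawk extension.

```python
def escape_embedded_newlines(text):
    """Turn newlines inside quoted CSV fields into the two characters \\n.

    Assumes embedded quotes are doubled (""), so toggling on every quote
    character keeps the in/out-of-quotes state correct.
    """
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes  # a doubled "" toggles twice: no net change
            out.append(ch)
        elif ch == "\n" and in_quotes:
            out.append("\\n")          # embedded newline: escape it
        elif ch == "\\":
            out.append("\\\\")         # escape backslashes so the step is reversible
        else:
            out.append(ch)             # includes newlines that end a record
    return "".join(out)


def unescape_field(field):
    """Undo the escaping on a single extracted field value."""
    out = []
    i = 0
    while i < len(field):
        if field[i] == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)


sample = 'id,text\n1,"the quick\nbrown fox"\n2,"she said ""hi"""\n'
escaped = escape_embedded_newlines(sample)
# Every remaining newline now ends a record, so line-based splitting is safe.
records = escaped.split("\n")
```

After this pass each record sits on one line, and the doubled quotes in the last record do not confuse the newline handling. Splitting the fields and collapsing "" into " are still left to do.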
Hermann