Re: [bug-gawk] Computed regex and getline bug / issue
From: Davide Brini
Subject: Re: [bug-gawk] Computed regex and getline bug / issue
Date: Sun, 4 May 2014 18:09:01 +0200

On Sun, 4 May 2014 17:31:55 +0800, Grail Dane <address@hidden> wrote:
> Hello
>
> As part of an exercise in displaying data from a file I have come across
> an issue which neither myself nor any of the good people at
> linuxquestions.org have been able to solve and believe it may be a bug
> within gawk. Using the following data as an input file:
>
> 1 , 2
> 3 , 4
> 5 , 6
> 7 , 8
> 9 , 10
>
> In case this does not display correctly, the format is -
> number space comma space number
>
> Using the following basic gawk we are able to return data as follows:
>
> $ awk '{print "|"$0"|"}' RS='[,\n]' file
> |1 |
> | 2|
> |3 |
> | 4|
> |5 |
> | 6|
> |7 |
> | 8|
> |9 |
> | 10|
>
> Pipes included to simply show white space. If we then use getline prior
> to our print we receive:
>
> $ awk '{getline;print "|"$0"|"}' RS='[,\n]' file
> | 2|
> | 4|
> | 6|
> | 8|
> | 10|
>
> Which again is all fine, however, if we then extend the RS computed regex
> to allow for spaces, our original output is the same but minus the spaces:
>
> $ awk '{print "|"$0"|"}' RS='[,\n ]+' file
> |1|
> |2|
> |3|
> |4|
> |5|
> |6|
> |7|
> |8|
> |9|
> |10|
>
> Again, as expected. Once we go back to our getline version where we
> expect to return every second record, we now see our 'bug':
>
> $ awk '{getline;print "|"$0"|"}' RS='[,\n ]+' f2
> |2|
> |4|
> |6|
> |8|
> |9| <-- This should have been |10|
>
> The thread for further discussion on this issue can be found here :-
> http://www.linuxquestions.org/questions/programming-9/peculiar-awk-behaviour-confusing-me-4175503599/
>
> Please advise if you should require any further information?
>
> Cheers
> Grail
>
I have been able to reduce the behavior to these simple test cases, which
(unless I'm missing something obvious) should behave identically but don't:
$ printf '1,2,' | gawk 'BEGIN{RS="[,]"}{print; a = getline; print "-"a"-";
print}'
1
-1-
2
$ printf '1,2,' | gawk 'BEGIN{RS="[,]+"}{print; a = getline; print "-"a"-";
print}'
1
-0-
1
That is, in the second case getline detects EOF and does not update $0.
Curiously, removing the square brackets works:
$ printf '1,2,' | gawk 'BEGIN{RS=","}{print; a = getline; print "-"a"-"; print}'
1
-1-
2
$ printf '1,2,' | gawk 'BEGIN{RS=",+"}{print; a = getline; print "-"a"-";
print}'
1
-1-
2
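
For anyone who wants to compare the four RS values side by side, the test
cases above can be folded into one small script. This is only a sketch: the
AWK variable and the try_rs helper are names made up for illustration, and RS
is set with -v rather than in a BEGIN block, which should be equivalent for
these patterns. Each line of output shows the return value of getline followed
by the current $0. No expected output is given because it depends on the gawk
version in use; on the version tested above, the '[,]+' case is the one where
getline returns 0 and $0 keeps the old record.

```shell
#!/bin/sh
# Use gawk if it is on the PATH, otherwise fall back to the system awk.
# (AWK is a hypothetical override variable, not something gawk itself reads.)
AWK=${AWK:-$(command -v gawk || echo awk)}

# Hypothetical helper: feed the same input through awk with the given RS and
# print "<getline return value>:<current record>" for each main-loop record.
try_rs() {
    printf '1,2,' | "$AWK" -v RS="$1" '{ r = getline; print r ":" $0 }'
}

# The four record separators from the test cases above.
for rs in '[,]' '[,]+' ',' ',+'; do
    printf 'RS=%s -> ' "$rs"
    try_rs "$rs"
done
```

Run it as-is, or with AWK=/path/to/other/gawk to compare builds.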
--
D.