bug-gawk

Re: [bug-gawk] gawk stops reading input at SUB character


From: Andrew J. Schorr
Subject: Re: [bug-gawk] gawk stops reading input at SUB character
Date: Tue, 12 Sep 2017 12:58:56 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

Hi,

On Tue, Sep 12, 2017 at 05:58:51PM +0300, Paavo Tamminen wrote:
> I have successfully used gawk with mixed text-and-binary content.
> 
> However, I ran into a problem: gawk stops reading the input file if there
> is a <SUB> character in the file. <SUB> is the 'substitute' control
> character, 0x1A in hex.
> 
> The input file (test.txt) has three lines, with
> <SUB> at line two:
> line 1 aA
> line 2 b<SUB>B
> line 3 cC
> 
> 
> At the Windows cmd prompt, the following shows output only up to the
> character b. So <SUB> seems to be treated as an end of file.
> 
> *gawk.exe "{print $0}" test.txt*
> line 1 aA
> line 2 b
> 
> *gawk.exe --version*
> GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.0-p8, GNU MP 5.0.2)
> 
> My gawk (gawk-4.1.4-w32-bin.zip) is loaded from
> https://sourceforge.net/projects/ezwinports/

I guess this is probably a Windows issue, since 0x1A (Ctrl-Z) typically means
EOF in DOS, if I recall correctly from the dark days of my youth.  I tested on
Linux and on Cygwin, and it works correctly on both:

bash-4.2$ gawk -l ordchr 'BEGIN {printf "aA\nb%sB\ncC\n", chr(0x1a)}' > test.txt
bash-4.2$ od -c -tx1 test.txt
0000000   a   A  \n   b 032   B  \n   c   C  \n
         61  41  0a  62  1a  42  0a  63  43  0a
0000012
bash-4.2$ gawk '{print}' test.txt | od -c -tx1
0000000   a   A  \n   b 032   B  \n   c   C  \n
         61  41  0a  62  1a  42  0a  63  43  0a
0000012

I have attached the input file test.txt. Can you please confirm that this
is the input that shows the problem?

Regards,
Andy

Attachment: test.txt
Description: Text document

