bug-gawk

[bug-gawk] Problem with substr() after match() with non-ASCII characters


From: Janis Papanagnou
Subject: [bug-gawk] Problem with substr() after match() with non-ASCII characters
Date: Sat, 22 Aug 2015 22:33:52 +0200

The issue was observed using GNU awk 4.1.2 and confirmed to show the
same behaviour in GNU awk 4.1.3.

With the attached program 'testprog' applied to the attached data 'testdata',
I do *not* get the expected result of four lines, each containing "2007";
instead I get:

  2007
  0703
  2007
  0071

The problem is caused/triggered by non-ASCII characters in 'testdata'.

Note: If I run 'testprog' with LC_ALL=C, the output is as expected.

My understanding is, though, that the implicit results from the match()
function, RSTART and RLENGTH, should be consistently usable in substr(),
independent of the locale setting.
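
As a minimal sketch of that expectation (this is not the attached 'testprog',
which is not reproduced here; the helper name extract_year, the four-digit
pattern, and the sample line are hypothetical), the usage in question looks
roughly like this:

```shell
# Hypothetical helper, NOT the attached 'testprog': print the first run of
# four digits on each input line, using only RSTART and RLENGTH from match().
# LC_ALL=C is pinned here because that is the setting reported to work.
extract_year() {
  LC_ALL=C awk 'match($0, /[0-9][0-9][0-9][0-9]/) {
    print substr($0, RSTART, RLENGTH)
  }'
}

# Made-up sample line with a non-ASCII character before the digits.
printf '%s\n' 'Zürich 2007 report' | extract_year   # prints 2007
```

Dropping the LC_ALL=C prefix should, by the expectation above, give the same
output in a UTF-8 locale, since match() and substr() are meant to count in the
same units.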

Thanks!

Janis

Attachment: testdata
Description: Binary data

Attachment: testprog
Description: Binary data

