From: Janis Papanagnou
Subject: [bug-gawk] Problem with substr() after match() with non-ASCII characters
Date: Sat, 22 Aug 2015 22:33:52 +0200

The issue was observed using GNU awk 4.1.2 and confirmed to show the
same behaviour in GNU awk 4.1.3.

With the attached program 'testprog' applied to the attached data
'testdata', I do *not* get the expected result of four lines containing
"2007" each; instead I get:

  2007
  0703
  2007
  0071

The problem is caused/triggered by non-ASCII characters in 'testdata'.

Note: If I run 'testprog' with LC_ALL=C, the output is as expected. My
understanding is, though, that the implicit results from the match()
function, RSTART and RLENGTH, should be consistently usable in substr(),
independent of the locale setting.

Thanks!

Janis

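
For readers without the attachments, the usage pattern in question can be sketched roughly as follows. This is not the attached 'testprog' or 'testdata': the input line and the four-digit pattern are made up for illustration, and the script runs awk under LC_ALL=C, the setting reported above to give the expected output.

```sh
#!/bin/sh
# Hypothetical stand-in for the attachments (not the real testprog/testdata).
# The input line has a UTF-8 "u-umlaut" (bytes \303\274) ahead of the year.
# match() sets RSTART and RLENGTH; substr() is then called with exactly
# those two values, which is the combination the report is about.
result=$(printf 'M\303\274ller 2007\n' | LC_ALL=C awk '
{
    # Assumed pattern: any run of four digits.
    if (match($0, /[0-9][0-9][0-9][0-9]/))
        print substr($0, RSTART, RLENGTH)
}')
echo "$result"
```

Removing the LC_ALL=C prefix runs the same code under the current locale, which should print the same text if match() and substr() count positions consistently.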
testdata
Description: Binary data
testprog
Description: Binary data