From: arnold
Subject: Re: [bug-gawk] third argument to split() always behaves like FS, and the docs are not clear about it
Date: Tue, 27 Nov 2018 08:32:35 -0700
User-agent: Heirloom mailx 12.4 7/29/08

Hi.

R <address@hidden> wrote:

> If the third argument to split() is made up of a single character, it
> will NOT be treated as a regular expression:
>
> [ ... ]
>
> This is (for me) completely expected, and it's the way it works in all
> awk implementations I know of.
>
> But neither the man page nor the info page of gawk is clear about it --
> they both describe the 3rd argument as a 'regular expression'.
> The info page goes into some detail about the case where it's a single
> space, but does not say anything about the case where it's a special
> character like '.' or '('.

I have updated the documentation (patch below).  This is already pushed
to git.
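
For illustration, here is a minimal sketch of the behavior the new wording
describes. It is not part of the patch. It assumes a POSIX shell with an awk
on the PATH, and split_demo is a hypothetical helper name introduced only
for this example.

```shell
#!/bin/sh
# split_demo is a hypothetical helper (not from the patch or from gawk):
# it splits string $1 on separator $2 using awk's split() and prints the
# number of pieces followed by each piece in brackets.
split_demo() {
    awk -v s="$1" -v sep="$2" 'BEGIN {
        n = split(s, parts, sep)
        out = n
        for (i = 1; i <= n; i++)
            out = out " [" parts[i] "]"
        print out
    }'
}

# A single "." is a literal dot, not the regex "any character".
split_demo "a.b.c" "."        # prints: 3 [a] [b] [c]

# No literal dot in the input, so nothing is split.
split_demo "axbxc" "."        # prints: 1 [axbxc]

# A lone "(" is a plain separator, not an unbalanced-parenthesis error.
split_demo "f(x(y" "("        # prints: 3 [f] [x] [y]

# Separators longer than one character are regular expressions.
split_demo "a1b22c" "[0-9]+"  # prints: 3 [a] [b] [c]
```

To split on a metacharacter as part of a longer separator, it still has to
be escaped or bracketed (for example "[.][.]"), since only the
single-character case is taken literally.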

Thanks,

Arnold
------------------------------
diff --git a/doc/gawk.1 b/doc/gawk.1
index a2448695..7bcef9d6 100644
--- a/doc/gawk.1
+++ b/doc/gawk.1
@@ -13,7 +13,7 @@
 .              if \w'\(rq' .ds rq "\(rq
 .      \}
 .\}
-.TH GAWK 1 "Apr 08 2018" "Free Software Foundation" "Utility Commands"
+.TH GAWK 1 "Nov 26 2018" "Free Software Foundation" "Utility Commands"
 .SH NAME
 gawk \- pattern scanning and processing language
 .SH SYNOPSIS
@@ -3031,7 +3031,7 @@ is the possibly null separator that appeared after
 The value of
 .B seps[0]
 is the possibly null leading separator.
-\&\fRIf
+If
 .I r
 is omitted,
 .B FPAT
@@ -3071,7 +3071,7 @@ between
 .BI a[ i ]
 and
 .BI a[ i +1]\fR.
-\&\fRIf
+If
 .I r
 is a single space, then leading whitespace in
 .I s
@@ -3084,6 +3084,10 @@ where
 is the return value of
 .BI split( s ", " a ", " r ", " seps )\fR.
 Splitting behaves identically to field splitting, described above.
+In particular, if
+.I r
+is a single-character string, that string acts as the separator,
+even if it happens to be a regular expression metacharacter.
 .TP
 .BI sprintf( fmt , " expr-list" )
 Print
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index dddcf673..51d9afeb 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -7704,7 +7704,6 @@ FPAT = "([^,]*)|(\"[^\"]+\")"
 Finally, the @code{patsplit()} function makes the same functionality
 available for splitting regular strings (@pxref{String Functions}).
 
-
 @node Testing field creation
 @section Checking How @command{gawk} Is Splitting Records
 
@@ -17678,8 +17677,8 @@ whitespace goes into @address@hidden@var{n}]}, where @var{n} is the
 return value of
 @code{split()} (i.e., the number of elements in @var{array}).
 
-The @code{split()} function splits strings into pieces in a
-manner similar to the way input lines are split into fields.  For example:
+The @code{split()} function splits strings into pieces in the same way
+that input lines are split into fields.  For example:
 
 @example
 split("cul-de-sac", a, "-", seps)
@@ -17715,6 +17714,8 @@ are separated by runs of whitespace.
 Also, as with input field splitting, if @var{fieldsep} is the null string, each
 individual character in the string is split into its own array element.
 @value{COMMONEXT}
+Additionally, if @var{fieldsep} is a single-character string, that string acts
+as the separator, even if its value is a regular expression metacharacter.
 
 Note, however, that @code{RS} has no effect on the way @code{split()}
 works. Even though @samp{RS = ""} causes the newline character to also be an input


