[SCM] gawk branch, gawk-5.2-stable, updated. gawk-4.1.0-5000-gc419ea07
From: Arnold Robbins
Subject: [SCM] gawk branch, gawk-5.2-stable, updated. gawk-4.1.0-5000-gc419ea07
Date: Sat, 25 Feb 2023 13:42:20 -0500 (EST)
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gawk".
The branch, gawk-5.2-stable has been updated
via c419ea07ec452effc347c089350202a3d9151bcc (commit)
from a908e81a6b4a41116e3268a915449881c9982209 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=c419ea07ec452effc347c089350202a3d9151bcc
commit c419ea07ec452effc347c089350202a3d9151bcc
Author: Arnold D. Robbins <arnold@skeeve.com>
Date: Sat Feb 25 20:41:56 2023 +0200
Improve the doc on input parsers.
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 076f60f6..cc2b35da 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,7 @@
+2023-02-25 Arnold D. Robbins <arnold@skeeve.com>
+
+ * gawktexi.in (Input Parsers): Clarify and improve some of the prose.
+
2023-02-24 Arnold D. Robbins <arnold@skeeve.com>
* gawktexi.in (Feature History): Add note about nonfatal I/O.
diff --git a/doc/gawk.info b/doc/gawk.info
index 2355ded3..a1e6043a 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -26647,9 +26647,10 @@ File: gawk.info, Node: Input Parsers, Next: Output Wrappers, Prev: Extension
.................................
By default, ‘gawk’ reads text files as its input. It uses the value of
-‘RS’ to find the end of the record, and then uses ‘FS’ (or ‘FIELDWIDTHS’
-or ‘FPAT’) to split it into fields (*note Reading Files::).
-Additionally, it sets the value of ‘RT’ (*note Built-in Variables::).
+‘RS’ to find the end of an input record, and then uses ‘FS’ (or
+‘FIELDWIDTHS’ or ‘FPAT’) to split it into fields (*note Reading
+Files::). Additionally, it sets the value of ‘RT’ (*note Built-in
+Variables::).
If you want, you can provide your own custom input parser. An input
parser’s job is to return a record to the ‘gawk’ record-processing code,
@@ -26733,8 +26734,9 @@ as follows:
The name of the file.
‘int fd;’
- A file descriptor for the file. If ‘gawk’ was able to open the
- file, then ‘fd’ will _not_ be equal to ‘INVALID_HANDLE’.
+ A file descriptor for the file. ‘gawk’ attempts to open the file
+ for reading using the ‘open()’ system call. If it was able to open
+ the file, then ‘fd’ will _not_ be equal to ‘INVALID_HANDLE’.
Otherwise, it will.
‘struct stat sbuf;’
@@ -26795,13 +26797,13 @@ records. The parameters are as follows:
‘char **out’
This is a pointer to a ‘char *’ variable that is set to point to
- the record. ‘gawk’ makes its own copy of the data, so the
+ the record. ‘gawk’ makes its own copy of the data, so your
extension must manage this storage.
‘struct awk_input *iobuf’
- This is the ‘awk_input_buf_t’ for the file. The fields should be
- used for reading data (‘fd’) and for managing private state
- (‘opaque’), if any.
+ This is the ‘awk_input_buf_t’ for the file. Two of its fields
+ should be used by your extension: ‘fd’ for reading data, and
+ ‘opaque’ for managing any private state.
‘int *errcode’
If an error occurs, ‘*errcode’ should be set to an appropriate code
@@ -26812,8 +26814,8 @@ records. The parameters are as follows:
If the concept of a “record terminator” makes sense, then
‘*rt_start’ should be set to point to the data to be used for ‘RT’,
and ‘*rt_len’ should be set to the length of the data. Otherwise,
- ‘*rt_len’ should be set to zero. ‘gawk’ makes its own copy of this
- data, so the extension must manage this storage.
+ ‘*rt_len’ should be set to zero. Here too, ‘gawk’ makes its own
+ copy of this data, so your extension must manage this storage.
‘const awk_fieldwidth_info_t **field_width’
If ‘field_width’ is not ‘NULL’, then ‘*field_width’ will be
@@ -26823,11 +26825,12 @@ records. The parameters are as follows:
copied by ‘gawk’; it must persist at least until the next call to
‘get_record’ or ‘close_func’. Note also that ‘field_width’ is
‘NULL’ when ‘getline’ is assigning the results to a variable, thus
- field parsing is not needed. If the parser does set
- ‘*field_width’, then ‘gawk’ uses this layout to parse the input
- record, and the ‘PROCINFO["FS"]’ value will be ‘"API"’ while this
- record is active in ‘$0’. The ‘awk_fieldwidth_info_t’ data
- structure is described below.
+ field parsing is not needed.
+
+ If the parser sets ‘*field_width’, then ‘gawk’ uses this layout to
+ parse the input record, and the ‘PROCINFO["FS"]’ value will be
+ ‘"API"’ while this record is active in ‘$0’. The
+ ‘awk_fieldwidth_info_t’ data structure is described below.
The return value is the length of the buffer pointed to by ‘*out’, or
‘EOF’ if end-of-file was reached or an error occurred.
@@ -26871,7 +26874,7 @@ extension does). Or you may want it to take effect based upon the value
of an ‘awk’ variable, as the XML extension from the ‘gawkextlib’ project
does (*note gawkextlib::). In the latter case, code in a ‘BEGINFILE’
rule can look at ‘FILENAME’ and ‘ERRNO’ to decide whether or not to
-activate an input parser (*note BEGINFILE/ENDFILE::).
+activate your input parser (*note BEGINFILE/ENDFILE::).
You register your input parser with the following function:
@@ -39768,140 +39771,140 @@ Node: Extension Functions1112908
Node: Exit Callback Functions1118484
Node: Extension Version String1119803
Node: Input Parsers1120498
-Node: Output Wrappers1133872
-Node: Two-way processors1138580
-Node: Printing Messages1140941
-Ref: Printing Messages-Footnote-11142155
-Node: Updating ERRNO1142310
-Node: Requesting Values1143109
-Ref: table-value-types-returned1143862
-Node: Accessing Parameters1144971
-Node: Symbol Table Access1146255
-Node: Symbol table by name1146771
-Ref: Symbol table by name-Footnote-11149982
-Node: Symbol table by cookie1150114
-Ref: Symbol table by cookie-Footnote-11154395
-Node: Cached values1154459
-Ref: Cached values-Footnote-11158103
-Node: Array Manipulation1158260
-Ref: Array Manipulation-Footnote-11159363
-Node: Array Data Types1159400
-Ref: Array Data Types-Footnote-11162222
-Node: Array Functions1162322
-Node: Flattening Arrays1167351
-Node: Creating Arrays1174403
-Node: Redirection API1179253
-Node: Extension API Variables1182274
-Node: Extension Versioning1182999
-Ref: gawk-api-version1183436
-Node: Extension GMP/MPFR Versioning1185224
-Node: Extension API Informational Variables1186930
-Node: Extension API Boilerplate1188091
-Node: Changes from API V11192227
-Node: Finding Extensions1193861
-Node: Extension Example1194436
-Node: Internal File Description1195260
-Node: Internal File Ops1199584
-Ref: Internal File Ops-Footnote-11211142
-Node: Using Internal File Ops1211290
-Ref: Using Internal File Ops-Footnote-11213721
-Node: Extension Samples1213999
-Node: Extension Sample File Functions1215568
-Node: Extension Sample Fnmatch1223706
-Node: Extension Sample Fork1225301
-Node: Extension Sample Inplace1226577
-Node: Extension Sample Ord1230249
-Node: Extension Sample Readdir1231125
-Ref: table-readdir-file-types1232022
-Node: Extension Sample Revout1233160
-Node: Extension Sample Rev2way1233757
-Node: Extension Sample Read write array1234509
-Node: Extension Sample Readfile1237783
-Node: Extension Sample Time1238914
-Node: Extension Sample API Tests1241204
-Node: gawkextlib1241712
-Node: Extension summary1244748
-Node: Extension Exercises1248606
-Node: Language History1249884
-Node: V7/SVR3.11251598
-Node: SVR41253948
-Node: POSIX1255480
-Node: BTL1256905
-Node: POSIX/GNU1257674
-Node: Feature History1264205
-Node: Common Extensions1283323
-Node: Ranges and Locales1284692
-Ref: Ranges and Locales-Footnote-11289493
-Ref: Ranges and Locales-Footnote-21289520
-Ref: Ranges and Locales-Footnote-31289759
-Node: Contributors1289982
-Node: History summary1296187
-Node: Installation1297633
-Node: Gawk Distribution1298597
-Node: Getting1299089
-Node: Extracting1300088
-Node: Distribution contents1301800
-Node: Unix Installation1309880
-Node: Quick Installation1310702
-Node: Compiling with MPFR1313248
-Node: Shell Startup Files1313954
-Node: Additional Configuration Options1315111
-Node: Configuration Philosophy1317498
-Node: Compiling from Git1320000
-Node: Building the Documentation1320559
-Node: Non-Unix Installation1321971
-Node: PC Installation1322447
-Node: PC Binary Installation1323320
-Node: PC Compiling1324225
-Node: PC Using1325403
-Node: Cygwin1329131
-Node: MSYS1330387
-Node: OpenVMS Installation1331019
-Node: OpenVMS Compilation1331700
-Ref: OpenVMS Compilation-Footnote-11333183
-Node: OpenVMS Dynamic Extensions1333245
-Node: OpenVMS Installation Details1334881
-Node: OpenVMS Running1337316
-Node: OpenVMS GNV1341453
-Node: Bugs1342208
-Node: Bug definition1343132
-Node: Bug address1346783
-Node: Usenet1350374
-Node: Performance bugs1351605
-Node: Asking for help1354623
-Node: Maintainers1356614
-Node: Other Versions1357641
-Node: Installation summary1366573
-Node: Notes1367957
-Node: Compatibility Mode1368767
-Node: Additions1369589
-Node: Accessing The Source1370534
-Node: Adding Code1372069
-Node: New Ports1379205
-Node: Derived Files1383715
-Ref: Derived Files-Footnote-11389562
-Ref: Derived Files-Footnote-21389597
-Ref: Derived Files-Footnote-31390214
-Node: Future Extensions1390328
-Node: Implementation Limitations1391000
-Node: Extension Design1392242
-Node: Old Extension Problems1393406
-Ref: Old Extension Problems-Footnote-11394982
-Node: Extension New Mechanism Goals1395043
-Ref: Extension New Mechanism Goals-Footnote-11398539
-Node: Extension Other Design Decisions1398740
-Node: Extension Future Growth1400939
-Node: Notes summary1401563
-Node: Basic Concepts1402776
-Node: Basic High Level1403461
-Ref: figure-general-flow1403743
-Ref: figure-process-flow1404445
-Ref: Basic High Level-Footnote-11407841
-Node: Basic Data Typing1408030
-Node: Glossary1411448
-Node: Copying1444570
-Node: GNU Free Documentation License1482331
-Node: Index1507654
+Node: Output Wrappers1133990
+Node: Two-way processors1138698
+Node: Printing Messages1141059
+Ref: Printing Messages-Footnote-11142273
+Node: Updating ERRNO1142428
+Node: Requesting Values1143227
+Ref: table-value-types-returned1143980
+Node: Accessing Parameters1145089
+Node: Symbol Table Access1146373
+Node: Symbol table by name1146889
+Ref: Symbol table by name-Footnote-11150100
+Node: Symbol table by cookie1150232
+Ref: Symbol table by cookie-Footnote-11154513
+Node: Cached values1154577
+Ref: Cached values-Footnote-11158221
+Node: Array Manipulation1158378
+Ref: Array Manipulation-Footnote-11159481
+Node: Array Data Types1159518
+Ref: Array Data Types-Footnote-11162340
+Node: Array Functions1162440
+Node: Flattening Arrays1167469
+Node: Creating Arrays1174521
+Node: Redirection API1179371
+Node: Extension API Variables1182392
+Node: Extension Versioning1183117
+Ref: gawk-api-version1183554
+Node: Extension GMP/MPFR Versioning1185342
+Node: Extension API Informational Variables1187048
+Node: Extension API Boilerplate1188209
+Node: Changes from API V11192345
+Node: Finding Extensions1193979
+Node: Extension Example1194554
+Node: Internal File Description1195378
+Node: Internal File Ops1199702
+Ref: Internal File Ops-Footnote-11211260
+Node: Using Internal File Ops1211408
+Ref: Using Internal File Ops-Footnote-11213839
+Node: Extension Samples1214117
+Node: Extension Sample File Functions1215686
+Node: Extension Sample Fnmatch1223824
+Node: Extension Sample Fork1225419
+Node: Extension Sample Inplace1226695
+Node: Extension Sample Ord1230367
+Node: Extension Sample Readdir1231243
+Ref: table-readdir-file-types1232140
+Node: Extension Sample Revout1233278
+Node: Extension Sample Rev2way1233875
+Node: Extension Sample Read write array1234627
+Node: Extension Sample Readfile1237901
+Node: Extension Sample Time1239032
+Node: Extension Sample API Tests1241322
+Node: gawkextlib1241830
+Node: Extension summary1244866
+Node: Extension Exercises1248724
+Node: Language History1250002
+Node: V7/SVR3.11251716
+Node: SVR41254066
+Node: POSIX1255598
+Node: BTL1257023
+Node: POSIX/GNU1257792
+Node: Feature History1264323
+Node: Common Extensions1283441
+Node: Ranges and Locales1284810
+Ref: Ranges and Locales-Footnote-11289611
+Ref: Ranges and Locales-Footnote-21289638
+Ref: Ranges and Locales-Footnote-31289877
+Node: Contributors1290100
+Node: History summary1296305
+Node: Installation1297751
+Node: Gawk Distribution1298715
+Node: Getting1299207
+Node: Extracting1300206
+Node: Distribution contents1301918
+Node: Unix Installation1309998
+Node: Quick Installation1310820
+Node: Compiling with MPFR1313366
+Node: Shell Startup Files1314072
+Node: Additional Configuration Options1315229
+Node: Configuration Philosophy1317616
+Node: Compiling from Git1320118
+Node: Building the Documentation1320677
+Node: Non-Unix Installation1322089
+Node: PC Installation1322565
+Node: PC Binary Installation1323438
+Node: PC Compiling1324343
+Node: PC Using1325521
+Node: Cygwin1329249
+Node: MSYS1330505
+Node: OpenVMS Installation1331137
+Node: OpenVMS Compilation1331818
+Ref: OpenVMS Compilation-Footnote-11333301
+Node: OpenVMS Dynamic Extensions1333363
+Node: OpenVMS Installation Details1334999
+Node: OpenVMS Running1337434
+Node: OpenVMS GNV1341571
+Node: Bugs1342326
+Node: Bug definition1343250
+Node: Bug address1346901
+Node: Usenet1350492
+Node: Performance bugs1351723
+Node: Asking for help1354741
+Node: Maintainers1356732
+Node: Other Versions1357759
+Node: Installation summary1366691
+Node: Notes1368075
+Node: Compatibility Mode1368885
+Node: Additions1369707
+Node: Accessing The Source1370652
+Node: Adding Code1372187
+Node: New Ports1379323
+Node: Derived Files1383833
+Ref: Derived Files-Footnote-11389680
+Ref: Derived Files-Footnote-21389715
+Ref: Derived Files-Footnote-31390332
+Node: Future Extensions1390446
+Node: Implementation Limitations1391118
+Node: Extension Design1392360
+Node: Old Extension Problems1393524
+Ref: Old Extension Problems-Footnote-11395100
+Node: Extension New Mechanism Goals1395161
+Ref: Extension New Mechanism Goals-Footnote-11398657
+Node: Extension Other Design Decisions1398858
+Node: Extension Future Growth1401057
+Node: Notes summary1401681
+Node: Basic Concepts1402894
+Node: Basic High Level1403579
+Ref: figure-general-flow1403861
+Ref: figure-process-flow1404563
+Ref: Basic High Level-Footnote-11407959
+Node: Basic Data Typing1408148
+Node: Glossary1411566
+Node: Copying1444688
+Node: GNU Free Documentation License1482449
+Node: Index1507772
End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index c4337378..15c343f0 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -36855,7 +36855,7 @@ is invoked with the @option{--version} option.
@cindex customized input parser
By default, @command{gawk} reads text files as its input. It uses the value
-of @code{RS} to find the end of the record, and then uses @code{FS}
+of @code{RS} to find the end of an input record, and then uses @code{FS}
(or @code{FIELDWIDTHS} or @code{FPAT}) to split it into fields (@pxref{Reading Files}).
Additionally, it sets the value of @code{RT} (@pxref{Built-in Variables}).
@@ -36957,8 +36957,9 @@ are as follows:
The name of the file.
@item int fd;
-A file descriptor for the file. If @command{gawk} was able to
-open the file, then @code{fd} will @emph{not} be equal to
+A file descriptor for the file. @command{gawk} attempts to open
+the file for reading using the @code{open()} system call. If it was
+able to open the file, then @code{fd} will @emph{not} be equal to
@code{INVALID_HANDLE}. Otherwise, it will.
@item struct stat sbuf;
@@ -37026,12 +37027,12 @@ input records. The parameters are as follows:
@item char **out
This is a pointer to a @code{char *} variable that is set to point
to the record. @command{gawk} makes its own copy of the data, so
-the extension must manage this storage.
+your extension must manage this storage.
@item struct awk_input *iobuf
-This is the @code{awk_input_buf_t} for the file. The fields should be
-used for reading data (@code{fd}) and for managing private state
-(@code{opaque}), if any.
+This is the @code{awk_input_buf_t} for the file. Two of its fields should
+be used by your extension: @code{fd} for reading data, and @code{opaque}
+for managing any private state.
@item int *errcode
If an error occurs, @code{*errcode} should be set to an appropriate
@@ -37043,7 +37044,7 @@ If the concept of a ``record terminator'' makes sense, then
@code{*rt_start} should be set to point to the data to be used for
@code{RT}, and @code{*rt_len} should be set to the length of the
data. Otherwise, @code{*rt_len} should be set to zero.
-@command{gawk} makes its own copy of this data, so the
+Here too, @command{gawk} makes its own copy of this data, so your
extension must manage this storage.
@item const awk_fieldwidth_info_t **field_width
@@ -37054,7 +37055,9 @@ field parsing mechanism. Note that this structure will not
be copied by @command{gawk}; it must persist at least until the next call
to @code{get_record} or @code{close_func}. Note also that @code{field_width} is
@code{NULL} when @code{getline} is assigning the results to a variable, thus
-field parsing is not needed. If the parser does set @code{*field_width},
+field parsing is not needed.
+
+If the parser sets @code{*field_width},
then @command{gawk} uses this layout to parse the input record,
and the @code{PROCINFO["FS"]} value will be @code{"API"} while this record
is active in @code{$0}.
@@ -37108,7 +37111,7 @@ based upon the value of an @command{awk} variable, as the XML extension
from the @code{gawkextlib} project does (@pxref{gawkextlib}).
In the latter case, code in a @code{BEGINFILE} rule
can look at @code{FILENAME} and @code{ERRNO} to decide whether or
-not to activate an input parser (@pxref{BEGINFILE/ENDFILE}).
+not to activate your input parser (@pxref{BEGINFILE/ENDFILE}).
You register your input parser with the following function:
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index 47657013..5e1affc9 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -35771,7 +35771,7 @@ is invoked with the @option{--version} option.
@cindex customized input parser
By default, @command{gawk} reads text files as its input. It uses the value
-of @code{RS} to find the end of the record, and then uses @code{FS}
+of @code{RS} to find the end of an input record, and then uses @code{FS}
(or @code{FIELDWIDTHS} or @code{FPAT}) to split it into fields (@pxref{Reading Files}).
Additionally, it sets the value of @code{RT} (@pxref{Built-in Variables}).
@@ -35873,8 +35873,9 @@ are as follows:
The name of the file.
@item int fd;
-A file descriptor for the file. If @command{gawk} was able to
-open the file, then @code{fd} will @emph{not} be equal to
+A file descriptor for the file. @command{gawk} attempts to open
+the file for reading using the @code{open()} system call. If it was
+able to open the file, then @code{fd} will @emph{not} be equal to
@code{INVALID_HANDLE}. Otherwise, it will.
@item struct stat sbuf;
@@ -35942,12 +35943,12 @@ input records. The parameters are as follows:
@item char **out
This is a pointer to a @code{char *} variable that is set to point
to the record. @command{gawk} makes its own copy of the data, so
-the extension must manage this storage.
+your extension must manage this storage.
@item struct awk_input *iobuf
-This is the @code{awk_input_buf_t} for the file. The fields should be
-used for reading data (@code{fd}) and for managing private state
-(@code{opaque}), if any.
+This is the @code{awk_input_buf_t} for the file. Two of its fields should
+be used by your extension: @code{fd} for reading data, and @code{opaque}
+for managing any private state.
@item int *errcode
If an error occurs, @code{*errcode} should be set to an appropriate
@@ -35959,7 +35960,7 @@ If the concept of a ``record terminator'' makes sense, then
@code{*rt_start} should be set to point to the data to be used for
@code{RT}, and @code{*rt_len} should be set to the length of the
data. Otherwise, @code{*rt_len} should be set to zero.
-@command{gawk} makes its own copy of this data, so the
+Here too, @command{gawk} makes its own copy of this data, so your
extension must manage this storage.
@item const awk_fieldwidth_info_t **field_width
@@ -35970,7 +35971,9 @@ field parsing mechanism. Note that this structure will not
be copied by @command{gawk}; it must persist at least until the next call
to @code{get_record} or @code{close_func}. Note also that @code{field_width} is
@code{NULL} when @code{getline} is assigning the results to a variable, thus
-field parsing is not needed. If the parser does set @code{*field_width},
+field parsing is not needed.
+
+If the parser sets @code{*field_width},
then @command{gawk} uses this layout to parse the input record,
and the @code{PROCINFO["FS"]} value will be @code{"API"} while this record
is active in @code{$0}.
@@ -36024,7 +36027,7 @@ based upon the value of an @command{awk} variable, as the XML extension
from the @code{gawkextlib} project does (@pxref{gawkextlib}).
In the latter case, code in a @code{BEGINFILE} rule
can look at @code{FILENAME} and @code{ERRNO} to decide whether or
-not to activate an input parser (@pxref{BEGINFILE/ENDFILE}).
+not to activate your input parser (@pxref{BEGINFILE/ENDFILE}).
You register your input parser with the following function:
-----------------------------------------------------------------------
Summary of changes:
doc/ChangeLog | 4 +
doc/gawk.info | 305 ++++++++++++++++++++++++++++----------------------------
doc/gawk.texi | 23 +++--
doc/gawktexi.in | 23 +++--
4 files changed, 184 insertions(+), 171 deletions(-)
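
Illustrative sketch (not part of the commit): the get_record contract that the revised text describes can be approximated as below. The extension reads from fd, keeps private state in opaque, owns the storage for the record and for RT, and returns the record length or EOF. Everything named demo_* is a hypothetical stand-in introduced only for this example and is not part of gawkapi.h; the field_width parameter is omitted, and records longer than the buffer are simply split. A real extension would include gawkapi.h and use awk_input_buf_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the two awk_input_buf_t fields the text names. */
struct demo_input {
    int fd;        /* descriptor to read data from */
    void *opaque;  /* private parser state */
};

/* Private state: the extension owns this storage, because the caller
   only copies the record and RT data out of it. */
struct demo_state {
    char buf[4096];
    size_t len;    /* bytes currently buffered */
    size_t used;   /* bytes consumed by the previous record, incl. newline */
};

/* Sketch of a get_record-style function for newline-terminated records. */
static int demo_get_record(char **out, struct demo_input *iobuf, int *errcode,
                           char **rt_start, size_t *rt_len)
{
    struct demo_state *st = iobuf->opaque;
    char *nl;
    ssize_t n;

    /* Discard the record handed out by the previous call. */
    memmove(st->buf, st->buf + st->used, st->len - st->used);
    st->len -= st->used;
    st->used = 0;

    /* Read until a newline is buffered, the buffer is full, or input ends. */
    while ((nl = memchr(st->buf, '\n', st->len)) == NULL
           && st->len < sizeof st->buf) {
        n = read(iobuf->fd, st->buf + st->len, sizeof st->buf - st->len);
        if (n < 0) {
            *errcode = errno;   /* report the failure to the caller */
            return EOF;
        }
        if (n == 0)
            break;              /* end of input */
        st->len += (size_t) n;
    }

    if (st->len == 0)
        return EOF;             /* nothing left to return */

    *out = st->buf;
    if (nl != NULL) {
        *rt_start = nl;         /* the newline is the record terminator */
        *rt_len = 1;
        st->used = (size_t) (nl - st->buf) + 1;
        return (int) (nl - st->buf);
    }
    *rt_len = 0;                /* last record had no terminator */
    st->used = st->len;
    return (int) st->len;
}

/* Demo driver: feed a short text through a pipe and count the records. */
static int demo_count_records(const char *text)
{
    int fds[2], count = 0, errcode = 0;
    struct demo_state st = { .len = 0, .used = 0 };
    struct demo_input in;
    char *rec, *rt;
    size_t rt_len;

    assert(text != NULL);
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], text, strlen(text)) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);              /* signal end of input to the reader */

    in.fd = fds[0];
    in.opaque = &st;
    while (demo_get_record(&rec, &in, &errcode, &rt, &rt_len) != EOF)
        count++;
    close(fds[0]);
    return count;
}
```

The driver only suits short inputs that fit in the pipe buffer. Setting rt_len to zero for an unterminated final record follows the rule in the text for input where no record terminator applies.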
hooks/post-receive
--
gawk