[Octave-bug-tracker] [bug #52892] textread incorrectly reads a text file
From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #52892] textread incorrectly reads a text file when empty lines are present
Date: Mon, 15 Jan 2018 03:21:46 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Follow-up Comment #1, bug #52892 (project octave):
There has been quite a bit of activity concerning textscan, textread, etc. As
you mentioned, there is an internal C++ routine that is used by script files.
I forget exactly which is which, but some bug reports are here:
https://savannah.gnu.org/bugs/index.php?52550
https://savannah.gnu.org/bugs/index.php?52479
However, this textread/textscan code is one voluminous piece of code that has
to account for many different scenarios, and there is a good chance that more
adjustments need to be made.
I have the latest development code and can test your demos here. You mention
a snippet, so I'm assuming you manually removed a big chunk of the output,
because I see many more lines. I too will snip the results, to the first
ten. I see:
octave:1> [a,b]=textread('cruise_params_with_empty_lines.cfg','%s%s','Delimiter','=','CommentStyle','#');
octave:2> a
a =
{
[1,1] =
[2,1] = .
[3,1] = AB1705
[4,1] =
[5,1] =
[6,1] = 2017
[7,1] = 0
[8,1] = 1
[9,1] = 0
[10,1] = psc
***SNIP***
[123,1] = nan
}
octave:3> b
b =
{
[1,1] = working_directory
[2,1] = cruise_id
[3,1] = cruise_id_prefix
[4,1] = cruise_id_suffix
[5,1] = correct_year
[6,1] = use_mat_for_nav
[7,1] = make_nav
[8,1] = use_sadcp
[9,1] = print_formats
[10,1] = remove_zctd_downcast
***SNIP***
[22,1] = position_fixed
***SNIP***
[122,1] = beam2earth_bad_down_beam
[123,1] =
}
octave:4> [a,b]=textread('cruise_params_no_empty_lines.cfg','%s%s','Delimiter','=','CommentStyle','#');
error: str(0): subscripts must be either integers 1 to (2^63)-1 or logicals
error: called from
strread at line 446 column 5
textread at line 249 column 31
You mentioned that the only difference between these files is the blank lines.
However, when I do a diff comparison ignoring white space, I see the
following differences:
linux@ ~/octave/bug/52892 $ diff cruise_params_no_empty_lines.cfg cruise_params_with_empty_lines.cfg -wu
--- cruise_params_no_empty_lines.cfg 2018-01-15 01:11:56.643370310 -0600
+++ cruise_params_with_empty_lines.cfg 2018-01-15 01:11:36.559370114 -0600
@@ -4,12 +4,13 @@
# Any line that starts with a "#" will be ignored. Don't add comments #
# after a variable because this can mess up the parsing of this file #
# in some versions of Matlab and Octave. #
-# Using the "percent" percent symbol in comments on a line before a line #
+# Using the "%" percent symbol in comments on a line before a line #
# with a variable to read can cause that variable to be ignored. #
-# Avoid using "percent" #
+# Avoid using "%" #
# #
# #
#######################################################################
+
#######################################################################
# to process the cast in the current directory, set this variable to #
# "." without quotes. All the paths in the script should be #
@@ -724,3 +725,4 @@
beam2earth_bad_down_beam=nan
# #
#######################################################################
+
Is there an ancillary bug here that you ran across?
The reason the values are shifted is that the one added empty line is being
read as though it were an entry. Take a look at a[1,1]; it's blank. So the
first non-blank, non-comment item ends up in b[1,1]. Perhaps that is wrong
behavior; I don't know. I've tried adding the option
...,"whitespace","\n") and that seems to have no effect. The documentation
indicates:
octave:24> help strread
'strread' is a function from the file
/home/sebald/octave/octave/octave/scripts/io/strread.m
-- [A, ...] = strread (STR)
-- [A, ...] = strread (STR, FORMAT)
-- [A, ...] = strread (STR, FORMAT, FORMAT_REPEAT)
-- [A, ...] = strread (STR, FORMAT, PROP1, VALUE1, ...)
-- [A, ...] = strread (STR, FORMAT, FORMAT_REPEAT, PROP1, VALUE1,
...)
Read data from a string.
***SNIP***
"delimiter"
Any character in VALUE will be used to split STR into words
(default value = any whitespace). Note that whitespace is
implicitly added to the set of delimiter characters unless a
"%s" format conversion specifier is supplied; see "whitespace"
parameter below. The set of delimiter characters cannot be
empty; if needed Octave substitutes a space as delimiter.
***SNIP***
"whitespace"
Any character in VALUE will be interpreted as whitespace and
trimmed; the string defining whitespace must be enclosed in
double quotes for proper processing of special characters like
"\t". In each data field, multiple consecutive whitespace
characters are collapsed into one space and leading and
trailing whitespace is removed. The default value for
whitespace is " \b\r\n\t" (note the space). Whitespace is
always added to the set of delimiter characters unless at
least one "%s" format conversion specifier is supplied; in
that case only whitespace explicitly specified in "delimiter"
is retained as delimiter and removed from the set of
whitespace characters. If whitespace characters are to be
kept as-is (in e.g., strings), specify an empty value (i.e.,
"") for "whitespace"; obviously, whitespace cannot be a
delimiter then.
I think the reason that the blank line is being treated as an item (i.e.,
a[1,1]) is the fact that your example uses at least one %s. The %s means a
string, and it is conceivable that that string is empty. Hence any blank
lines are considered empty strings, I guess.
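To illustrate the shift, here is a minimal sketch of how one extra empty
token changes which column each item lands in. It is not the actual strread
code; the token list is a hypothetical stand-in for what the parser might
produce from the first few lines of the file, and dealing tokens out
cyclically to the two '%s' columns is my assumption about the behavior.

```octave
% Hypothetical token stream: the blank line contributes one empty token
% ahead of the first key/value pair (an assumption, not strread internals).
tokens = {'', 'working_directory', '.', 'cruise_id', 'AB1705'};

% Pad to a multiple of the number of format specifiers ('%s%s' -> 2).
ncols = 2;
npad = mod (-numel (tokens), ncols);
tokens = [tokens, repmat({''}, 1, npad)];

% Deal the tokens out cyclically: odd positions to a, even positions to b.
a = tokens(1:ncols:end)';
b = tokens(2:ncols:end)';

disp (a);
disp (b);
```

With the empty token in front, the keys end up in b and the values in a,
which is the same pattern as in the output above.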
I'm not sure there is a bug here. What do you think?
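If the blank-line handling turns out to be intended, one possible workaround
is to skip textread and parse the file line by line. The following is only a
sketch, assuming every data line has the form key=value and comments start
with "#" in the first non-blank column. The sample file contents are made up
for illustration and are not the reporter's file.

```octave
% Write a small sample file (illustrative contents only).
fname = [tempname() '.cfg'];
fid = fopen (fname, 'w');
fprintf (fid, '# comment line\n\nworking_directory=.\ncruise_id=AB1705\ncruise_id_prefix=\n');
fclose (fid);

keys = {};
vals = {};
fid = fopen (fname, 'r');
while (true)
  txt = fgetl (fid);
  if (~ischar (txt))      % fgetl returns -1 at end of file
    break;
  end
  txt = strtrim (txt);
  if (isempty (txt) || txt(1) == '#')
    continue;             % skip blank lines and comment lines
  end
  [key, rest] = strtok (txt, '=');
  keys{end+1, 1} = strtrim (key);
  vals{end+1, 1} = strtrim (rest(2:end));   % drop the leading "="
end
fclose (fid);
delete (fname);
```

Because blank lines are dropped before splitting, keys and vals stay aligned
whether or not the file contains empty lines.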
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?52892>