Re: newline in strread format

octave-maintainers

From:	Philip Nienhuis
Subject:	Re: newline in strread format
Date:	Tue, 24 Jun 2014 20:49:03 +0200
User-agent:	Mozilla/5.0 (Windows NT 5.1; rv:29.0) Gecko/20100101 Firefox/29.0 SeaMonkey/2.26.1

John W. Eaton wrote:

I think the following should work, but it's throwing an error:

   octave:1> [a, b, c] = strread ("1 2 3\n4 5 6\n7 8 9\n", "%f %f %f\n")
   strread: FORMAT does not match data
   error: called from 'strread' in file /home/jwe/src/octave-stable/scripts/io/strread.m near line 745, column 13
   error: called from:
   error:   /home/jwe/src/octave-stable/scripts/io/strread.m at line 755, column 9

I took a look at strread.m but I'm not sure what the right approach is
for a fix.  It appears to me that newlines are removed from the string
and that it is split on the field delimiter.  But the newline
character remains in the list of format specifiers.

I noticed the problem in the stable sources, but it seems to also be
present in the current sources on the default branch.

Any clues would be much appreciated.

As I wrote most of the strread.m code processing this stuff, I suppose I'm the one to have a look at it.

But just to be sure before I give it a go: is this undocumented, or at least obscure, ML behavior we have at hand here?

AFAIU the Matlab docs, any non-[format conversion specifier] in the format string is to be treated as a literal. That is the reason strread.m retains it in the format specifier list.

Experimenting a bit with ML r2014a shows that Matlab does this too.

Now, strread.m removes the "regular delimiters" (the ones specified by the user or the default ones) long before it processes literals; even though literals can be interpreted as just another delimiter (but then again, only in positions/columns in the data specified in the format string).
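
To make the mismatch concrete, here is a simplified model of that processing order. It is only a sketch and not the actual strread.m code; the variable names (data, fmt, fields, fmt_items) are made up for illustration, and the default whitespace is assumed to be space and newline.

```octave
% Simplified model of the processing order described above.
% This is NOT the actual strread.m code; all names are hypothetical.
data = "1 2 3\n4 5 6\n7 8 9\n";
fmt  = "%f %f %f\n";

% Step 1: whitespace (assumed here: space and newline) acts as a
% delimiter and disappears while the data is split into fields.
fields = strsplit (strtrim (data), {" ", "\n"});

% Step 2: the newline in the format is not a conversion specifier,
% so it is kept as a literal in the list of format items.
fmt_items = regexp (fmt, '%f|\n', 'match');

% The data now holds 9 fields, but one pass over the format expects
% 3 numbers followed by a newline literal that is no longer in the data.
printf ("%d data fields, %d format items\n", numel (fields), numel (fmt_items));
```

In this model the newline literal can never be matched, because the newlines were already consumed in step 1.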

There are a few possible strategies (assuming this issue you brought up only pertains to valid delimiters specified as a literal):

1. strread.m could scan the format string and remove any delimiters it finds there from the list of delimiters.

2. Or it could just remove it from the list of literals and add it to the delimiters list (probably the easiest).

3. strread.m could first strrep literals in the data string into a valid delimiter - a costly operation on big data strings (/-files). And this wouldn't honor that literals should only be processed as such in data "columns" specified in the format string (in other positions they should be processed as a string value).
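
A rough sketch of what strategy 2 could look like, outside of strread.m. Everything here is an assumption for illustration: the variable names, the set of characters accepted as valid delimiters, and the restriction to %f, %d and %s conversions. It ignores the corner cases around overlapping whitespace and delimiter sets.

```octave
% Hypothetical sketch of strategy 2: move literals that are valid
% delimiters from the format item list to the delimiter list.
data       = "1 2 3\n4 5 6\n7 8 9\n";
fmt        = "%f %f %f\n";
delimiters = {" "};

% Assumed set of characters that count as valid delimiters.
valid_delims = {" ", "\n", "\t", ","};

% Split the format into conversion specifiers and literals
% (simplified: only %f, %d and %s are recognized).
items = regexp (fmt, '%[fds]|[^% ]+', 'match');

% A literal that is also a valid delimiter is moved to the delimiters.
is_conv    = strncmp (items, "%", 1);
moved      = ~is_conv & ismember (items, valid_delims);
delimiters = [delimiters, items(moved)];
items(moved) = [];

% With the newline now a delimiter, the fields line up with the format.
fields = strsplit (strtrim (data), delimiters);
values = reshape (str2double (fields), numel (items), []).';
a = values(:,1);
b = values(:,2);
c = values(:,3);
```

Under these assumptions each column of values corresponds to one conversion specifier, which is the result the original example appears to expect.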


1. and 2. are easy mods, but maybe too quick & dirty.

However, due care is required for, e.g., overlapping sets of whitespace and delimiters. I think experimenting with ML to find out the proper behavior is required, plus a set of tests for Octave. I do not have time for that currently.


Either way, I see quite a few pitfalls and corner cases looming.

Given the fact that strread.m is already a bit of a dinosaur and we're actually waiting for a binary textscan, I wonder whether it is worthwhile spending much time fixing this issue - any opinion on that?


Philip
