Re: strread.m
From: Philip Nienhuis
Subject: Re: strread.m
Date: Wed, 03 Aug 2011 21:47:32 +0200
John W. Eaton wrote:
> On 3-Aug-2011, Philip Nienhuis wrote:
>
> | John W. Eaton wrote:
> |
> |> to have many more format options. So why handle textscan with
> |> strread?
> |
> | Because Octave's textscan has been written that way.
> | Perhaps the thinking was along the lines of "there's a scripted
> | strread.m available; a binary strread replacement can easily be swapped
> | in as soon as there is one."
> | Ben might be able to tell you more (he is the author of textscan).
> | Anyway, would there be a problem in extending (parts of) Octave's
> | strread (and textread) versatility beyond that of Matlab's? I guess not.
>
> The Matlab docs say that textscan is intended to replace both textread
> and strread. And since textscan seems more versatile than either of
> the obsolete functions, it seems better to me to write a complete
> textscan implementation in C++ and then perhaps try to use that to
> implement strread and textread. Though I think there may be problems
> even with that. For example, what does Matlab do with
>
>   [a, b] = strread ('1 8 1', '%u8 %u');

a =
1
b =
1

> vs
>
>   a = textscan ('1 8 1', '%u8 %u');

a =
[2x1 uint8][8]
a{1} =
1
1
a{2} =
8

> ? I expect that in the first case, it will skip reading the 8 and
> return a and b as double values,

Indeed.

> while in the second it will read all
> three values and return the first and third as uint8 values and the
> second as a uint32 value.

Yep.

> If so, then I don't know how you would take
> the format that is passed to strread and convert it to something that
> textscan can use to obtain the same result as strread.

Yeah, that is an unavoidable strread deficiency, caused by strictly
ignoring whitespace: strread accepts that literals "cuddled" against a
conversion in the format string match literals that stand apart in the
file (or string). But I have seen ML textscan behaviour that is not much
better; those corner cases are just more concealed.
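
To make the mismatch concrete, here is a minimal sketch of how the format string from the example above splits under the two readings. It is not taken from strread.m or from Matlab; the two regular expressions are simplified assumptions of mine that only cover single-letter conversions without widths or skip flags.

```octave
fmt = '%u8 %u';

% strread-like reading (assumed rule): a conversion is '%' plus one letter;
% anything else that is not whitespace is a literal to match and skip.
strread_tok = regexp (fmt, '%[a-z]|[^%\s]+', 'match');

% textscan-like reading (assumed rule): digits following the letter belong
% to the conversion, so '%u8' is a single uint8 field.
textscan_tok = regexp (fmt, '%[a-z]\d*|[^%\s]+', 'match');

printf ("strread-like:  %s\n", strjoin (strread_tok, " | "));
printf ("textscan-like: %s\n", strjoin (textscan_tok, " | "));
```

In the first reading the '8' ends up as a literal that consumes the 8 in the data, which is why only the two 1's are returned; in the second it is part of the type, so all three numbers are read.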
Imitating this strread behaviour in Octave (which IMO comes close to
bug-for-bug compatibility) goes along the way outlined in my long post
in bug #33875:
- Separating the format string parsing into a separate utility function
in the /private subdir of /io
- Letting textscan, textread and also strread call that function directly
(they need parts of it anyway to, among other things, determine the
number of output args)
- Separating more parts of dev source strread.m into /private utility
functions (exploring file column build-up and matching it to the format;
comment line handling; maybe more)
- Having textread and textscan communicate with strread, and vice versa,
using undocumented parameter/value pairs to convey info and modify behavior.
The latter (communication using undocumented args) is also needed for
properly resuming reading by textscan.
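
As a rough illustration of the first two points, a shared format parser could look something like the sketch below. The function name parse_format_sketch, its return fields and the regular expression are all hypothetical; nothing like it exists in the current sources, and it ignores literals, %[] sets and the other hard cases.

```octave
1;  % not a function file: lets the helper below be defined inside a script

% Hypothetical helper; name, signature and returned fields are assumptions.
function info = parse_format_sketch (fmt)
  % Conversions only: '%', optional '*' (skip), optional width, type letter.
  specs = regexp (fmt, '%\*?\d*[a-z]', 'match');
  skip = strncmp (specs, '%*', 2);   % skipped fields produce no output
  info.specs = specs;
  info.skip = skip;
  info.n_out = sum (! skip);         % number of output arguments
endfunction

info = parse_format_sketch ('%d %*s %f');
printf ("%d conversions, %d outputs\n", numel (info.specs), info.n_out);
```

textscan, textread and strread could each call such a helper first and use n_out to check or size their output arguments before any data is read.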
Splitting up strread would make the code easier to maintain as well. But
it wouldn't solve the fundamental issues of the way Octave's strread
parses files (the biggest headache for %g, %c, and %[] formats).
Admittedly this all looks like, or it just is, prolonged polishing of a
big kludge, but IMO it is doable, would need less time investment, and
could be done faster than rebuilding textscan (-.oct) from the ground up
(though I could be wrong there, of course).
If you think it's a waste of time, just say so; no offense taken.
Dumping my work in favor of a compiled textscan (or oct-file called by
textscan-as-it-stands) isn't a problem for me and is even preferable. I
just needed my patches to get urgent things done. I might still go ahead
fixing the current scripts for myself as long as I see an urgent need
while there is no viable alternative. (the beauty of open source)
I've got little proficiency at C++; I can understand simple existing
code and even fix little things, but creating complete functions is
beyond me.
So fixing the script versions is all I can do.
Philip