octave-maintainers

Re: C++ version of regexprep.cc


From: David Bateman
Subject: Re: C++ version of regexprep.cc
Date: Tue, 02 May 2006 16:39:27 +0200
User-agent: Thunderbird 1.5 (Windows/20051201)

Paul Kienzle wrote:

On May 2, 2006, at 7:25 AM, David Bateman wrote:

Paul Kienzle wrote:
David,

octave now goes quickly through the regular expression portion of the code.

I haven't yet confirmed that the results are consistent with matlab.

The next portion involves for loops such as the following:

  tag = cell(number_of_tags,4);
  for i=1:number_of_tags
   tag{i,1} = xml(tag_start(i):tag_end(i));
  end

which for 10000 tags is slow.

Are there octave routines for splitting/joining strings into cells
which are fast?

- Paul

Paul,

Hey, I'm on holiday at the moment, and so have a little time. What about the attached implementation of mat2cell? With this you should be able to replace the above code with

tag = cell(number_of_tags,4);
tag{:,1} = mat2cell (xml, 1, tag_end - tag_start);

mat2cell partitions the matrix into cells. The xml2cell code extracts substrings.

The following does what I expect:

    xml='<eh><bee>   <see> deed </see>  </bee></eh>';
    tag_start = find(xml=='<');
    tag_end = find(xml=='>');
    pieces = [ tag_start; tag_end+1 ];
    partition = diff([1;pieces(:);length(xml)+1]);
    tag_name = mat2cell (xml, 1, partition) (2:2:end);

    tags = cell(length(tag_start),4);
    tags(:,1) = tag_name';
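
For comparison, here is a minimal sketch that runs the original loop and the mat2cell partition on the same sample string and checks that they agree. The names by_loop, parts and by_mat2cell are only illustrative and are not from the thread; the partition vector alternates "text between tags" and "tag" lengths, which is why every second cell is kept.

```octave
xml = '<eh><bee>   <see> deed </see>  </bee></eh>';
tag_start = find (xml == '<');
tag_end = find (xml == '>');

% Loop version: one substring extraction per tag.
n = numel (tag_start);
by_loop = cell (n, 1);
for i = 1:n
  by_loop{i} = xml(tag_start(i):tag_end(i));
end

% mat2cell version: the partition lengths must sum to length(xml),
% so the gaps between tags are included and then skipped.
pieces = [tag_start; tag_end + 1];
partition = diff ([1; pieces(:); length(xml) + 1]);
parts = mat2cell (xml, 1, partition);
by_mat2cell = parts(2:2:end)';   % even-numbered cells hold the tags

assert (isequal (by_loop, by_mat2cell));
```

On this input both versions should yield the six tags '<eh>', '<bee>', '<see>', '</see>', '</bee>', '</eh>'.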

Ok, I think this confirms what Tom said.

John, do you want mat2cell committed as well?

mat2cell is a core function and it works as expected, so the answer should be yes.
Yes, but I commit nothing without John's say-so.


Here are a couple of test cases

/*

%!test
%! x = reshape(1:20,5,4);
%! c = mat2cell(x,[3,2],[3,1]);
%! assert(c,{[1,6,11;2,7,12;3,8,13],[16;17;18];[4,9,14;5,10,15],[19;20]})

%!test
%! x = 'abcdefghij';
%! c = mat2cell(x,1,[0,4,2,0,4,0]);
%! assert(c,{'','abcd','ef','','ghij',''})
Yes, but in Matlab size(c{1}) is [1,0]! So it is the test that is in fact wrong. The resize function I recently added might be used here to change the test to

%! assert(c,{resize('',1,0),'abcd','ef',resize('',1,0),'ghij',resize('',1,0)})

Regards
David



*/

The second test is failing because size(c{1}) is [1,0] but size('') is [0,0].

- Paul
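
The size mismatch behind the failing test can be reproduced with a short sketch. It assumes only core functions; empty_row is an illustrative name, and char (zeros (1, 0)) is used here as one way to build a 1x0 string in place of the resize call suggested above.

```octave
x = 'abcdefghij';
c = mat2cell (x, 1, [0, 4, 2, 0, 4, 0]);

% '' is 0x0, but a zero-width slice of a row vector is 1x0.
assert (size (''), [0, 0]);
assert (size (c{1}), [1, 0]);

% Comparing against 1x0 strings makes the expected value match.
empty_row = char (zeros (1, 0));
assert (c, {empty_row, 'abcd', 'ef', empty_row, 'ghij', empty_row});
```

Because assert compares dimensions as well as contents, the expected cell array has to use 1x0 strings rather than '' for the zero-length partitions.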



