From: Philip Nienhuis
Subject: [Octave-bug-tracker] [bug #51512] of-io: Missing or wrong types when using xlsread with OCT interface
Date: Fri, 28 Jul 2017 14:10:54 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0 SeaMonkey/2.46

Follow-up Comment #14, bug #51512 (project octave):

Some timings on my Core Duo i5 box (Win 7 Prof., 8 GB RAM & SSD):

>> obj = rand (1000, 500);
>> lbj = num2cell (obj > 0.5);
>> obj = [num2cell(obj) lbj];
>> xlswrite ("test.xlsx", obj);

>> tic; [~, ~, raw] = xlsread ("test.xlsx", 1, "", "oct"); toc
Checking requested interface(s):
Elapsed time is 93.0533 seconds.

>> tic; [~, ~, raw] = xlsread ("test.xlsx", 1, "", "com"); toc
Checking requested interface(s):
COM*; (* = default interface)
Elapsed time is 3.59621 seconds.

>> tic; [~, ~, raw] = xlsread ("test.xlsx", 1, "", "poi"); toc
Checking requested interface(s):
POI (& OOXML)*; (* = default interface)
Elapsed time is 117.655 seconds.

>> tic; [~, ~, raw] = xlsread ("test.xlsx", 1, "", "uno"); toc
Checking requested interface(s):
UNO*; (* = default interface)
:
<snip UNO warning >
:
Elapsed time is 1117.01 seconds.


So COM (with Excel 2013 as the workhorse) is by far the fastest and UNO
(invoking LibreOffice) by far the slowest; OCT is roughly 26 times slower than
COM (93.1 s vs. 3.6 s).
It is somewhat disappointing that proprietary software is so much faster, but
on the bright side, OCT is still the second fastest.

Some profiling:

>> profile on
>> tic; [~, ~, raw] = xlsread ("test.xlsx", 1, "", "oct"); toc
Elapsed time is 93.9684 seconds.
>> profile off
>> profshow
   #         Function Attr     Time (s)   Time (%)        Calls
---------------------------------------------------------------
  71           regexp            63.262      67.43           20
  78       str2double            11.472      12.23            9
  76              cat             4.961       5.29           12
  73          cellfun             3.754       4.00           72
  98          col2num             3.018       3.22      2000002
  85 __OCT_xlsx2oct__             2.368       2.52            1
  64           system             1.224       1.30            2
  83          xls2oct             1.122       1.20            1
 100         num2cell             0.709       0.76            3
  72         cell2mat             0.525       0.56           16
   5           ischar             0.311       0.33      1000047
 102            clear             0.300       0.32            5
  19             cell             0.295       0.31           12
  65            pause             0.200       0.21            1
  82        postfix '             0.081       0.09            8
  66            fread             0.054       0.06            6
  62               cd             0.027       0.03            4
 108        parsecell             0.024       0.03            1
 107          sub2ind             0.011       0.01            1
   6         prefix !             0.009       0.01          161
>> profile clear


I tried preallocating regexp's output array, but that didn't help much.
The profiling results show that regexp accounts for about two thirds (67 %) of
the run time, so tuning the regexp, if at all possible, is probably the best
way to get better performance.
Maybe the modification I made (obtaining the cell addresses and contents in
one regexp) is slower than getting them separately with two regexps, as in the
original __OCT_xlsx2oct__.m.
Note: in this case (a 1000x1000 array) the XML string to be processed is
about 37 MB in size.
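
One way to test the one-regexp vs. two-regexps question in isolation is to
time both on a synthetic string. The following is only a minimal sketch: the
XML layout, the "R..C.." cell addresses and the three patterns are made up for
illustration and are not the ones used in __OCT_xlsx2oct__.m, and the test
string is far smaller than the 37 MB mentioned above.

```octave
% Build a synthetic worksheet-like XML string (hypothetical layout).
nrows = 200;
ncols = 50;
vals = rand (nrows, ncols);
[cc, rr] = meshgrid (1:ncols, 1:nrows);
% sprintf recycles the template, consuming one column (row, col, value) per cell.
xml = sprintf ('<c r="R%dC%d"><v>%.6f</v></c>', [rr(:)'; cc(:)'; vals(:)']);

% Variant 1: one regexp returning address and content as two tokens per match.
tic;
tok = regexp (xml, '<c r="([^"]+)"><v>([^<]*)</v></c>', 'tokens');
t_one = toc;
flat = [tok{:}];            % 1 x 2N cell: addr, val, addr, val, ...
addr1 = flat(1:2:end);
val1 = flat(2:2:end);

% Variant 2: two separate regexps, one for addresses and one for contents.
tic;
addr2 = regexp (xml, '<c r="([^"]+)">', 'tokens');
val2 = regexp (xml, '<v>([^<]*)</v>', 'tokens');
t_two = toc;
addr2 = [addr2{:}];         % flatten the 1x1 token cells into a 1 x N cell
val2 = [val2{:}];

% Both variants should extract exactly the same data.
same = isequal (addr1, addr2) && isequal (val1, val2);
printf ("one regexp: %.3f s, two regexps: %.3f s, same result: %d\n", ...
        t_one, t_two, same);
```

Timings from such a toy string need not carry over to a real worksheet, so the
comparison would have to be repeated on the actual sheet XML before drawing
conclusions.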


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?51512>
