Re: Importing large amounts of data
From: Francesco Potortì
Subject: Re: Importing large amounts of data
Date: Tue, 19 Jun 2012 09:47:50 +0200
>Quoting "Francesco Potortì" <address@hidden>:
>
>>> Right now I am trying to do this with a 150x150x1000 int array. This
>>> array has a small memory footprint in C++ and the file being pushed from
>>> the C++ program to the octave script is around 65MB.
>>
>> Those are 22.5e6 elements. If you are using a binary representation
>> with 4-byte integers, you should have a 90-MB file. If you use 16-bit
>> integers, half that measure. 65 MB, if there are no errors, indicates
>> that you are using a text representation, which is good and easy to
>> debug for your case, but may become slow if you are planning to use much
>> bigger arrays.
>>> When reading this
>>> into Octave it already consumes 8GB of RAM, which is quite a surprise,
>>
>> Octave uses 8-byte floats by default, but it can read and write 1, 2, 4
>> and 8-byte integers. Even when using the default, your array should
>> consume around 180 MB. If you see 8 GB, something is going wrong.
>>
>>> but not the main problem (I have memory to spare right now). However the
>>> reshaping has already been going on for two days now on a multi-CPU Xeon server.
>>
>> This too is strange. Should be in the order of a few seconds at most.
>>
>>> What's going wrong? How should I approach this to get it done?
>>
>> Tell us exactly what format you are using for writing the file (an
>> example with a small array will suffice) and what commands exactly you
>> use for reading it in. For example, try with a 2x2x3 array first.
>
>Well, like I said, I am writing this as code, that the Octave script
>imports using source().
>
>I have attached a compressed file that is 2.5MB unpacked. The big
>array in this file has dimensions 150x150x55 (roughly). This file
>takes about 20s to import on my Workstation.
>If I export a larger array, like I said, it gets really slow.
Hm. First, I had asked for a 2x2x3 example :)
Anyway, this takes about three seconds on my PC, but that may depend on
hardware (this is an AMD Phenom II X4 965):
octave> tic;example;toc
Elapsed time is 3.1 seconds.
octave> whos
Variables in the current scope:

   Attr Name          Size          Bytes  Class
   ==== ====          ====          =====  =====
        fam          19x19x55      158840  double
        families      1x144          1152  double
        indiv       144x144x55    9123840  double
Anyway, this is not the best way to load data into Octave. It is much
better to write a file containing just the data you need, in text format
for ease of debugging or in binary format for speed. That said, there is
still no explanation for the huge memory consumption you reported. Try
this:
clear all
source <your file>
whos
and send the output here.
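
As a rough sketch of the binary-file option mentioned above, the C++ side
could dump the array as raw 32-bit integers with the first index varying
fastest, which is the order Octave's reshape expects. The function name
write_int32_array, the accessor get(i, j, k) and the file name data.bin
are made up for illustration; they are not taken from your program, and
the sketch assumes both ends run on machines with the same byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original program): writes an
// nx x ny x nz array as raw 32-bit integers, first index varying
// fastest (column-major), with no header.
// get(i, j, k) stands in for however the C++ program stores its data.
template <typename Get>
bool write_int32_array(const std::string& path,
                       std::size_t nx, std::size_t ny, std::size_t nz,
                       Get get)
{
    std::vector<std::int32_t> buf;
    buf.reserve(nx * ny * nz);
    for (std::size_t k = 0; k < nz; ++k)
        for (std::size_t j = 0; j < ny; ++j)
            for (std::size_t i = 0; i < nx; ++i)
                buf.push_back(get(i, j, k));

    std::ofstream out(path, std::ios::binary);
    if (!out)
        return false;
    out.write(reinterpret_cast<const char*>(buf.data()),
              static_cast<std::streamsize>(buf.size() * sizeof(std::int32_t)));
    return static_cast<bool>(out);
}

// Reading it back in Octave (file name and sizes are placeholders):
//   fid = fopen("data.bin", "r");
//   a = fread(fid, Inf, "int32=>int32");
//   fclose(fid);
//   a = reshape(a, 2, 2, 3);
```

For the 2x2x3 test case this gives a 48-byte file; a 150x150x1000 array
would take the 90 MB computed earlier in the thread. Keeping the values
as int32 on the Octave side ("int32=>int32") avoids the conversion to
8-byte doubles.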
--
Francesco Potortì (ricercatore) Voice: +39.050.315.3058 (op.2111)
ISTI - Area della ricerca CNR Mobile: +39.348.8283.107
via G. Moruzzi 1, I-56124 Pisa Fax: +39.050.315.2040
(entrance 20, 1st floor, room C71) Web: http://fly.isti.cnr.it