octave-maintainers

Re: Parallel access to data created with Octave


From: Jaroslav Hajek
Subject: Re: Parallel access to data created with Octave
Date: Tue, 4 May 2010 07:33:26 +0200

On Mon, May 3, 2010 at 6:53 PM, Jarno Rajahalme <address@hidden> wrote:
> On May 3, 2010, at 8:14 AM, ext Jaroslav Hajek wrote:
>>
>> The problem is that your code may stop working essentially any time
>> because of internal changes, and it may be hard to figure out what is
>> wrong. For instance, in Octave 3.3+, a double real value with
>> is_scalar_type() = true is *not* necessarily an octave_scalar
>> instance.
>>
>
> Thanks for the note. However, in this case I'm extracting values from a 
> structure that I have created myself. I know I could add a couple of checks 
> (outside of the loops) to make sure the structure seems to contain only 
> uint16's (either arrays or scalars).
>

AFAIK it currently works for uint16, but it may still easily stop working.

> For the case where I test for scalar type, it would suffice if numel() on a 
> scalar would not depend on a static initialization of a "static dim_vector 
> dv(1,1); return dv;" within a function dims(). If octave_base_scalar 
> implemented a numel() always returning 1, I would not have to care whether a 
> specific cell is a scalar or not. And it would be faster, too :-).
>

Feel free to do this, but it's just the tip of the iceberg. You simply
shouldn't depend on internals like this.


>> I would suggest you try to hoist as much of the value extraction calls
>> as possible out of the parallelized loops. If it's not possible, you
>> can use #pragma omp critical. Perhaps the pragmas will be inserted in
>> Octave's sources in the future.
>>
>
> In this case doing this would effectively mean copying a multi-megabyte 
> structure from Octave to C, and is not an option, at least for now. I could 
> add a separate initialization function to that effect, but then I would have 
> to worry about calling that each time the Octave structure is changed to keep 
> them in sync.
>

That's not exactly what I meant. Consider your earlier example

 const octave_value snv(su.xelem(0,si));
 const uint16NDArray sna(snv.uint16_array_value());
 const octave_idx_type snl = sna.nelem();
 const octave_uint16 * snp = sna.fortran_vec();

You could do something like:

OCTAVE_LOCAL_BUFFER (uint16NDArray, sna, nsi);

// prepare array values (serial: all octave_value extraction happens here)
for (si = 0; si < nsi; si++)
  sna[si] = su(0,si).uint16_array_value ();

// parallel loop (only touches the arrays prepared above)
#pragma omp parallel for
for (si = 0; si < nsi; si++)
{
 const octave_idx_type snl = sna[si].nelem(); // thread safe
 const octave_uint16 * snp = sna[si].fortran_vec(); // thread safe
}

The sna buffer is an added memory overhead, but it shouldn't be a
problem unless you have lots of small arrays (which is not efficient
in Octave anyway).
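
The same hoisting idea can be sketched without the Octave headers. This
is only an illustration in standard C++: Handle, View and sums_per_array
are hypothetical names, and std::shared_ptr merely stands in for a
reference-counted value. Unlike the snippet above, it keeps raw
pointer/length pairs rather than array copies, so it assumes the
original handles outlive the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for one cell holding a uint16 array.
using Handle = std::shared_ptr<const std::vector<std::uint16_t>>;

// Plain pointer/length pair; reading it touches no reference count.
struct View {
  const std::uint16_t *data;
  std::size_t len;
};

std::vector<unsigned long long> sums_per_array(const std::vector<Handle> &cells)
{
  const long n = static_cast<long>(cells.size());
  std::vector<View> views(cells.size());

  // Serial phase: every access that goes through a handle happens here.
  for (long i = 0; i < n; i++) {
    assert(cells[i] != nullptr);  // assumption: no empty handles
    views[i] = View{cells[i]->data(), cells[i]->size()};
  }

  std::vector<unsigned long long> sums(cells.size(), 0);

  // Parallel phase: only plain pointers and lengths are read, and each
  // iteration writes its own element of sums.
  #pragma omp parallel for
  for (long i = 0; i < n; i++) {
    unsigned long long s = 0;
    for (std::size_t k = 0; k < views[i].len; k++)
      s += views[i].data[k];
    sums[i] = s;
  }
  return sums;
}
```

Without OpenMP enabled the pragma is ignored and the loop simply runs
serially, which gives the same result.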

> Declaring value extraction functions critical would unnecessarily slow down 
> the code. IMO, just inspecting data should not do anything that would not 
> work when parallel.

Unfortunately, at the interpreter level, a lot can happen behind the
scenes just because uint16_array_value is called. Sometimes the
resulting array is a shared copy; sometimes it is created afresh.
Because many things are reference-counted, this often results in race
conditions. Static initializers are also used in several places to
speed things up, in particular in default constructors.
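
To illustrate how a call that looks read-only can still write shared
state, here is a small sketch in standard C++. It is an analogy, not
Octave code: Handle, extract and count_while_extracted are hypothetical
names. Note that std::shared_ptr updates its count atomically, so this
sketch itself is not racy; it only shows that the count changes.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical reference-counted handle to immutable data.
using Handle = std::shared_ptr<const std::vector<std::uint16_t>>;

// Returns the handle by value, loosely like an extraction call that
// hands back a shared copy instead of fresh data.
Handle extract(const Handle &stored) { return stored; }

// Reports the reference count observed while one extracted copy is alive.
long count_while_extracted(const Handle &stored)
{
  Handle copy = extract(stored);  // "just inspecting" bumps the count
  return stored.use_count();
}                                 // copy is destroyed: count drops again
```

If the increment and decrement were not synchronized, two threads doing
this at the same time could corrupt the count.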

> I guess there are NOT that many places where things break down. Maybe one 
> issue is with the reference counts being incremented and decremented, when 
> extracting values, and every now and then there will be a race regarding a 
> reference count: 2 threads read the same value, increment it and then 
> independently decrement it, causing the data to be freed. This could be 
> solved by making reference count manipulation critical, or by enabling data 
> inspection without touching the reference values at all (as I have now done).

Yes, the former is an option, but it has a disadvantage: many of those
locks will be redundant. Still, I think we may do it in the future. The
latter, as explained, is extremely quirky given the very
interchangeable and dynamic nature of octave_values.
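
As a rough sketch of the first option, a counter can be protected with
std::atomic. This swaps in atomics where the discussion above talks
about critical sections, and RefCount and stress are hypothetical names,
not Octave's classes.

```cpp
#include <atomic>

// Hypothetical intrusive reference count; starts at 1 for the owner.
class RefCount {
public:
  RefCount() : count_(1) {}
  void acquire() { count_.fetch_add(1, std::memory_order_relaxed); }
  // Returns true when the caller dropped the last reference.
  bool release() { return count_.fetch_sub(1, std::memory_order_acq_rel) == 1; }
  int value() const { return count_.load(std::memory_order_acquire); }
private:
  std::atomic<int> count_;
};

// Each iteration takes and drops one reference. Returns how many times
// the count hit zero, which must not happen while the owner holds its
// own reference.
int stress(RefCount &rc, int iterations)
{
  int premature_frees = 0;
  #pragma omp parallel for reduction(+:premature_frees)
  for (int i = 0; i < iterations; i++) {
    rc.acquire();
    if (rc.release())
      premature_frees++;
  }
  return premature_frees;
}
```

The cost mentioned above still applies: every acquire and release pays
for the atomic operation, even in code that never runs in parallel.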

regards

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


