From: John Hall
Subject: Re: [pooma-dev] Sparse Engine
Date: Fri, 26 Sep 2003 03:33:01 -0600
Richard:

The idea is to get rid of the loop over materials found in previous iterations of our code. In fact, we simply need to compute cell-average quantities for the mixed cells and then perform a single data-parallel computation over the single mixed-material field, which does the work for all materials at once. So we do a little unstructured work to compute the cell-average quantities, and then we do a data-parallel computation. There are some other advantages that accrue to this unstructured approach, which would allow us to store some information that would normally be too expensive to store, but we won't go into that here.
The complication is that we want to (in the grand tradition of POOMA) hide the underlying complexity of our storage scheme and make things appear beautiful and logically simple. A good example might be a storage scheme for a symmetric matrix that only stores the upper triangle, but that allows you to access any index into the array and internally maps the indices to the correct storage location.
In our example, the index array entry is positive for a pure cell and is simply the material ID for the material contained in that cell. If the index array contains a negative value, then it has traditionally been an index into an unstructured linked list of the mixed cells' data. We can then access this data and compute a cell-average value, which we store in that cell of the multi-material field, and then we perform our data-parallel operations on that multi-material field.
We occasionally need to gather all of the pure and mixed material values of a single material so that we can do a single-material calculation like an EOS evaluation; that is why we want the work array (which we compress/deallocate when we are not using it). So the various views of the data we would take are the multi-material cell-average view, the gathered single-material view, and the overall complicated-storage-scheme view. To get the kind of performance the old code has, we will also need to introduce windowing and activity flags. Basically, we are attempting to throw away any unnecessary computations and minimize the data we are pushing through cache.
The sparse tile layout doesn't have the concept of indirect addressing using an index field. It is simply intended for block-AMR type meshes. If we do AMR it would probably be a completely unstructured problem in which any cell can be refined, rather than a block type. Unfortunately, this again introduces the possibility of all-to-all communications (slow) to find your neighbors, etc.
We have also been dealing with the issue of how best to do masking. I am beginning to think that we need another sparse-storage idea, so that we end up with something equivalent to a block-where in which the data is collected into lists using a test and the computation is done only over that collection, which gets progressively smaller as the number of tests increases. Currently, when using a mask, you end up traversing all of the data, maybe even doing the computation everywhere, and then simply throwing away the result where the mask is not set (either by a conditional or by multiplying by 0.0). Building the list for extremely sparse data can be a huge win. Like I said, the old version of this algorithm is running 20 times faster than the data-parallel version. This is only possible by simply doing less work.
We would also like to have exterior guards on a box containing many very small, logically distinct but shared-memory patches, with no guard cells within the box. Then we could maybe achieve some reasonable compression, and our computations should approach AMR storage schemes without the undesired Gibbs phenomenon that AMR suffers due to poor impedance matching across T-joints.
I should note that we are aware of the issue of not using certain types of dynamically allocated data structures because the guard cell copy scheme might only move the pointer to the data and not the actual data. We are taking this into account.
Hope this helps,
John Hall

On Friday, September 26, 2003, at 02:07 AM, Richard Guenther wrote:
Ok, still trying to understand - this is something like (statically) specifying which cells participate in the computation? Like you would have a usual brick engine in conjunction with a bitfield specifying a mask, and using this in the evaluator loop (of course this would be less memory efficient)? So this would be a cheap way to do this compared to using the sparse tile layout?

Thanks, Richard.

On Fri, 26 Sep 2003, John H. Hall wrote:

Richard:

OK. Here goes. The basic idea is that we have a hierarchical field structure (built using hierarchical engines similar to the current multi-material field abstraction) which has a collection of 1-D dynamicFields (for the sparse unstructured storage), a shared (n-D) integer Index Array (or Field), and a single (n-D) scalar, vector, or tensor field which holds either the data for a pure cell or a cell-average value for a mixed-material cell's data. As the problem evolves, the material interfaces migrate, and so the actual position of the unstructured cells changes. However, all the indirect indexing is still local to the processor (except for normal guard-cell communications), so this is much simpler than a real unstructured problem with all-to-all communications. In the general case, the sparse dynamic fields are only used to compute the cell-average quantities before a data-parallel computation across the single multi-material or cell-average field is performed. We would also like to take some views of the field in which all of the data for a particular material is gathered/scattered to/from a single sparse dynamic work Array that is shared in this hierarchical structure.
Field<Mesh_t, Real, MixedCellEngine> would look like this:

 ___________________
|___________________|  Single-material Gather/Scatter 1-D Dynamic Work Array
                       (both mixed and pure cells)
 ______
|______|  mat A (1-D Dynamic Array/Field)
 ______
|______|  mat B (1-D Dynamic Array/Field)
 ______
|______|  mat C (1-D Dynamic Array/Field)
 _____________________________________________
|_____________________________________________|  Cell Average Quantities (n-D)
 _____________________________________________
|_____________________________________________|  Integer Index Array (n-D)

A single Index Array is shared by all Sparse Fields (e.g. Density, Pressure, etc.). It shares duty between providing the material index for a pure cell and an offset into a collection tracking the unstructured mixed-cell data for a mixed cell. Multi-Patch should still work, although the guard-cell communications might be slightly more complicated. The number of cells which are indirectly addressed is very small (< 5% of the total), so even using compressible brick we are wasting a lot of memory bandwidth and performing numerous extraneous computations. A comparison code using this structure is running 20 times faster than the equivalent data-parallel POOMA R1 computation for the single-processor serial case. We believe we can match that performance by building an engine that encapsulates the sparse nature reflected in the problem and by making more use of the new engines POOMA R2 provides (stencil, etc.). Again, most of the computations are performed on the Cell-Average Quantities, so we just take a view, operator[]?, that returns that single field.

John and Jean

--
Richard Guenther <richard dot guenther at uni-tuebingen dot de>
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/