freepooma-devel

Re: [pooma-dev] performance concern with Mappers


From: James Crotinger
Subject: Re: [pooma-dev] performance concern with Mappers
Date: Fri, 30 Mar 2001 10:24:48 -0700
User-agent: Microsoft-Outlook-Express-Macintosh-Edition/5.02.2022

on 3/30/01 4:00 AM, Allan Stokes at address@hidden wrote:
> What I found interesting is that there does not appear to be any kind of
> Context data structure which can be informed about what work units (nodes)
> have been assigned into each context.  Without having this structure as a
> reference point, it's not clear to me how the system can collect information
> about the effectiveness of the different mapping heuristics.

I'm not sure what you mean by "node" here. Do you mean "patch"? Or do you
want run-time monitoring of the actual amount of work done?

My own view of Pooma's mappers and partitioners is that the ones provided
are relatively simple examples. None of our stuff knows about NUMA topology
or about the topology of the equations being solved. My hope is that users
can write these to support their hardware and application. I think there is
a question as to whether or not we provide enough services for users to do
this effectively. I don't know. Until someone proposes and tries to write
something more complicated, it is a bit difficult to predict what will be
needed. 

> A second consequence of not having a Context data structure is that each
> mapping becomes an independent event.  If a context is already under a
> larger than average work assignment there is no way to discover or avert
> this as additional slabs of data are sliced and diced.
> 
> My observation about the Mapper implementation is that a built-in bias
> exists which causes low numbered contexts to receive more than their fair
> share of node assignments.

For the simple examples that we've looked at, one tends to want all of the
arrays to have the same partitioning and mapping. If this is the case, then
there is no particular benefit to assigning the remainder patches to random
contexts. If, however, one had two relatively independent calculations going
on then there would be some advantage to having their remainder patches go
to different contexts. Again, I only think of our mappers as simple examples
so I guess I'm not too worried about this.

I think that what we should do here is develop custom mappers and
partitioners for these kernel programs. That'll give us a chance to see
whether they make much difference and to see what functionality the
underlying Pooma framework should be providing to support such mappers.

> I suspect that higher dimensionality objects might compound this issue.
> Suppose we have a three dimensional array of size (101,101,101) and we
> partition each dimension into ten segments.  If this works the way I think
> it does, this process would create 1000 nodes (guard cells ignored).  Most
> of the nodes would be 10x10x10 in size.  But there would also be one node
> 11x11x11 in size.  This is a fairly substantial 33% extra work if the
> largest node were rate limiting.  If the partitioning logic has the same
> bias (putting the larger nodes first in the mapper lists) these two effects
> could be compounding negatively.

My only comment is that these are pretty small patches. If instead you're
looking at (1001,1001,1001) then the extra work for the last patch is pretty
small.

I do see your point about the two effects compounding. That should be fixed
in the simple partitioner/mappers.

> One of the properties in the algorithms I've inspected is that the Node list
> is partitioned among the available contexts in contiguous segments.  (Does
> this matter?) 

I think it does matter - if I remember correctly it simplifies the patch
lookup. 

> My second suggestion is that context-to-node bindings be made through some
> kind of function call which could, at some point in the future, also notify
> a global Context data structure about Node assignments accrued.  In the
> meantime, one could at least toss a trace statement at such a control point
> to determine, for example, whether the Mapper issues I've identified are a
> practical concern.

Are you simply proposing that instead of

    templist[i]->context() = c;

we do something like

    Pooma::setcontext(templist[i]->context(), c);

or do we need to pass more info than this to setcontext? I would think that
analysis might also need to know the domain size, affinity, etc.

  Jim
