
Re: [Gomp-discuss] Somethings to think about ....


From: Lars Segerlund
Subject: Re: [Gomp-discuss] Somethings to think about ....
Date: Mon, 10 Mar 2003 15:53:24 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.2.1) Gecko/20021226 Debian/1.2.1-9


Steven Bosscher wrote:
Hi Lars, all,


On Mon 10-03-2003, at 13:44, Lars Segerlund wrote:

I have also been looking at the Linux support for SMP and NUMA (which was added recently). Linux does support affinity and NUMA in the latest kernels; however, to take maximum advantage of this it would be quite reasonable to do a native port to Linux (using clone() instead of any thread library) and implement only the synchronization elements needed by OpenMP.


I think our goal should be to make GCC concurrency-aware and use it for
OpenMP with threads as a first application.  Everything else
(autoparallelization, NUMA-awareness, grid computing, the construction
of HAL, what else?) is beyond the scope of this project.

If you want to make stuff linux-specific, you'd have to stuff it down
the throats of the GCC community with force and violence to make them
accept your contribution and like it.


I didn't want to make stuff Linux-specific; I wanted to keep open the possibility of adding machine-specific implementations later for efficiency. I do realize that it might have sounded 'bad'.

Besides, load balancing, scheduler affinity and other low-level SMP/NUMA
stuff is the kind of thing a kernel is responsible for.  OpenMP is
not designed for such purposes.

(You can distribute tasks over clusters of CPUs with HPF2 (DISTRIBUTE,
BLOCK, etc.), that gives you some control of how your job will run on a
NUMA machine.  But it's not very portable and as a developer you need to
know all the ins and outs of the machine you're targeting.  I suppose
this explains why I've seen only a few HPF applications that use this
feature...)
Still, the first thing to do is to get OpenMP running with a threading library, and perhaps (if SMP-safe) a semaphore library.

As for the tasks ahead, I think it's not too hard to use the framework in the paper to target the GENERIC trees (which is the most reasonable form to target, IMHO). The algorithms for a rather good implementation all seem to be there, and the nice part is that if we extend the pragma handling and add a -fgomp flag to gcc, we should be able to leave most of the regular stuff in place.


I would prefer -fopenmp :-)


 Why do you come up with all the good names :-) ... I do agree.



I do, however, have a question: I know gcc supports barriers, but to what extent and in what context?


What do you mean by "GCC supports barriers"?

The only barriers I know of in GCC are BARRIER insns in RTL.  In that
context, a BARRIER is basically just a marker for the end of a code
block (e.g. after an unconditional jump_insn).  In other words, it
states: "Control flow ends before this".  It is used, among other
things, for code alignment (i.e. the insn following the BARRIER can
be aligned).

This has nothing to do with the barriers they talk about in the OpenMP
specs; those synchronize threads.


This I am aware of; however, I did not know what they were used for in gcc, thus the question about them. I also wasn't aware that they live at the RTL level (which makes them uninteresting for us).



As far as I understand it, gcc supports barriers that prevent sections of code from being handled together (thus enforcing separate optimization). I'm still looking, but does anybody know if this is correct?


Well, barriers really are just markers for places where there is one and
only one out-edge in the control flow graph.  That does not necessarily
imply that all the optimizers stop there.

For example, after expanding trees to RTL, you'll see that the dump file
is littered with BARRIERs all over, but after some basic flow graph
optimizations (jump!), most of them are gone.  And if the single edge
before a barrier is a back-edge, the loop optimizers use it to
identify loops.  And isn't crossjumping all barriers?  In these cases,
barriers _allow_ the compiler to identify optimization opportunities!


Thanks, I have looked at this a bit now, and while I don't claim to understand it fully, I see what they do. I'm still desperately looking for a mechanism already in place to restrict optimizer scope, but I figure we will have to make the optimizers parallel-aware instead.



I thought that we might as well start documenting what we want to do with gcc, the trees and what we have to modify.


Do you have a plan we can discuss on some mailing list?

Greetz
Steven


 As for a plan, do you mean something concrete?

I would then suggest that we investigate what is needed to enhance GENERIC enough to support the form in Diego's paper, since there is a set of algorithms to support this.

I was more thinking that we could have a discussion about what the plan should be :-) ... since we don't have a plan yet.

 Basically, I think along these lines:

1. The library is trivial to do, and a stub may well be enough to let other areas of work progress.

2. The tree modifications are not that hard, but they have to be carefully planned in order to be efficient and extensible. Still, they have to be done.

3. The algorithms used for the concurrency can be tested on the trees once these are done, without a proper front end and back end; this might even be a very nice way to get some proper testing done. If this phase is basically bug-free, I think a lot of later work is spared.

4. At this point it should be about time to figure out how to interface with gcc in the most 'non-intrusive' manner. As I understand it, there are two routes: the first is to make gcc ignore (remove) the parallel parts of the tree when -fopenmp is not given; the second is to enable the extra 'concurrency-aware' code when -fopenmp is given (or replace parts of gcc with concurrency-aware code). (I don't know if my meaning gets through, but it's basically a tightly knit implementation vs. a loosely knit one.)

5. When this is done, it would be reasonable to start on the code generation. (I haven't given this any thought yet.)

6. Front-end work making gcc take advantage of the parallel trees should be the last thing needed to get the compiler working, and at that point we should have a working implementation.

So I should think we would need a specification of what to do with the trees and what we need to represent; from there we only have to code a lot. ( :-D ).

 / Lars Segerlund.




