freepooma-devel

Re: [pooma-dev] Re: InlineEvaluator implementation question


From: Jeffrey Oldham
Subject: Re: [pooma-dev] Re: InlineEvaluator implementation question
Date: Mon, 16 Dec 2002 11:42:27 -0800
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Richard Guenther wrote:
On Mon, 16 Dec 2002, Mark Mitchell wrote:


--On Monday, December 16, 2002 04:55:52 PM +0100 Richard Guenther
<address@hidden> wrote:


Hi!

Does anyone remember why we create copies of the LHS and RHS inside
the KernelEvaluator<InlineKernelTag>::evaluate() methods
(ReductionEvaluator<InlineKernelTag>::evaluate() contains similar code)?
That is, there is code like

 template<class LHS, class Op, class RHS, class Domain>
 inline static void evaluate(const LHS& lhs, const Op& op, const RHS& rhs,
                             const Domain& domain, WrappedInt<1>)
 {
   CTAssert(Domain::unitStride);
   PAssert(domain[0].first() == 0);
   // Local copies of both operands; these are the copies in question.
   LHS localLHS(lhs);
   RHS localRHS(rhs);
   int e0 = domain[0].length();
   for (int i0 = 0; i0 < e0; ++i0)
     op(localLHS(i0), localRHS.read(i0));
 }

I'm pretty sure that this copy allowed some C++ compilers (KCC) to see
that some parts of lhs/rhs were loop-invariant, and then hoist references
to those fields out of the loop.  (The compiler can see that nothing can
modify localLHS; it's less obvious to it that nothing can modify rhs
since it doesn't know what else might point to that location.)
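
The hoisting argument above can be illustrated with a minimal sketch. The View type and both function names are hypothetical stand-ins, not POOMA code, and whether a particular compiler actually generates different code for the two loops depends on its alias analysis and inlining.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an engine: a pointer plus a stride.
// The element type and the stride are both int, so a store through
// data may legally alias a stride member.
struct View {
  int* data;
  int stride;
  int& operator()(int i) const { return data[i * stride]; }
  int read(int i) const { return data[i * stride]; }
};

// Works through the references. As far as the compiler can tell, the
// store in the loop body might overwrite lhs.stride or rhs.stride, so it
// may have to reload those fields on every iteration.
inline void assignByReference(const View& lhs, const View& rhs, int n) {
  for (int i = 0; i < n; ++i)
    lhs(i) = rhs.read(i);
}

// Copies the operands first. The addresses of the locals never escape,
// so no store in the loop can reach them and their fields are visibly
// loop-invariant.
inline void assignWithLocalCopies(const View& lhs, const View& rhs, int n) {
  View localLHS(lhs);
  View localRHS(rhs);
  for (int i = 0; i < n; ++i)
    localLHS(i) = localRHS.read(i);
}
```

Both functions produce the same result whenever the data does not overlap the View objects themselves; the copy only changes what the optimizer is able to prove.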


Hmm, since both lhs and rhs are declared const, isn't that enough to tell
the compiler? Or does the compiler have to assume that every function call
can have a side effect on any non-local variable?
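
On the const question, a small sketch (the Counter type and function names are hypothetical, chosen only for illustration) shows why a const reference alone does not make a value loop-invariant: const forbids writes through that particular reference, but the same object may still be changed through another name.

```cpp
// Hypothetical type used only for this illustration.
struct Counter {
  int value;
};

// c is a const reference, yet c.value changes whenever alias refers to
// the same object, so the compiler must reload it on each iteration.
inline int sumThroughReference(const Counter& c, Counter& alias, int n) {
  int total = 0;
  for (int i = 0; i < n; ++i) {
    total += c.value;
    ++alias.value;
  }
  return total;
}

// With a local copy, nothing else can name the object being read, so its
// value can be kept in a register for the whole loop.
inline int sumThroughLocalCopy(const Counter& c, Counter& alias, int n) {
  const Counter local(c);
  int total = 0;
  for (int i = 0; i < n; ++i) {
    total += local.value;
    ++alias.value;
  }
  return total;
}
```

When both arguments refer to the same object the two functions return different sums, which is exactly why the compiler cannot make this transformation on its own; the programmer can only introduce the copy when it is known that the results will not change.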

Well, at least gcc creates worse (larger) code with copying than without.

Using the copies measurably reduced execution time in previous experiments.

Thanks,
Jeffrey D. Oldham
address@hidden
