chicken-hackers

From: Peter Bex
Subject: Re: [Chicken-hackers] Argvector handling - maybe we could do better at that
Date: Tue, 16 Feb 2016 19:46:58 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Feb 16, 2016 at 07:34:20PM +0100, Jörg F. Wittenberger wrote:
> Hi Folks,
> 
> I see a certain call pattern which I believe (as in "wild guess") could
> be the cause of the strange 30% performance loss I observe for some
> kinds of load while I see an almost 50% performance gain for other jobs.
> 
> Looks like alternating calls to procedures with many arguments (9 and 11 in
> my case) are expensive.  Every second call will not reuse the argument
> vector even though this would be easily possible.

Yeah, that's pretty much expected under the current implementation.
However, it shouldn't be much worse than pre-argvector CHICKENs, which
would consume this much stack space for _every_ procedure call.

Though I guess gcc would use registers for the first N (4? 8?) args,
which means it might use less room in practice.

> If we would allocate argvectors to have at least a compile time defined
> "reasonable" minimum room for X (say 32) words, then we could always
> reuse the argvector if we neither need nor got more than X arguments
> whenever the current code would try to reuse the argument vector simply
> for being big enough.  Otherwise we would do what chicken does now:
> allocate a fresh vector.

That's just kicking the bad behaviour further down the road.

> With the exception of constructors like vector, list, call/cc,
> ##sys#make-structure etc. most calls would end up reusing the very same
> argvector.
> 
> Does this make sense?

Yes, it sort of makes sense.  I think I once suggested a slightly better
alternative: pass the actual size of the argvector plus the argcount.
That would mean that in the alternate 9 and 11 argument calls we allocate
at most 2 vectors, after which we know that it's 11 slots big, even if the
current function takes only 9 arguments.  Then we can re-use it.
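To make the difference concrete, here is a rough counting model of the
alternating 9/11 case.  It is only a sketch, not CHICKEN runtime code:
count_allocations and its track_capacity flag are made up for illustration,
and "allocation" just means "could not reuse the vector we were handed".

```c
#include <stddef.h>

/* Hypothetical model: count how many fresh argvectors are needed for
   `calls` calls that alternate between 9 and 11 arguments.
   track_capacity == 0: a callee only knows its own argument count.
   track_capacity != 0: the real vector size is passed along as well. */
static size_t count_allocations(size_t calls, int track_capacity) {
    size_t known_size = 0;   /* size the current function believes it has */
    size_t allocations = 0;

    for (size_t i = 0; i < calls; i++) {
        size_t argc = (i % 2 == 0) ? 9 : 11;

        if (argc > known_size) {
            /* Vector looks too small: allocate a fresh one. */
            allocations++;
            known_size = argc;
        } else if (!track_capacity) {
            /* Reused, but the callee only remembers its argcount,
               so the extra slots are forgotten again. */
            known_size = argc;
        }
        /* With track_capacity set, known_size stays at the real size. */
    }
    return allocations;
}
```

In this model, passing the size along stops allocating once the larger
vector exists, while the argcount-only variant allocates again on every
call that needs 11 slots.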

> There is another - independent - idea:
> 
> Why actually pass the argvector as an argument of every C call?  There
> can be only one CHICKEN thread per process and I don't see this changing
> any time soon.
> 
> So if we could convince the C compiler to pass the argvector - and, since
> we are changing things anyway, the argument count too - in a global
> *register* variable, then we would need zero allocation for most cps calls.
> 
> 
> Still reasonable?

Yes, very reasonable.  It's about on par with my suggestion of passing
the vector size along.  One advantage of my suggestion is that it
will be on the stack, giving better cache locality.  And it's also easy
to use for different threads, but then again we can do that via thread-
local storage as well, the same way we get a heap per thread etc.
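
For illustration, the thread-local variant could look roughly like the
sketch below.  It assumes a C11 compiler (_Thread_local); the names
tls_argvec, tls_argcount, ARGVEC_SLOTS, sum_args and call_sum2 are
hypothetical and have nothing to do with the actual CHICKEN calling
convention.

```c
#include <stddef.h>

#define ARGVEC_SLOTS 32   /* hypothetical fixed capacity, as in the "X" above */

typedef long word;        /* stand-in for a CHICKEN value slot */

/* One argument vector and one argument count per thread,
   instead of passing them as C function arguments. */
static _Thread_local word   tls_argvec[ARGVEC_SLOTS];
static _Thread_local size_t tls_argcount;

/* Callee: takes no C arguments, reads them from the thread-local vector. */
static word sum_args(void) {
    word total = 0;
    for (size_t i = 0; i < tls_argcount; i++)
        total += tls_argvec[i];
    return total;
}

/* Caller: fills the shared vector in place, then calls.  Nothing is
   allocated as long as the call fits in ARGVEC_SLOTS. */
static word call_sum2(word a, word b) {
    tls_argvec[0] = a;
    tls_argvec[1] = b;
    tls_argcount  = 2;
    return sum_args();
}
```

Calls needing more than ARGVEC_SLOTS arguments would still need a
separate path, which this sketch leaves out.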

This could be interesting to experiment with, but this will break
"everything", because the calling convention would (probably) change
yet again.  Or maybe there's a clever way to avoid that.  In any case,
it's high impact and we shouldn't let that hold up a 4.11 release.

Cheers,
Peter
