Re: [Libunwind-devel] Another optimisation for x86-64 fast trace
From: Lassi Tuura
Subject: Re: [Libunwind-devel] Another optimisation for x86-64 fast trace
Date: Wed, 30 Mar 2011 17:05:22 +0200
Hi,
> Here's one more small performance patch for x86-64 fast trace: a slightly
> lighter getcontext.
For completeness, I should mention that I also tested with ".p2align 2"
(4-byte alignment) and ".p2align 4" (16-byte alignment) placed right before
".global _Ux86_64_getcontext_trace". The results became slightly erratic, but
curiously all the aligned versions were slightly yet systematically slower
than the unaligned one (by ~1-2%).
With the patch as posted, the function is definitely unaligned: in my case it
sits at offset 0x4e09 into the shared library.
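
As a quick check of that arithmetic, here is a minimal Python sketch. The
helper names are illustrative only and are not part of libunwind. It assumes
the library is mapped at a page-aligned base address, so that the offset's
alignment matches the alignment of the runtime address for the small
boundaries discussed here.

```python
def alignment_of(offset: int) -> int:
    """Return the largest power of two that divides a non-zero offset."""
    if offset <= 0:
        raise ValueError("offset must be a positive integer")
    # Isolating the lowest set bit gives the natural alignment in bytes.
    return offset & -offset


def padding_to(offset: int, p2: int) -> int:
    """Bytes of padding needed to reach the next 2**p2 boundary,
    which is what a '.p2align p2' directive would have to insert."""
    boundary = 1 << p2
    return (-offset) % boundary


if __name__ == "__main__":
    offset = 0x4E09  # function offset reported in this message
    print(f"natural alignment: {alignment_of(offset)} byte(s)")
    for p2 in (2, 4):  # the two .p2align values that were tested
        pad = padding_to(offset, p2)
        print(f".p2align {p2}: {pad} padding byte(s), "
              f"function would start at {offset + pad:#x}")
```

Under those assumptions the offset is only 1-byte aligned, and the two
directives would move the entry point by 3 and 7 bytes respectively. That
shift changes where the function falls within a cache line, but the sketch
does not by itself explain the measured slowdown.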
I wonder whether I have started hitting cache-collision-type effects, and
whether the results are becoming sensitive to the exact tests I am using. I'd
be interested to hear what others see, provided anyone else cares about this
level of detail.
Regards,
Lassi