Re: [Qemu-devel] [PATCH 0/3] target-arm: cache tbflags in CPUARMState
From: Laurent Desnogues
Subject: Re: [Qemu-devel] [PATCH 0/3] target-arm: cache tbflags in CPUARMState
Date: Mon, 26 Sep 2016 13:13:41 +0200
On Mon, Sep 26, 2016 at 12:04 PM, Laurent Desnogues
<address@hidden> wrote:
> Hello,
>
> On Wed, Sep 14, 2016 at 11:56 AM, Paolo Bonzini <address@hidden> wrote:
>> Computing TranslationBlock flags is pretty expensive on ARM, especially
>> 32-bit. Because tbflags are computed on every tb lookup, it is not
>> unlikely to see cpu_get_tb_cpu_state close to the top of the profile
>> now that QHT makes the hash table much more efficient.
>>
>> However, most tbflags only change when the EL is switched or after
>> MSR instructions. Based on this observation, this series caches these
>> tbflags in CPUARMState, resulting in a 10-15% speedup on 32-bit code.
>
> I like that patch!
>
> I quickly tested with some softmmu images on both AArch32 and AArch64
> and I can confirm the speedup.
>
> As far as your patch goes:
>
> Tested-by: Laurent Desnogues <address@hidden>
> Reviewed-by: Laurent Desnogues <address@hidden>
>
> Thanks,
>
> Laurent
>
> PS - BTW, I couldn't run any user mode program since they segfault on
> mainline for some reason I have no time to look into. The v2.7.0 tag
> works.
It turned out the user mode segfaults were a mistake on my side.
I ran one SPEC2k test with the patch in user mode and saw an
improvement of a few percent for both AArch32 and AArch64.
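
For anyone skimming the thread, here is a minimal sketch of the caching
idea described in the cover letter. It is not the code from the series:
ToyCPUState, the toy_* functions and the FLAG_* bits are all made up for
illustration, and the split between cached and per-lookup bits is only
an assumption about how such a scheme could look.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified CPU state; not QEMU's CPUARMState. */
typedef struct {
    unsigned el;            /* current exception level, 0..3 */
    bool fp_enabled;        /* stands in for state changed by MSR writes */
    bool thumb;             /* assumed to change often, so never cached */
    uint32_t cached_flags;  /* slow-changing flag bits */
    unsigned rebuild_count; /* instrumentation for this example only */
} ToyCPUState;

/* Hypothetical flag layout. */
enum {
    FLAG_EL_MASK = 0x3,     /* bits 0-1: exception level */
    FLAG_FP_EN   = 1 << 2,
    FLAG_THUMB   = 1 << 3,
};

/* Slow path: recompute the bits that rarely change. */
void toy_rebuild_cached_flags(ToyCPUState *env)
{
    assert(env->el <= 3);
    uint32_t flags = env->el & FLAG_EL_MASK;
    if (env->fp_enabled) {
        flags |= FLAG_FP_EN;
    }
    env->cached_flags = flags;
    env->rebuild_count++;
}

/* Events that invalidate the cache refresh it immediately. */
void toy_set_el(ToyCPUState *env, unsigned el)
{
    env->el = el;
    toy_rebuild_cached_flags(env);
}

void toy_write_fp_enable(ToyCPUState *env, bool on)
{
    env->fp_enabled = on;
    toy_rebuild_cached_flags(env);
}

/* Hot path, run on every TB lookup: cached bits plus the dynamic ones. */
uint32_t toy_get_tb_flags(const ToyCPUState *env)
{
    uint32_t flags = env->cached_flags;
    if (env->thumb) {
        flags |= FLAG_THUMB;
    }
    return flags;
}
```

The point of the split is that the lookup path only does a load and an
OR, while the more expensive recomputation runs once per EL switch or
system register write instead of once per lookup.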
Thanks,
Laurent
>
>> Paolo
>>
>> Paolo Bonzini (3):
>> target-arm: introduce cpu_dynamic_tb_cpu_flags
>> target-arm: add env->tbflags
>> target-arm: cache most tbflags
>>
>> target-arm/cpu.c | 2 ++
>> target-arm/cpu.h | 58 ++++++++++++++++++++++++++++++++--------------
>> target-arm/helper.c | 2 ++
>> target-arm/helper.h | 1 +
>> target-arm/op_helper.c | 7 ++++++
>> target-arm/translate-a64.c | 4 ++++
>> target-arm/translate.c | 12 ++++++++--
>> target-arm/translate.h | 1 +
>> 8 files changed, 68 insertions(+), 19 deletions(-)
>>
>> --
>> 2.7.4
>>
>>