Re: TCG performance on PPC64
From: Matheus K. Ferst
Subject: Re: TCG performance on PPC64
Date: Thu, 26 May 2022 08:07:07 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0

On 19/05/2022 01:13, David Gibson wrote:
>> What would be different in aarch64 emulation that yields a better
>> performance on our POWER9?
>> - I suppose that aarch64 has more instructions with GVec implementations
>> than PPC64 and s390x, so maybe aarch64 guests can better use host-vector
>> instructions?
>
> As with Richard, I think it's pretty unlikely that this would make
> such a difference. With a pure number crunching vector workload in
> the guest, maybe, with kernel & userspace boot, not really. It might
> be interesting to configure a guest CPU without vector support to
> double check if it makes any difference though.
>
>> - Looking at the flame graphs of each test (attached), I can see that
>> tb_gen_code takes proportionally less time of aarch64 emulation than PPC64
>> and s390x, so it might be that decodetree is faster?
>> - There is more than TCG at play, so perhaps the differences can be better
>> explained by VirtIO performance or something else?
>
> Also seems unlikely to me; I don't really see how this would differ
> enough based on guest type to make the difference we see here.
>
>> Currently, Leandro Lupori is working to improve TLB invalidation[7], Victor
>> Colombo is working to enable hardfpu in some scenarios, and I'm reviewing
>> some older helpers that can use GVec or easily implemented inline. We're
>> also planning to add some Power ISA v3.1 instructions to the TCG backend,
>> but it's probably better to test on hardware if our changes are doing any
>> good, and we don't have access to a POWER10 yet.
>>
>> Are there any other known performance problems for TCG on PPC64 that we
>> should investigate?
>
> Known? I don't think so. The TCG code is pretty old and clunky
> though, so there could be all manner of problems lurking in there.
>
>
> A couple of thoughts:
>
> * I wonder how much emulation of guest side synchronization
> instructions might be a factor here. That's one of the few things
> I can think of where the matchup between host and guest models
> might make a difference.
That's an interesting suggestion; we'll look into it. It seems related
to Nicholas Piggin's recent work, and there is probably more to be done
in this area.
> It might be interesting to try these
> tests with single core guests. Likewise it might be interesting to
> get results with multi-core guests, but MTTCG explicitly disabled.
>
With 50 runs:
+---------+--------------------------------+
| | Host |
| Options +---------------+----------------+
| | PPC64 | x86_64 |
+---------+---------------+----------------+
| -smp 2 | 427.41 ± 7.89 | 350.89 ± 7.62 |
| -smp 1 | 574.01 ± 4.18 | 411.27 ± 17.14 |
| No MTTCG| 588.84 ± 8.50 | 445.30 ± 21.66 |
+---------+---------------+----------------+
The gap with x86 has increased in the two new cases, but I'm not sure
I can draw any conclusion from this result. Maybe it's just SMT vs.
Hyper-Threading that benefits POWER9 in the initial test, or the Xeon is
better at boosting a single core when QEMU uses only one thread.
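
To make "the gap has increased" concrete, here is a small sketch that
computes the PPC64/x86_64 ratio of the means for each row of the table
above. The dictionary layout and the name host_ratio are only
illustrative, and the second value of each pair is the "±" figure as
reported, which the script stores but does not use.

```python
# Values copied from the table above: (mean, "±" figure as reported).
results = {
    "-smp 2":   {"PPC64": (427.41, 7.89), "x86_64": (350.89, 7.62)},
    "-smp 1":   {"PPC64": (574.01, 4.18), "x86_64": (411.27, 17.14)},
    "No MTTCG": {"PPC64": (588.84, 8.50), "x86_64": (445.30, 21.66)},
}


def host_ratio(row):
    """Return how many times slower the PPC64 host mean is than x86_64."""
    ppc_mean, _ = row["PPC64"]
    x86_mean, _ = row["x86_64"]
    return ppc_mean / x86_mean


ratios = {option: host_ratio(row) for option, row in results.items()}

for option, ratio in ratios.items():
    print(f"{option:<9} PPC64/x86_64 = {ratio:.2f}")
```

This gives roughly 1.22 for -smp 2, 1.40 for -smp 1 and 1.32 with MTTCG
disabled, which matches the observation that the gap widens in both new
cases.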
> * It might also be interesting to get CPU time results as well as
> elapsed time. That might indicate whether qemu is doing more
> actual work in the slow cases, or if it's blocking for some
> non-obvious reason.
The results above and in my first email were wall clock time, but I also
have user and system times on a GitHub wiki page:
https://github.com/PPC64/qemu/wiki/TCG-Performance-on-PPC64
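
For anyone who wants to reproduce this kind of comparison, below is a
minimal sketch of one way to collect wall-clock, user and system time
for a single run. It is not necessarily how the numbers on the wiki were
gathered. The measure function is a hypothetical helper, the command is
a placeholder to be replaced with the actual QEMU invocation, and the
resource module is only available on Unix-like hosts.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd to completion and return its wall, user and system time in seconds."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall": wall,
        # CPU time accumulated by waited-for children between the two samples.
        "user": after.ru_utime - before.ru_utime,
        "sys": after.ru_stime - before.ru_stime,
    }


# Placeholder workload; substitute the QEMU command line under test.
timing = measure([sys.executable, "-c", "sum(range(10**6))"])
print("wall {wall:.2f}s  user {user:.2f}s  sys {sys:.2f}s".format(**timing))
```

If user + sys is far below the wall time multiplied by the number of
busy threads, the process is probably waiting on something rather than
doing more work.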
Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst