Re: [Qemu-devel] [PATCH 00/22] tcg: tb_lock removal
From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH 00/22] tcg: tb_lock removal
Date: Fri, 25 Aug 2017 19:23:02 -0400
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Aug 07, 2017 at 19:52:16 -0400, Emilio G. Cota wrote:
> This series applies on top of the "multiple TCG contexts" series, v4:
> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg06769.html
>
> Highlights:
>
> - First, fix a few typos I encountered while working on this (patches 1-3).
> I could send them separately to qemu-trivial if you prefer.
> - QHT: use a proper cmp function, instead of just checking pointer values
> to determine equality of matches.
> - Use a binary search tree for each TCG region.
> - Make l1_map lockless by using cmpxchg
> - Introduce page locks (for !user-mode), so that tb_lock is not
> needed when operating on a page
> - Introduce page_collection, to lock a range of pages
> - Introduce tb->jmp_lock to protect TB jump lists.
> - Remove tb_lock. User-mode uses just mmap_lock and tb->jmp_lock's;
> !user-mode uses the same jump locks as well as page locks.
>
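To illustrate the "lockless by using cmpxchg" item in the list above, here is a minimal sketch of lazily populating one level of a two-level page table without a lock. It is not the code from the patch: the names l1_map_sketch and page_find_alloc_sketch, the PageDesc layout and the table sizes are all hypothetical, and it uses plain C11 atomics rather than QEMU's own atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define L2_BITS 10
#define L2_SIZE (1u << L2_BITS)
#define L1_SIZE 1024u

/* Hypothetical stand-in for a per-page descriptor. */
typedef struct PageDesc {
    void *first_tb;
} PageDesc;

/* Top level: each slot points to a lazily allocated leaf table. */
static _Atomic(PageDesc *) l1_map_sketch[L1_SIZE];

/*
 * Return the descriptor for page 'index', allocating its leaf table on
 * demand. No lock is taken: racing allocators settle the winner with a
 * single compare-and-swap on the top-level slot.
 */
static PageDesc *page_find_alloc_sketch(uint32_t index)
{
    assert((index >> L2_BITS) < L1_SIZE);

    _Atomic(PageDesc *) *slot = &l1_map_sketch[index >> L2_BITS];
    PageDesc *leaf = atomic_load_explicit(slot, memory_order_acquire);

    if (leaf == NULL) {
        PageDesc *fresh = calloc(L2_SIZE, sizeof(*fresh));
        if (fresh == NULL) {
            return NULL;
        }

        PageDesc *expected = NULL;
        if (atomic_compare_exchange_strong_explicit(slot, &expected, fresh,
                                                    memory_order_acq_rel,
                                                    memory_order_acquire)) {
            leaf = fresh;       /* we published our table */
        } else {
            free(fresh);        /* another thread won the race */
            leaf = expected;    /* on failure, holds the winner's table */
        }
    }
    return &leaf[index & (L2_SIZE - 1)];
}
```

The loser of a race frees its allocation and adopts the winner's table, so every caller ends up with the same descriptor for a given page.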
> Performance numbers are in patch 22. We get nice speedups, but I still
> see a lot of idling when booting many cores. I suspect it comes from
> cross-CPU events (e.g. TLB invalidations), but I need to profile it
> better (perf is not good for this; mutrace doesn't quite work). But
> anyway that's for another patchset.

The idling is due to BQL contention related to interrupt handling. In the
case of ARM, this boils down to the GICv3 code being single-threaded.
I don't have time right now to make it multi-threaded, but at least we
know where the scalability bottleneck is.

BTW if there's interest I can submit the lock profiler to the list. The code
is in this branch:
https://github.com/cota/qemu/tree/lock-profiler
The first commit has sample output: https://github.com/cota/qemu/commit/c5bda634

Also, any feedback on the parent (tb_lock removal) patchset would be
appreciated.

To make the 2.11 merge easier, I rebased this patchset (as well as the
multi-tcg-v4 set it is based on) on top of rth's tcg-generic-15, fixing a good
bunch of annoying conflicts. The resulting branch is available at:
https://github.com/cota/qemu/tree/tcg-generic-15%2Bmulti-tcg-v4-parallel

Thanks,

Emilio