Re: [Qemu-devel] RFC Multi-threaded TCG design document


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] RFC Multi-threaded TCG design document
Date: Wed, 17 Jun 2015 17:57:16 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

* Alex Bennée (address@hidden) wrote:
> Hi,

> Shared Data Structures
> ======================
> 
> Global TCG State
> ----------------
> 
> We need to protect the entire code generation cycle including any post
> generation patching of the translated code. This also implies a shared
> translation buffer which contains code running on all cores. Any
> execution path that comes to the main run loop will need to hold a
> mutex for code generation. This also includes times when we need to flush
> code or jumps from the tb_cache.
> 
> DESIGN REQUIREMENT: Add locking around all code generation, patching
> and jump cache modification
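
To make the requirement above concrete, here is a minimal sketch of one
mutex serialising lookup, generation and flushing. Every name in it
(DemoTB, codegen_lock, demo_tb_find_or_gen, demo_tb_flush) is hypothetical
and not taken from the QEMU tree, and the "generation" step is a stand-in
for the real translator.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_TB_CACHE_SIZE 64   /* hypothetical size, illustration only */

/* Hypothetical stand-in for a translated block. */
typedef struct {
    unsigned long guest_pc;
    int valid;
} DemoTB;

static pthread_mutex_t codegen_lock = PTHREAD_MUTEX_INITIALIZER;
static DemoTB demo_tb_cache[DEMO_TB_CACHE_SIZE];
static int demo_tb_generated;   /* counts translations, for visibility */

/* Look up a block and translate it if missing, all under the lock. */
DemoTB *demo_tb_find_or_gen(unsigned long pc)
{
    pthread_mutex_lock(&codegen_lock);
    DemoTB *tb = &demo_tb_cache[pc % DEMO_TB_CACHE_SIZE];
    if (!tb->valid || tb->guest_pc != pc) {
        tb->guest_pc = pc;      /* placeholder for real code generation */
        tb->valid = 1;
        demo_tb_generated++;
    }
    assert(tb->valid && tb->guest_pc == pc);
    pthread_mutex_unlock(&codegen_lock);
    /* A real design must also keep tb alive while it is executed. */
    return tb;
}

/* Flush every cached block; takes the same lock as generation. */
void demo_tb_flush(void)
{
    pthread_mutex_lock(&codegen_lock);
    for (size_t i = 0; i < DEMO_TB_CACHE_SIZE; i++) {
        demo_tb_cache[i].valid = 0;
    }
    pthread_mutex_unlock(&codegen_lock);
}
```

The cost of this approach is that every core contends on the same lock
whenever it misses in the cache.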

I don't think you need a shared translation buffer between cores to do
this, although it *might* be the easiest way.
You could have a per-core translation buffer; the only requirement is
that most invalidation operations happen on all the buffers
(although that might depend on the emulated architecture).
With a per-core translation buffer, each core could generate new translations
without locking the other cores, as long as no one is doing invalidations.
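
As a rough sketch of that per-core idea, assuming a reader-writer lock is
an acceptable way to express "no one is doing invalidations": each core
takes the lock shared while it adds to its own buffer, and an invalidation
takes it exclusively and clears every buffer. All names here
(DemoCoreBuffer, inval_lock, demo_core_translate, demo_invalidate_all) are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define DEMO_NR_CORES 4
#define DEMO_TBS_PER_CORE 32

/* Hypothetical per-core translation buffer: a list of translated PCs. */
typedef struct {
    unsigned long guest_pc[DEMO_TBS_PER_CORE];
    int used;
} DemoCoreBuffer;

static DemoCoreBuffer demo_core_buf[DEMO_NR_CORES];
static pthread_rwlock_t inval_lock = PTHREAD_RWLOCK_INITIALIZER;

/*
 * Add a translation to one core's buffer. Only the owning core's thread
 * may call this for a given core, so a shared (read) lock is enough:
 * cores never write to each other's buffers.
 * Returns the slot used, or -1 if the buffer is full.
 */
int demo_core_translate(int core, unsigned long pc)
{
    assert(core >= 0 && core < DEMO_NR_CORES);
    int slot = -1;
    pthread_rwlock_rdlock(&inval_lock);
    DemoCoreBuffer *b = &demo_core_buf[core];
    if (b->used < DEMO_TBS_PER_CORE) {
        slot = b->used++;
        b->guest_pc[slot] = pc;
    }
    pthread_rwlock_unlock(&inval_lock);
    return slot;
}

/* Invalidate on all buffers; excludes every generating core. */
void demo_invalidate_all(void)
{
    pthread_rwlock_wrlock(&inval_lock);
    memset(demo_core_buf, 0, sizeof(demo_core_buf));
    pthread_rwlock_unlock(&inval_lock);
}
```

This only covers generation versus invalidation; it says nothing about
cores that are still executing code from a buffer being invalidated.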

> Memory maps and TLBs
> --------------------
> 
> The memory handling code is fairly critical to the speed of memory
> access in the emulated system.
> 
>   - Memory regions (dividing up access to PIO, MMIO and RAM)
>   - Dirty page tracking (for code gen, migration and display)
>   - Virtual TLB (for translating guest address->real address)
> 
> There is both a fast path walked by the generated code and a slow
> path when resolution is required. When the TLB tables are updated we
> need to ensure they are done in a safe way by bringing all executing
> threads to a halt before making the modifications.
> 
> DESIGN REQUIREMENTS:
> 
>   - TLB Flush All/Page
>     - can be across-CPUs
>     - will need all other CPUs brought to a halt
>   - TLB Update (update a CPUTLBEntry, via tlb_set_page_with_attrs)
>     - This is a per-CPU table - by definition can't race
>     - updated by its own thread when the slow-path is forced
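
The two TLB requirements above could be sketched roughly as follows. This
assumes every vCPU thread brackets guest execution with a pair of
start/end calls, and it leaves out the hard part of actually kicking a
vCPU out of generated code. All names (DemoTLB, demo_cpu_exec_start,
demo_cpu_exec_end, demo_tlb_set_entry, demo_tlb_flush_all_cpus) are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define DEMO_NR_CPUS 4
#define DEMO_TLB_SIZE 8

/* Hypothetical per-CPU TLB: one tag per entry, 0 meaning "empty". */
typedef struct {
    unsigned long vaddr_tag[DEMO_TLB_SIZE];
} DemoTLB;

static DemoTLB demo_tlb[DEMO_NR_CPUS];
static pthread_mutex_t halt_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t halt_cond = PTHREAD_COND_INITIALIZER;
static int running_cpus;     /* vCPUs currently executing guest code */
static int pending_flushes;  /* flushers waiting for or doing a flush */

/* Called by a vCPU thread before it enters guest code. */
void demo_cpu_exec_start(void)
{
    pthread_mutex_lock(&halt_lock);
    while (pending_flushes > 0) {
        pthread_cond_wait(&halt_cond, &halt_lock);
    }
    running_cpus++;
    pthread_mutex_unlock(&halt_lock);
}

/* Called by a vCPU thread when it leaves guest code. */
void demo_cpu_exec_end(void)
{
    pthread_mutex_lock(&halt_lock);
    assert(running_cpus > 0);
    running_cpus--;
    pthread_cond_broadcast(&halt_cond);
    pthread_mutex_unlock(&halt_lock);
}

/* Per-CPU update: only the owning thread writes, so no lock is taken. */
void demo_tlb_set_entry(int cpu, int idx, unsigned long tag)
{
    demo_tlb[cpu].vaddr_tag[idx] = tag;
}

/*
 * Cross-CPU flush: wait until no vCPU is executing, then clear all TLBs.
 * Must be called from outside an exec_start/exec_end section, otherwise
 * the caller would wait for itself.
 */
void demo_tlb_flush_all_cpus(void)
{
    pthread_mutex_lock(&halt_lock);
    pending_flushes++;
    while (running_cpus > 0) {
        pthread_cond_wait(&halt_cond, &halt_lock);
    }
    memset(demo_tlb, 0, sizeof(demo_tlb));
    pending_flushes--;
    pthread_cond_broadcast(&halt_cond);
    pthread_mutex_unlock(&halt_lock);
}
```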
> 
> Emulated hardware state
> -----------------------
> 
> Currently the hardware emulation has no protection against
> multiple-accesses. However guest systems accessing emulated hardware
> should be carrying out their own locking to prevent multiple CPUs
> confusing the hardware. Of course there is no guarantee that there
> couldn't be a broken guest that doesn't lock so you could get racing
> accesses to the hardware.
> 
> There is the class of paravirtualized hardware (VIRTIO) that works in
> a purely MMIO mode, often setting flags directly in guest memory as a
> result of a guest-triggered transaction.
> 
> DESIGN REQUIREMENTS:
> 
>   - Access to IO Memory should be serialised by an IOMem mutex
>   - The mutex should be recursive (e.g. allowing pid to relock itself)
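
For the recursive IOMem mutex, POSIX threads already provide the
behaviour via PTHREAD_MUTEX_RECURSIVE. The sketch below assumes pthreads
is the locking primitive; the device functions (demo_mmio_write,
demo_nested_access) and the depth counters are hypothetical and exist
only to show the same thread relocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t iomem_lock;
static int iomem_depth;      /* current nesting depth of the owner */
static int iomem_max_depth;  /* deepest nesting seen, for illustration */

/* Initialise the lock as recursive. Returns 0 on success. */
int demo_iomem_lock_init(void)
{
    pthread_mutexattr_t attr;
    int ret = pthread_mutexattr_init(&attr);
    if (ret != 0) {
        return ret;
    }
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (ret == 0) {
        ret = pthread_mutex_init(&iomem_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return ret;
}

static void demo_iomem_enter(void)
{
    pthread_mutex_lock(&iomem_lock);
    iomem_depth++;
    if (iomem_depth > iomem_max_depth) {
        iomem_max_depth = iomem_depth;
    }
}

static void demo_iomem_leave(void)
{
    assert(iomem_depth > 0);
    iomem_depth--;
    pthread_mutex_unlock(&iomem_lock);
}

/* Hypothetical nested access triggered from inside a device handler. */
static void demo_nested_access(void)
{
    demo_iomem_enter();   /* same thread relocks without deadlocking */
    demo_iomem_leave();
}

/* Hypothetical MMIO write handler that causes a second IO access. */
void demo_mmio_write(void)
{
    demo_iomem_enter();
    demo_nested_access();
    demo_iomem_leave();
}
```

With a default (non-recursive) mutex the nested lock in
demo_nested_access would deadlock or fail, depending on the mutex type.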
> 
> IO Subsystem
> ------------
> 
> The I/O subsystem is heavily used by KVM and has seen a lot of
> improvements to offload I/O tasks to dedicated IOThreads. There should
> be no additional locking required once we reach the Block Driver.
> 
> DESIGN REQUIREMENTS:
> 
>   - The dataplane should continue to be protected by the iothread locks

Watch out for where DMA invalidates the translated code.
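
As an illustration of that case, here is a sketch that assumes a simple
per-page "has translated code" flag. A device write into guest RAM checks
the flag for each page it touches and invalidates under the same lock the
translator would use. The names (demo_guest_ram, page_has_code,
demo_mark_translated, demo_dma_write) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_BITS 12
#define DEMO_NR_PAGES 16
#define DEMO_RAM_SIZE ((size_t)DEMO_NR_PAGES << DEMO_PAGE_BITS)

static uint8_t demo_guest_ram[DEMO_RAM_SIZE];
static uint8_t page_has_code[DEMO_NR_PAGES]; /* 1 if code was translated */
static int demo_invalidations;               /* pages invalidated so far */
static pthread_mutex_t code_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record that the page containing addr holds translated guest code. */
void demo_mark_translated(size_t addr)
{
    assert(addr < DEMO_RAM_SIZE);
    pthread_mutex_lock(&code_lock);
    page_has_code[addr >> DEMO_PAGE_BITS] = 1;
    pthread_mutex_unlock(&code_lock);
}

/* Device write into guest RAM; invalidates code on every touched page. */
void demo_dma_write(size_t addr, const void *src, size_t len)
{
    assert(len > 0 && addr < DEMO_RAM_SIZE && len <= DEMO_RAM_SIZE - addr);
    pthread_mutex_lock(&code_lock);
    memcpy(&demo_guest_ram[addr], src, len);
    size_t first = addr >> DEMO_PAGE_BITS;
    size_t last = (addr + len - 1) >> DEMO_PAGE_BITS;
    for (size_t p = first; p <= last; p++) {
        if (page_has_code[p]) {
            page_has_code[p] = 0;   /* placeholder for real invalidation */
            demo_invalidations++;
        }
    }
    pthread_mutex_unlock(&code_lock);
}
```

The point is that the write comes from a device thread rather than a
vCPU, so it has to take part in whatever locking protects the translated
code.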

Dave

> 
> 
> -- 
> Alex Bennée
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


