From: Stefano Bonifazi
Subject: [Qemu-devel] TCG flow vs dyngen
Date: Fri, 10 Dec 2010 22:26:43 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6

Hi all!

In the technical documentation (http://www.usenix.org/publications/library/proceedings/usenix05/tech/freenix/bellard.html) I read:

> The first step is to split each target CPU instruction into fewer simpler instructions called micro operations. Each micro operation is implemented by a small piece of C code. This small C source code is compiled by GCC to an object file. The micro operations are chosen so that their number is much smaller (typically a few hundreds) than all the combinations of instructions and operands of the target CPU. The translation from target CPU instructions to micro operations is done entirely with hand coded code. A compile time tool called dyngen uses the object file containing the micro operations as input to generate a dynamic code generator. This dynamic code generator is invoked at runtime to generate a complete host function which concatenates several micro operations.

On Wikipedia (http://en.wikipedia.org/wiki/QEMU) and in other sources, however, I read:

> The Tiny Code Generator (TCG) aims to remove the shortcoming of relying on a particular version of GCC or any compiler, instead incorporating the compiler (code generator) into other tasks performed by QEMU in run-time. The whole translation task thus consists of two parts: blocks of target code (TBs) being rewritten in TCG ops - a kind of machine-independent intermediate notation, and subsequently this notation being compiled for the host's architecture by TCG. Optional optimisation passes are performed between them.

- So I think the technical documentation is now obsolete, isn't it?
- The "old way" did much of its work offline (at compile time), compiling the micro operations into host machine code, while, if I understand correctly, TCG does everything at run time (please correct me if I am wrong!). So I wonder: how can it be as fast as the previous method (or even faster)?
- If I understand correctly, the TCG runtime flow is the following:
  - TCG takes the target binary and splits it into target blocks,
  - if the TB is not cached, TCG translates it (or rather, the target instructions it is composed of) into TCG micro ops,
  - TCG compiles the TCG micro ops into host object code,
  - TCG caches the TB,
  - TCG tries to chain the block with others,
  - TCG copies the TB into the execution buffer,
  - TCG runs it.

Am I right? Please correct me if I am wrong, as I want to use this flow scheme to help me understand the code.

Thank you very much in advance!
Stefano B.
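
The translate, cache and run steps listed above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not QEMU code: the three-instruction target ISA and the names `find_block`, `to_ir`, `compile_ir` and `run` are all hypothetical, "host code" is simply a generated Python function, and block chaining and the execution buffer are left out. It also assumes that blocks are discovered on demand at the current program counter rather than by splitting the whole binary up front.

```python
# Toy model of a translate-cache-execute loop.
# Every name here is hypothetical; none of this is QEMU's API.

# Toy "target" program: (opcode, operand) pairs, addressed by list index.
PROGRAM = [
    ("addi", 1),        # 0: acc += 1
    ("addi", 2),        # 1: acc += 2
    ("jlt", (10, 0)),   # 2: if acc < 10 goto 0, else fall through to 3
    ("addi", 100),      # 3: acc += 100
    ("halt", None),     # 4: stop
]


def find_block(program, pc):
    """Collect instructions from pc up to and including the next jlt/halt."""
    block = []
    while True:
        insn = program[pc]
        block.append(insn)
        pc += 1
        if insn[0] in ("jlt", "halt"):
            return block, pc  # pc is now the fall-through address


def to_ir(block, next_pc):
    """Front end: rewrite target instructions as simple intermediate ops."""
    ir = []
    for opcode, arg in block:
        if opcode == "addi":
            ir.append(("add_const", arg))
        elif opcode == "jlt":
            limit, taken = arg
            ir.append(("branch_if_less", limit, taken, next_pc))
        elif opcode == "halt":
            ir.append(("exit",))
    return ir


def compile_ir(ir):
    """Back end: emit "host code" (Python source here) and compile it once."""
    lines = ["def host_code(acc):"]
    for op in ir:
        if op[0] == "add_const":
            lines.append(f"    acc += {op[1]}")
        elif op[0] == "branch_if_less":
            _, limit, taken, fallthrough = op
            lines.append(
                f"    return acc, ({taken} if acc < {limit} else {fallthrough})"
            )
        elif op[0] == "exit":
            lines.append("    return acc, None")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["host_code"]


def run(program):
    """Execute the program; translate a block only on a cache miss."""
    cache = {}          # start pc -> compiled block
    translations = 0
    acc, pc = 0, 0
    while pc is not None:
        host_code = cache.get(pc)
        if host_code is None:
            block, next_pc = find_block(program, pc)
            host_code = compile_ir(to_ir(block, next_pc))
            cache[pc] = host_code
            translations += 1
        acc, pc = host_code(acc)   # each block returns the next pc
    return acc, translations


result, translations = run(PROGRAM)
```

In this sketch the loop body starting at address 0 runs four times but is translated only once. This suggests why doing the translation at run time need not be slow: the cost is paid once per block and spread over every later execution of that block.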