qemu-devel

From: Jan Kiszka
Subject: [Qemu-devel] Re: Release of COREMU, a scalable and portable full-system emulator
Date: Thu, 22 Jul 2010 15:00:53 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

Stefan Hajnoczi wrote:
> On Thu, Jul 22, 2010 at 9:48 AM, Chen Yufei <address@hidden> wrote:
>> On 2010-7-22, at 1:04 AM, Stefan Weil wrote:
>>
>>> On 21.07.2010 09:03, Chen Yufei wrote:
>>>> On 2010-7-21, at 5:43 AM, Blue Swirl wrote:
>>>>
>>>>
>>>>> On Sat, Jul 17, 2010 at 10:27 AM, Chen Yufei<address@hidden>  wrote:
>>>>>
>>>>>> We are pleased to announce COREMU, which is a "multicore-on-multicore" 
>>>>>> full-system emulator built on Qemu. (Simply speaking, we made Qemu 
>>>>>> parallel.)
>>>>>>
>>>>>> The project web page is located at:
>>>>>> http://ppi.fudan.edu.cn/coremu
>>>>>>
>>>>>> You can also download the source code and images to play with on SourceForge:
>>>>>> http://sf.net/p/coremu
>>>>>>
>>>>>> COREMU is composed of
>>>>>> 1. a parallel emulation library
>>>>>> 2. a set of patches to qemu
>>>>>> (We worked on the master branch, commit 
>>>>>> 54d7cf136f040713095cbc064f62d753bff6f9d2)
>>>>>>
>>>>>> It currently supports full-system emulation of x64 and ARM MPcore 
>>>>>> platforms.
>>>>>>
>>>>>> By leveraging the underlying multicore resources, it can emulate up to 
>>>>>> 255 cores running commodity operating systems (even on a 4-core machine).
>>>>>>
>>>>>> Enjoy,
>>>>>>
>>>>> Nice work. Do you plan to submit the improvements back to upstream QEMU?
>>>>>
>>>> It would be great if we can submit our code to QEMU, but we do not know 
>>>> the process.
>>>> Would you please give us some instructions?
>>>>
>>>> --
>>>> Best regards,
>>>> Chen Yufei
>>>>
>>> Some hints can be found here:
>>> http://wiki.qemu.org/Contribute/StartHere
>>>
>>> Kind regards,
>>> Stefan Weil
>> The patch is in the attachment, produced with command
>> git diff 54d7cf136f040713095cbc064f62d753bff6f9d2
>>
>> To keep the changes needed to make QEMU parallel separate, we created a 
>> separate library, and the patched QEMU needs to be compiled and linked 
>> against that library. To submit our enhancement to QEMU, we may need to 
>> incorporate this library into QEMU. I don't know what the best solution 
>> would be.
>>
>> Our approach to making QEMU parallel can be found at 
>> http://ppi.fudan.edu.cn/coremu
>>
>> I will give a short summary here:
>>
>> 1. Each emulated core thread runs a separate binary translator engine and 
>> has a private code cache. We marked some variables in TCG as thread-local. 
>> We also modified the TB invalidation mechanism.
>>
>> 2. Each core has a queue holding pending interrupts. The COREMU library 
>> provides this queue, and interrupt notification is done by sending realtime 
>> signals to the emulated core thread.
>>
>> 3. Atomic instruction emulation has to be modified for parallel emulation. 
>> We emulate atomic instructions with lightweight memory transactions, which 
>> require only a compare-and-swap instruction.
>>
>> 4. Some code in the original QEMU may cause data race bugs once it runs in 
>> parallel. We fixed these problems.
>>
>>
>>
>>
>> --
>> Best regards,
>> Chen Yufei
> 
> Looking at the patch it seems there is a global lock for hardware
> access via coremu_spin_lock(&cm_hw_lock).  How many cores have you
> tried running and do you have lock contention data for cm_hw_lock?

BTW, the equivalent lock in QEMU is called qemu_global_mutex. It is a
sleeping lock (a mutex) rather than a spinlock, which is likely the
better choice for the code paths it protects in upstream. Are those
paths shorter in COREMU?
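
To make the spinning vs. sleeping distinction concrete, here is a
minimal sketch, not taken from COREMU or QEMU. All names (demo_spin,
demo_mutex, demo_vcpu, demo_run, demo_device_reg) are hypothetical, and
the "hardware access" is just an increment of a shared counter. It
assumes POSIX threads and C11 atomics. Both variants serialize the
critical section; they differ in what a waiting thread does: the
spinning variant burns CPU until the lock is free, the mutex variant
may be put to sleep by the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { DEMO_MAX_THREADS = 16 };

/* Hypothetical stand-ins, not the COREMU or QEMU locks. */
static atomic_flag demo_spin = ATOMIC_FLAG_INIT;                 /* busy-waiting lock */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;   /* sleeping lock */
static long demo_device_reg;                                     /* shared "device" state */

struct demo_job {
    int use_mutex;   /* 0: spin, 1: sleep */
    long iters;      /* critical sections per thread */
};

/* One emulated-core thread: repeatedly touches the shared device state. */
static void *demo_vcpu(void *opaque)
{
    const struct demo_job *job = opaque;

    for (long i = 0; i < job->iters; i++) {
        if (job->use_mutex) {
            /* A contended waiter may be descheduled until the lock is free. */
            pthread_mutex_lock(&demo_mutex);
            demo_device_reg++;
            pthread_mutex_unlock(&demo_mutex);
        } else {
            /* A contended waiter keeps polling the flag on its CPU. */
            while (atomic_flag_test_and_set_explicit(&demo_spin,
                                                     memory_order_acquire)) {
                /* spin */
            }
            demo_device_reg++;
            atomic_flag_clear_explicit(&demo_spin, memory_order_release);
        }
    }
    return NULL;
}

/* Runs nthreads threads and returns the final value of the shared counter. */
long demo_run(int use_mutex, int nthreads, long iters)
{
    pthread_t tid[DEMO_MAX_THREADS];
    struct demo_job job = { use_mutex, iters };

    assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
    demo_device_reg = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, demo_vcpu, &job);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++) {
        pthread_join(tid[i], NULL);
    }
    return demo_device_reg;
}
```

Calling demo_run(0, 4, 10000) and demo_run(1, 4, 10000) should both
return 40000. The interesting difference is CPU time under contention,
which is why the length of the protected code paths matters for
choosing between the two.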

> Have you thought about making hardware emulation concurrent?
> 
> These are issues that qemu-kvm faces today since it executes vcpu
> threads in parallel.  Both qemu-kvm and the COREMU patches could
> benefit from a solution for concurrent hardware access.

While we are all looking forward to seeing more scalable hardware models
:), I think it is a topic that can be addressed largely independently of
parallelizing TCG VCPUs. The latter can benefit from the former, for
sure, but it first has to solve its own issues.

Note that --enable-io-thread already provides truly parallel KVM VCPUs
in upstream these days. Only TCG still needs the slightly suboptimal CPU
scheduling inside the single-threaded tcg_cpu_exec (renamed to
cpu_exec_all today).
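
As a toy model of what single-threaded scheduling means here (this is
not the actual tcg_cpu_exec/cpu_exec_all code; toy_vcpu, toy_cpu_exec
and toy_cpu_exec_all are made-up names): one thread walks the list of
VCPUs and gives each a time slice in turn, so no two emulated cores ever
execute at the same moment.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a guest CPU; not QEMU's CPU state structure. */
typedef struct toy_vcpu {
    long executed;            /* guest "instructions" run so far */
    struct toy_vcpu *next;    /* next VCPU in the list, or NULL */
} toy_vcpu;

/* Pretend to run one time slice of translated guest code. */
static void toy_cpu_exec(toy_vcpu *cpu, long slice)
{
    assert(slice > 0);
    cpu->executed += slice;
}

/*
 * One scheduling pass in a single host thread: every VCPU gets a slice,
 * strictly one after another. Returns the number of VCPUs visited.
 */
int toy_cpu_exec_all(toy_vcpu *first, long slice)
{
    int visited = 0;

    for (toy_vcpu *cpu = first; cpu != NULL; cpu = cpu->next) {
        toy_cpu_exec(cpu, slice);
        visited++;
    }
    return visited;
}
```

With two VCPUs linked in a list, two passes with a slice of 100 leave
each at 200 executed instructions, but the host only ever used one core
to get there.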

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
