From: Cam Macdonell
Subject: [Qemu-devel] Re: [PATCH v5 4/5] Inter-VM shared memory PCI device
Date: Mon, 10 May 2010 12:01:01 -0600

On Mon, May 10, 2010 at 11:52 AM, Anthony Liguori <address@hidden> wrote:
> On 05/10/2010 12:43 PM, Cam Macdonell wrote:
>>
>> On Mon, May 10, 2010 at 11:25 AM, Anthony Liguori <address@hidden>
>> wrote:
>>
>>>
>>> On 05/10/2010 11:59 AM, Avi Kivity wrote:
>>>
>>>>
>>>> On 05/10/2010 06:38 PM, Anthony Liguori wrote:
>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>> Otherwise, if the BAR is allocated during initialization, I would
>>>>>>> have
>>>>>>> to use MAP_FIXED to mmap the memory.  This is what I did before the
>>>>>>> qemu_ram_mmap() function was added.
>>>>>>>
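[Editor's note: a minimal sketch of the MAP_FIXED approach described
above, assuming a POSIX host; bar_ptr and shm_fd are illustrative
names, not identifiers from the patch:]

    #include <stdio.h>
    #include <sys/mman.h>

    /* Once the server handshake delivers the shared-memory fd, map it
     * over the BAR's existing backing at the same virtual address. */
    static int remap_bar(void *bar_ptr, size_t bar_size, int shm_fd)
    {
        void *p = mmap(bar_ptr, bar_size, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, shm_fd, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return -1;
        }
        /* MAP_FIXED replaces the old mapping in place, so anything
         * the guest wrote to the BAR before the handshake completed
         * is discarded -- the concern raised just below. */
        return 0;
    }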
>>>>>>
>>>>>> What would happen to any data written to the BAR before the
>>>>>> handshake completed?  I think it would disappear.
>>>>>>
>>>>>
>>>>> You don't have to do MAP_FIXED.  You can allocate a ram area and map
>>>>> that
>>>>> in when disconnected.  When you connect, you create another ram area
>>>>> and
>>>>> memcpy() the previous ram area to the new one.  You then map the second
>>>>> ram
>>>>> area in.
>>>>>
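[Editor's note: a sketch of the switch-and-copy approach Anthony
outlines, in plain POSIX terms rather than QEMU's RAM API; all names
are illustrative:]

    #include <stddef.h>
    #include <string.h>
    #include <sys/mman.h>

    /* While disconnected the BAR is backed by a private anonymous
     * area; on connect, map the shared fd as a second area, copy the
     * old contents across, and switch the guest's BAR over to it (in
     * QEMU this would mean registering the new RAM area for the BAR,
     * not a raw mmap). */
    static void *connect_bar(void *old_area, size_t size, int shm_fd)
    {
        void *shared = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, shm_fd, 0);
        if (shared == MAP_FAILED) {
            return NULL;
        }
        /* this memcpy is the step Avi objects to below: it clobbers
         * whatever other peers have already written */
        memcpy(shared, old_area, size);
        munmap(old_area, size);
        return shared;   /* map this area in for the guest */
    }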
>>>>
>>>> But it's a shared memory area.  Other peers could have connected and
>>>> written some data in.  The memcpy() would destroy their data.
>>>>
>>>
>>> Why attempt to support multi-master shared memory?  What's the
>>> use-case?
>>>
>>
>> I don't see it as multi-master; rather, the latest guest to join
>> shouldn't have its contents take precedence.  In developing this
>> patch, my motivation has been to let the guests decide.  If the memcpy
>> is always done, even when no data has been written, a guest cannot
>> join without overwriting everything.
>>
>> One use case we're looking at is running a map-reduce framework like
>> Hadoop or Phoenix across VMs.  However, if a work queue is stored in,
>> or data transfers pass through, shared memory, the system can't scale
>> up the number of workers because each new guest would erase the shared
>> memory (and with it the work queue or any in-progress data transfer).
>>
>
> (Replying again to list)

Sorry about that.

> What data structure would you use?  For a lockless ring queue, you can only
> support a single producer and consumer.  To achieve bidirectional
> communication in virtio, we always use two queues.

MCS locks can work with multiple producers and consumers, either with
busy-waiting or using the doorbell mechanism.
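Roughly, a sketch of what I mean (illustrative only; a real cross-VM
version would store offsets into the shared region rather than raw
pointers, and could replace the busy-wait with a doorbell-triggered
sleep/wake):

    struct mcs_node {
        struct mcs_node *next;   /* an offset in real shared memory */
        int locked;
    };

    static void mcs_lock(struct mcs_node **tail, struct mcs_node *me)
    {
        me->next = NULL;
        struct mcs_node *prev =
            __atomic_exchange_n(tail, me, __ATOMIC_ACQ_REL);
        if (prev) {
            me->locked = 1;
            /* the release store orders me->locked before the link */
            __atomic_store_n(&prev->next, me, __ATOMIC_RELEASE);
            /* busy-wait; alternatively sleep until our doorbell rings */
            while (__atomic_load_n(&me->locked, __ATOMIC_ACQUIRE))
                ;
        }
    }

    static void mcs_unlock(struct mcs_node **tail, struct mcs_node *me)
    {
        struct mcs_node *next =
            __atomic_load_n(&me->next, __ATOMIC_ACQUIRE);
        if (!next) {
            struct mcs_node *expected = me;
            if (__atomic_compare_exchange_n(tail, &expected, NULL, 0,
                                            __ATOMIC_ACQ_REL,
                                            __ATOMIC_ACQUIRE))
                return;          /* no waiter queued behind us */
            /* a waiter swapped the tail but hasn't linked in yet */
            while (!(next = __atomic_load_n(&me->next,
                                            __ATOMIC_ACQUIRE)))
                ;
        }
        /* hand off; in the doorbell variant, ring 'next' here instead */
        __atomic_store_n(&next->locked, 0, __ATOMIC_RELEASE);
    }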

>
> If you're adding additional queues to support other levels of communication,
> you can always use different areas of shared memory.

True, and my point is simply that the memcpy would wipe those all out.

>
> I guess this is the point behind the doorbell mechanism?

Yes.
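To illustrate, ringing a peer from the guest side is a single register
write.  (The offset and the (peer_id << 16) | vector encoding below are
assumptions for the purpose of the example, not a statement of this
patch's final register layout.)

    #include <stdint.h>

    #define IVSHMEM_DOORBELL 12   /* assumed offset of Doorbell in BAR0 */

    /* regs points at the guest's mapping of the register BAR */
    static void ring_peer(volatile uint32_t *regs,
                          uint16_t peer_id, uint16_t vector)
    {
        regs[IVSHMEM_DOORBELL / sizeof(uint32_t)] =
            ((uint32_t)peer_id << 16) | vector;
    }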

>
> Regards,
>
> Anthony Liguori
>
>> In cases where the latest guest to join wants to clear the memory, it
>> can do so without the automatic memcpy.  The guest can do a memset
>> once it knows the memory is attached.  My opinion is to leave it to
>> the guests and the application that is using the shared memory to
>> decide what to do on guest joins.
>>
>> Cam
>>
>>
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>
>>>>>
>>>>> From the guest's perspective, it's totally transparent.  For the
>>>>> backend,
>>>>> I'd suggest having an explicit "initialized" ack or something so that
>>>>> it
>>>>> knows that the data is now mapped to the guest.
>>>>>
>>>>
>>>> From the peers' perspective, it's non-transparent :(
>>>>
>>>> Also it doubles the transient memory requirement.
>>>>
>>>>
>>>>>
>>>>> If you're doing just a ring queue in shared memory, it should allow
>>>>> disconnect/reconnect during live migration, asynchronously to the
>>>>> actual qemu live migration.
>>>>>
>>>>>
>>>>
>>>> Live migration of guests using shared memory is interesting.  You'd need
>>>> to freeze all peers on one node, disconnect, reconnect, and restart them
>>>> on
>>>> the other node.
>>>>
>>>>
>>>
>>>
>>
>
>


