qemu-devel

Re: [Qemu-devel] [RFC 24/29] vhost+postcopy: Lock around set_mem_table


From: Maxime Coquelin
Subject: Re: [Qemu-devel] [RFC 24/29] vhost+postcopy: Lock around set_mem_table
Date: Thu, 12 Oct 2017 09:22:56 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0



On 10/09/2017 02:12 PM, Dr. David Alan Gilbert wrote:
* Maxime Coquelin (address@hidden) wrote:


On 10/03/2017 03:23 PM, Dr. David Alan Gilbert wrote:
* Dr. David Alan Gilbert (address@hidden) wrote:
* Maxime Coquelin (address@hidden) wrote:


On 06/28/2017 09:00 PM, Dr. David Alan Gilbert (git) wrote:
From: "Dr. David Alan Gilbert"<address@hidden>

**HACK - better solution needed **
We have the situation where:

        qemu                      bridge

        send set_mem_table
                                 map memory
     a)                          mark area with UFD
                                 send reply with map addresses
     b)                          start using
     c) receive reply

As soon as (a) happens, qemu might start seeing faults
from memory accesses (though it doesn't until (b)); but it can't
process those faults until (c), when it has received the
mmap addresses.

Make the fault handler spin until it gets the reply in (c).

At the very least this needs some proper locks, but preferably
we need to split the message.
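
One way to picture the "proper locks" alternative to spinning is a condition variable that the fault handler sleeps on until the reply in (c) has been recorded. The following is only a minimal sketch in Python, not QEMU code: `MemTableState`, `set_reply`, `wait_for_reply`, `handle_fault` and the region name are all hypothetical names chosen for illustration.

```python
import threading


class MemTableState:
    """Hypothetical holder for the mmap addresses returned in step (c)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._addresses = None  # stays None until the reply arrives

    def set_reply(self, addresses):
        # Called by whoever receives the set_mem_table reply (step c).
        with self._cond:
            self._addresses = dict(addresses)
            self._cond.notify_all()

    def wait_for_reply(self, timeout=None):
        # Called by the fault handler: sleep instead of spinning.
        with self._cond:
            arrived = self._cond.wait_for(
                lambda: self._addresses is not None, timeout
            )
            if not arrived:
                raise TimeoutError("no set_mem_table reply received")
            return self._addresses


def handle_fault(state, region, offset, timeout=5.0):
    """Resolve a fault only once the mapped addresses are known."""
    addresses = state.wait_for_reply(timeout)
    return addresses[region] + offset
```

A fault that arrives between (a) and (c) then blocks in `wait_for_reply` and resumes as soon as `set_reply` is called. This removes the busy loop, but it does not by itself address the ordering problem that splitting the message is meant to solve.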

Yes, maybe the slave channel could be used to send the ufds with
a dedicated request? The backend would set the reply-ack flag, so that
it starts accessing the guest memory only when Qemu is ready to handle
faults.

Yes, that would make life a lot easier.

Note that the slave channel support has not been implemented in Qemu's
libvhost-user yet, but this is something I can do if we feel the need.

Can you tell me a bit about how the slave channel works?

I've looked at the slave-channel; and I'm worried that it's not suitable
for this case.
The problem is that 'slave_read' is wired to a fd_handler that I think
is serviced by the main thread,

I confirm, this is serviced by the main thread.

and while postcopy is running I don't
want to rely on the operation of the main thread (since it could be
blocked by a page fault).

IIUC, you mean QEMU being blocked by a page fault.

Yes.

In this case, I don't think this is an issue, because QEMU doesn't rely
on the backend to handle the page fault, so the slave request can be
handled only once QEMU has handled the fault.

Maybe I am missing something?

It feels delicate; with the vhost client blocked waiting for the ack
from qemu to the registration reply on the slave channel, and some other
part blocked by a page fault, it sounds likely to hit deadlocks
even if I can't put my finger on one.

Right, it is hard to be sure there is no risk of deadlock.

I could still use an explicit ack at that point though over the main
channel I think (or use the slave synchronously?).

Can you please elaborate? I'm not sure I understand what you mean.

In the world I'm currently working on I've got it just using the main
channel but:
    settable -> client
    settable-results -> qemu
    ack -> client

all over the main channel with each side waiting for the other.
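
The three-message exchange above can be simulated to show the ordering it enforces: the client does not start using the memory until qemu has received the map addresses and sent the ack. This is a rough sketch only, assuming two in-process queues stand in for the two directions of the main channel; the message names, region names and fake addresses are illustrative and are not the vhost-user wire format.

```python
import queue
import threading


def run_handshake():
    """Simulate settable -> results -> ack over one bidirectional channel."""
    to_client = queue.Queue()  # qemu -> client direction
    to_qemu = queue.Queue()    # client -> qemu direction
    events = []                # ordering is enforced by the queue hand-offs

    def client():
        kind, regions = to_client.get(timeout=5)
        assert kind == "set_mem_table"
        # Pretend to map each region and mark it with userfaultfd.
        mapped = {name: 0x1000 * (i + 1) for i, name in enumerate(regions)}
        events.append("client: mapped memory")
        to_qemu.put(("set_mem_table_results", mapped))
        # Block until qemu confirms it can handle faults.
        (ack,) = to_client.get(timeout=5)
        assert ack == "ack"
        events.append("client: start using memory")

    t = threading.Thread(target=client)
    t.start()

    to_client.put(("set_mem_table", ["region0", "region1"]))
    kind, mapped = to_qemu.get(timeout=5)
    assert kind == "set_mem_table_results"
    events.append("qemu: received map addresses")
    to_client.put(("ack",))

    t.join(5)
    return events, mapped
```

Each side blocks on the other in turn, so "start using memory" can only be recorded after "received map addresses". The cost is that both sides wait synchronously on the same channel.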

Ok, thanks for the clarification.
Maxime

Dave


Thanks,
Maxime

Dave

Dave

Maxime
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



