
Re: [Qemu-devel] [PATCH] migration: optimize the downtime


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] migration: optimize the downtime
Date: Thu, 27 Jul 2017 16:26:36 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 25/07/2017 21:15, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (address@hidden) wrote:
>> On 24/07/2017 21:03, Dr. David Alan Gilbert wrote:
>>>> I don't like having such a long-lived mutex (it seems like a recipe for
>>>> deadlocks with the BQL), plus memory_region_transaction_commit (the
>>>> expensive part of memory_global_dirty_log_stop) needs to be under the
>>>> BQL itself because it calls MemoryListeners.
>>>>
>>>> Maybe memory_global_dirty_log_stop can delay itself to the next vm_start
>>>> if it's called while runstate_running() returns false (which should be
>>>> always the case)?
>>>>
>>>> It could even be entirely enclosed within memory.c if you do it with a
>>>> VMChangeStateHandler.
>>> This still causes the BQL to be held for quite a while; albeit at a less
>>> critical point.
>>>
>>> In this and the existing case we don't actually need efficiency;
>>> what we need is just not to hold the BQL for so long.  Could we do
>>> a less efficient commit here, removing one region at a time,
>>> yielding the lock and retaking it?
>>
>> I'm worried that if you release the lock and retake it you open a window
>> where someone else could start a commit, and
>> memory_global_dirty_log_stop's commit would use outdated information.
>>
>> The problem is that a coarse lock, such as the BQL, is very hard to
>> break up into a second lock with long critical sections (a "memory map
>> commit lock" in this case).  It's obviously very easy to say "the new
>> lock nests inside the BQL"... but then, because you created the lock to
>> move things out of the BQL, any callback inside the new lock cannot take
>> the memory map commit lock, and you need to make entire subsystems
>> thread-safe.  If the new lock nests outside the BQL instead, it's
>> simpler to avoid deadlocks, but you need to be careful about releasing
>> the BQL and any reentrancy implications of doing that.
>>
>> One strategy that worked well so far has been to use two locks, one of
>> them being the BQL, where you can take either lock for reads and both
>> for writes.
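The two-lock strategy Paolo describes (take either lock for reads, both for writes) can be sketched with plain pthreads. This is a minimal model under assumed names (`bql`, `commit_lock`, `memory_map_generation` are all illustrative); in QEMU one of the two locks would be the BQL itself.

```c
#include <assert.h>
#include <pthread.h>

/* Two-lock strategy: a reader holding either lock cannot race with a
 * writer, because a writer must hold both.  Names are illustrative. */

static pthread_mutex_t bql         = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t commit_lock = PTHREAD_MUTEX_INITIALIZER;

static int memory_map_generation;   /* the state being protected */

/* Reader: either lock alone suffices; here we use the commit lock,
 * so readers need not contend for the BQL at all. */
int read_generation_under_commit_lock(void)
{
    pthread_mutex_lock(&commit_lock);
    int g = memory_map_generation;
    pthread_mutex_unlock(&commit_lock);
    return g;
}

/* Writer: takes both locks, always in the same order (BQL first) so
 * the lock hierarchy stays consistent and deadlock-free. */
void commit_memory_map_change(void)
{
    pthread_mutex_lock(&bql);
    pthread_mutex_lock(&commit_lock);
    memory_map_generation++;
    pthread_mutex_unlock(&commit_lock);
    pthread_mutex_unlock(&bql);
}
```

The design point is that readers on a hot path can use the cheaper or less contended of the two locks, while writers pay the full cost of both.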
> 
> I guess you could say that in the event someone wants to do something
> with the memory map they have to wait for the cleanup to complete?

Yes, you could take pending "invisible" changes (well, toggling dirty
bitmap status is the only change invisible to the guest, I guess) and
push those to a separate thread.  That optimization would be in the KVM
memory listener.
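Paolo's closing idea, handing the guest-invisible dirty-bitmap toggle to a separate thread, could look roughly like this. Again a hypothetical sketch (`dirty_log_worker`, `schedule_dirty_log_stop`, `dirty_log_toggled` are made-up names), not the KVM memory listener's actual code.

```c
#include <assert.h>
#include <pthread.h>

/* Illustrative model: because the dirty-bitmap toggle is invisible to
 * the guest, the main loop can spawn a worker for it instead of doing
 * the expensive work under the BQL during downtime. */

static int dirty_log_toggled;

static void *dirty_log_worker(void *opaque)
{
    (void)opaque;
    /* The expensive, guest-invisible work runs off the main loop. */
    dirty_log_toggled = 1;
    return NULL;
}

void schedule_dirty_log_stop(pthread_t *tid)
{
    /* The main loop only spawns the worker, so migration downtime is
     * not extended by the commit itself. */
    pthread_create(tid, NULL, dirty_log_worker, NULL);
}
```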

Paolo
