qemu-devel


From: Kevin Wolf
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
Date: Tue, 14 Sep 2010 10:28:19 +0200

On 13.09.2010 22:09, Anthony Liguori wrote:
> On 09/13/2010 03:03 PM, Kevin Wolf wrote:
>> On 13.09.2010 21:29, Stefan Hajnoczi wrote:
>>    
>>> On Mon, Sep 13, 2010 at 3:13 PM, Kevin Wolf <address@hidden> wrote:
>>>      
>>>> On 13.09.2010 15:42, Anthony Liguori wrote:
>>>>        
>>>>> On 09/13/2010 08:39 AM, Kevin Wolf wrote:
>>>>>          
>>>>>>> Yeah, one of the key design points of live migration is to minimize the
>>>>>>> number of failure scenarios where you lose a VM.  If someone typed the
>>>>>>> wrong command line or shared storage hasn't been mounted yet and we
>>>>>>> delay failure until live migration is in the critical path, that would
>>>>>>> be terribly unfortunate.
>>>>>>>
>>>>>>>              
>>>>>> We would catch most of them if we try to open the image when migration
>>>>>> starts and immediately close it again until migration is (almost)
>>>>>> completed, so that no other code can possibly use it before the source
>>>>>> has really closed it.
>>>>>>
>>>>>>            
>>>>> I think the only real advantage is that we fix NFS migration, right?
>>>>>          
>>>> That's the one that we know about, yes.
>>>>
>>>> The rest is not a specific scenario, but a strong feeling that having an
>>>> image opened twice at the same time feels dangerous. As soon as an
>>>> open/close sequence writes to the image for some format, we probably
>>>> have a bug. For example, what about this mounted flag that you were
>>>> discussing for QED?
>>>>        
>>> There is some room left to work in, even if we can't check in open().
>>> One idea would be to do the check asynchronously once I/O begins.  It
>>> is actually easy to check L1/L2 tables as they are loaded.
>>>
>>> The only barrier relationship between I/O and checking is that an
>>> allocating write (which will need to update L1/L2 tables) is only
>>> allowed after check completes.  Otherwise reads and non-allocating
>>> writes may proceed while the image is not yet fully checked.  We can
>>> detect when a table element is an invalid offset and discard it.
>>>      
>> I'm not even talking about such complicated things. You wanted to have a
>> dirty flag in the header, right? So when we allow opening an image
>> twice, you get this sequence with migration:
>>
>> Source: open
>> Destination: open (with dirty image)
>> Source: close
>>
>> The image is now marked as clean, even though the destination is still
>> working on it.
>>    
> 
> The dirty flag should be read on demand (which is the first time we 
> fetch an L1/L2 table).
> 
> I agree that the life cycle of the block drivers is getting fuzzy.  Need 
> to think quite a bit here.

The point is that after the close the flag is wrong on disk. That's
completely unrelated to when you read it. If anything, you could also
delay writing the flag until the first write, so that the close comes first.
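
To make the sequence concrete, here is a toy model in Python. It is not
QEMU or QED code: the Image and Handle classes and the migrate() helper
are made up for illustration, and the sketch assumes that close() clears
the flag whenever that handle was the one that set it.

```python
class Image:
    """Stand-in for the on-disk header; only the dirty flag is modelled."""

    def __init__(self):
        self.dirty = False


class Handle:
    """One process's open handle on a shared image (hypothetical model)."""

    def __init__(self, image, lazy):
        self.image = image
        self.marked = False      # has this handle written the dirty flag?
        if not lazy:
            self._mark_dirty()   # eager variant: flag is written in open()

    def _mark_dirty(self):
        self.image.dirty = True
        self.marked = True

    def write(self):
        # Lazy variant: the flag is written on the first write instead.
        if not self.marked:
            self._mark_dirty()

    def close(self):
        # Assumption: close marks the image clean if this handle marked it.
        if self.marked:
            self.image.dirty = False
            self.marked = False


def migrate(lazy):
    """Replay the open/open/close sequence; return the flag before and
    after the destination's first write."""
    image = Image()
    source = Handle(image, lazy)
    source.write()                      # guest has been running on the source
    destination = Handle(image, lazy)   # Destination: open (with dirty image)
    source.close()                      # Source: close
    before_first_write = image.dirty
    destination.write()                 # guest resumes on the destination
    return before_first_write, image.dirty


eager_result = migrate(lazy=False)
lazy_result = migrate(lazy=True)
```

In this model the eager variant leaves the image marked clean even after
the destination has written to it, while the lazy variant ends up dirty
again because the source's close happens before the destination's first
write.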

But honestly, I'm not very excited about pulling half of the open()
logic into read/write functions.

Kevin


