Re: [Qemu-devel] Migration design planning

From:	John Snow
Subject:	Re: [Qemu-devel] Migration design planning
Date:	Tue, 1 Mar 2016 14:11:40 -0500
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

Of course I have a nasty habit of responding before seeing replies,
apologies.

On 03/01/2016 10:24 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 01.03.2016 16:47, Juan Quintela wrote:
>> John Snow <address@hidden> wrote:
>>> Hi Juan;
>>> We need your assistance in reviewing two competing designs for migrating
>>> some block data so we can move forward with the feature.
>>>
>>> First, some background:
>>>
>>> What: Block Dirty Bitmaps. They are simple primitives that keep track of
>>> which clusters have been written to since the last incremental backup.
>>>
>>> Why: They are in-ram primitives that do not get migrated as-is alongside
>>> block data, they need to be migrated specially. We want to migrate them
>>> so that the "incremental backup" feature is available after a migration.
>>>
>>> How: There are two competing designs, see below.
>>>
>>>
>>> Design Option #1: Live Migration
>>>
>>> Just like block data and ram, we make an initial pass over the data and
>>> then continue to re-transmit data as necessary when block data becomes
>>> dirtied again.
>>>
>>> This is a simple, bog-standard approach that mimics pretty closely how
>>> other systems are migrated.
>>>
>>> The series is here from November:
>>> https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg02717.html
>>>
>>> Most of the block-specific stuff has been reviewed, but it never got any
>>> reviews by the migration maintainers. It's reasonably rotted by this
>>> point, but it probably would not be a herculean effort to revive it.
>> After this week I will take a look at this series.
>>
>>> Design Option #2: "Postcopy" Migration
>>>
>>> https://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg02793.html
>>>
>>> The concept here is that incremental backup data can be treated simply
>>> as best-effort; if it is lost, it's not a big deal. We can reconstitute
>>> the data or simply start a new incremental backup sync point with a full
>>> backup.
>>>
>>> The idea then is that instead of the incremental live migration, we just
>>> wait to copy the bitmap until after the pivot and send it all at once.
>>> This is faster and a bit more efficient, and will scale pretty nicely to
>>> even quite large bitmaps.
>> How big is it?
>> And what is a normal rate of dirtying of that bitmap?
> 
> The default granularity for the bitmap is 64 KiB, so the bitmap size
> varies with disk size like this:
> 
> disk       bitmap
> 20 GiB     40 KiB
> 512 GiB    1 MiB
> 16 TiB     32 MiB
> 

And in general, smaller granularities don't make sense, so these values
are pragmatic worst-case. You probably would not use a 64KiB cluster
size for a 16TiB disk.

(We technically do allow 32KiB granularities but I can't see them being
useful for anyone. We allow all the way down to 512 bytes, but it's not
an anticipated use case. We try to match the qcow2 granularity of 64K
clusters where practical.)
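
For anyone who wants to check the numbers in the table, here is a minimal
sketch of the arithmetic, assuming one bit per granularity-sized chunk and
nothing else (no headers or metadata). The function name is made up for
illustration; it is not a QEMU API.

```python
KiB, MiB, GiB, TiB = 1024, 1024**2, 1024**3, 1024**4


def bitmap_size_bytes(disk_bytes: int, granularity_bytes: int = 64 * KiB) -> int:
    """Bytes needed to hold one bit per granularity-sized chunk of the disk."""
    bits = -(-disk_bytes // granularity_bytes)  # ceiling division
    return -(-bits // 8)                        # round up to whole bytes


for disk in (20 * GiB, 512 * GiB, 16 * TiB):
    print(disk // GiB, "GiB disk ->", bitmap_size_bytes(disk) // KiB, "KiB bitmap")
```

With the same assumption, a 16 TiB disk at the 512-byte minimum granularity
would need a 4 GiB bitmap, which is why I don't expect anyone to use it.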

> Significant note: we may have several bitmaps per disk (although, for
> now, in the normal case it should be one bitmap for incremental backup).
> 
> About rate:
> A bit in the bitmap becomes set when the corresponding granularity-sized
> chunk of the disk is changed, but once set it remains unchanged until
> the next incremental backup. So, although in the worst case the dirtying
> rate would be the same as for the disk (on a sequential write, for
> example, if the corresponding bits are clear in the bitmap), in general
> it should be lower, as consecutive writes to the same sectors of the
> disk would not dirty the bitmap again.
> 
>>
>>
>>> What I'd like from you: a broad acknowledgment of whether or not you
>>> feel the Postcopy solution here is tenable, so we know which solution to
>>> pursue. If we can get an ACK to one or the other method, we can
>>> exhaustively review it from our end before handing it back to you for a
>>> comprehensive migration review. We would like to see this feature hit
>>> 2.6 if possible as the designs have been on-list for quite some time.
>> To make a good evaluation, we need:
>> - how big those bitmaps normally are
>> - what a typical/worst-case dirty rate is
>>
>> I guess we can go from there.
>>
>> And you say that you don't care a lot about losing the bitmap.  What
>> does "not a big deal" mean here?
> 
> Loss of the dirty bitmap means that the next backup will be a full
> backup instead of an incremental one, i.e. it will require more time.
> So "not a big deal" means loss of time but not data loss.
> 
>>
>> Size here is also important: normally we have around 1-2MB of data for
>> the last stage.  If the size is much bigger than that, we will
>> really, really want it to be sent "live".
>>
>>
>> Regarding the second approach, is it possible for you to:
>>
>> - do a normal migration
>> - run on the destination with a new empty bitmap
>> - transfer the bitmap now
>> - xor the two bitmaps
> 
> Yes, this is exactly what my series does.
> 
>>
>> Or is this exactly what you are proposing in the second design?  If it
>> is, what is the error recovery if we lose the connection during the
>> transfer of the bitmap? Can you recover? (I guess this is the "not a
>> big deal" part.)
> 
> In principle the dirty bitmap may be recovered by comparing the backup
> version of the disk with the current one, but as connection loss is a
> rare case (I hope), a full backup is an appropriate solution too.
> 
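
To make the merge step and its failure mode concrete, here is a rough
sketch. Everything in it is an assumption on my part: the function name is
hypothetical, the bitmaps are modelled as plain Python integers, and the
merge uses a bitwise OR rather than the "xor" in the outline above, on the
reasoning that a chunk dirtied on either side has to stay dirty. The
fallback of marking every bit dirty is only one way to model "the next
backup becomes a full backup"; the series itself may handle this
differently.

```python
from typing import Optional


def merge_after_pivot(dest_bits: int, source_bits: Optional[int], nbits: int) -> int:
    """Combine the destination's fresh bitmap with the one sent by the source.

    dest_bits   -- bits set by guest writes on the destination since the pivot
    source_bits -- bitmap received from the source, or None if the transfer failed
    nbits       -- number of chunks tracked by the bitmap
    """
    if source_bits is None:
        # Assumption: with the old bitmap lost we cannot know what changed,
        # so treat every chunk as dirty (equivalent to a full backup).
        return (1 << nbits) - 1
    # A chunk is dirty if it was dirtied before OR after the pivot.
    return dest_bits | source_bits


merged = merge_after_pivot(0b1010, 0b0110, nbits=4)
print(bin(merged))  # 0b1110
```
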
> 
> Note: for now we consider only backup bitmaps, but bitmaps may in theory
> be used for other things and by other mechanisms (external ones, for
> example), and for those cases we can say nothing.
> Note2: other uses of bitmaps in QEMU for now:
>   1. anonymous child of a backup bitmap - if we someday implement
> migration of the backup process, we will need to migrate them, but loss
> of data will lead to the same full backup
>   2. block migration dirty tracking - I think we will never implement
> migration of migration)

Hahaha, I hope not.

>   3. mirror - I don't know.. the dirty bitmap may be recovered by
> comparison of destination and target, but I'm not sure that it would be
> appropriate.. However for now we are not considering mirror migration.
> Please correct me if I'm wrong.
> 

It's conceivable we may wish to allow a mirror-to-nbd operation to
persist across a migration. That's a long ways off, though.

>>
>> Does this make any sense?
>>
>> Later, Juan.
>>
>> PS.  It is clear by now that I don't understand how you do the backup!
> 
> In short:
> 
> full backup:
>     - back up the whole disk
>     - clear the dirty bitmap
> 
> incremental backup:
>     - back up only the sectors which are dirty (i.e., sectors
> corresponding to set bits in the dirty bitmap)
>     - clear the dirty bitmap
> 
> write to disk:
>     - set the corresponding bit(s) in the dirty bitmap
>     - write to disk
> 
> 
> 

Thanks for the writeup!
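
For readers skimming the archive, the three operations above can be modelled
in a few lines. This is a toy sketch, not the QEMU implementation: the class
and method names are invented, and the bitmap is a Python set of chunk
indices instead of a real bit array. It also shows why the dirtying rate is
usually lower than the disk write rate: rewriting an already-dirty chunk
changes nothing in the bitmap.

```python
class DirtyBitmap:
    """Toy model: tracks which granularity-sized chunks have been written."""

    def __init__(self, disk_bytes: int, granularity: int = 64 * 1024):
        self.granularity = granularity
        self.nbits = -(-disk_bytes // granularity)  # ceiling division
        self.dirty = set()                          # indices of set bits

    def mark_write(self, offset: int, length: int) -> int:
        """Set the bits covering a guest write; return how many were newly set."""
        if length <= 0:
            raise ValueError("length must be positive")
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        newly_set = 0
        for chunk in range(first, last + 1):
            if chunk not in self.dirty:
                self.dirty.add(chunk)
                newly_set += 1
        return newly_set

    def incremental_backup(self) -> list:
        """Return the dirty chunks to copy, then clear the bitmap."""
        chunks = sorted(self.dirty)
        self.dirty.clear()
        return chunks

    def full_backup(self) -> list:
        """Return every chunk, then clear the bitmap."""
        self.dirty.clear()
        return list(range(self.nbits))


bm = DirtyBitmap(disk_bytes=1024 * 1024)   # 16 chunks of 64 KiB
print(bm.mark_write(0, 4096))              # 1: chunk 0 becomes dirty
print(bm.mark_write(512, 4096))            # 0: chunk 0 was already dirty
print(bm.incremental_backup())             # [0]
```

If the bitmap is lost during migration, the only consequence in this model
is that the next call has to be full_backup() instead of
incremental_backup().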
