
Re: [Qemu-devel] Migration design planning


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-devel] Migration design planning
Date: Tue, 1 Mar 2016 18:24:38 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0

On 01.03.2016 16:47, Juan Quintela wrote:
John Snow <address@hidden> wrote:
Hi Juan;
We need your assistance in reviewing two competing designs for migrating
some block data so we can move forward with the feature.

First, some background:

What: Block Dirty Bitmaps. They are simple primitives that keep track of
which clusters have been written to since the last incremental backup.

Why: They are in-RAM primitives that do not get migrated as-is alongside
block data; they need to be migrated specially. We want to migrate them
so that the "incremental backup" feature is available after a migration.

How: There are two competing designs, see below.


Design Option #1: Live Migration

Just like block data and ram, we make an initial pass over the data and
then continue to re-transmit data as necessary when block data becomes
dirtied again.

This is a simple, bog-standard approach that mimics pretty closely how
other systems are migrated.

The series is here from November:
https://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg02717.html

Most of the block-specific stuff has been reviewed, but it never got any
reviews by the migration maintainers. It's reasonably rotted by this
point, but it probably would not be a herculean effort to revive it.
After this week I will take a look at this series.

Design Option #2: "Postcopy" Migration

https://lists.nongnu.org/archive/html/qemu-devel/2016-02/msg02793.html

The concept here is that incremental backup data can be treated simply
as best-effort; if it is lost, it's not a big deal. We can reconstitute
the data or simply start a new incremental backup sync point with a full
backup.

The idea then is that instead of the incremental live migration, we just
wait to copy the bitmap until after the pivot and send it all at once.
This is faster and a bit more efficient, and will scale pretty nicely to
even quite large bitmaps.
How big is it?
And what is a normal rate of dirtying of that bitmap?

Default granularity for the bitmap is 64KB, so for different disk sizes the bitmap size would vary like this (a quick worked calculation follows the table):

disk      bitmap
20GB      40KB
512GB     1MB
16TB      32MB
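
Just to show where those numbers come from (one bit per 64KB cluster, eight bits per byte), here is a rough back-of-the-envelope calculation in Python; it is pure arithmetic, nothing QEMU-specific is assumed:

GRANULARITY = 64 * 1024          # default bitmap granularity: 64KB

def bitmap_size(disk_bytes, granularity=GRANULARITY):
    clusters = -(-disk_bytes // granularity)   # ceil(disk_size / granularity)
    return -(-clusters // 8)                   # one bit per cluster, 8 bits per byte

for label, disk_bytes in [("20GB", 20 << 30), ("512GB", 512 << 30), ("16TB", 16 << 40)]:
    print(label, bitmap_size(disk_bytes) // 1024, "KB")
# 20GB -> 40 KB, 512GB -> 1024 KB (1MB), 16TB -> 32768 KB (32MB)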

Significant note: we may have several bitmaps per disk, although for now, in the normal case, it should be one bitmap per disk for incremental backup.

About rate:
A bit in the bitmap becomes set when the corresponding granularity-sized chunk of the disk is changed, but once set it stays set until the next incremental backup. So, although in the worst case the dirtying rate would be the same as for the disk (on a sequential write, for example, when the corresponding bits are still clear in the bitmap), in general it should be lower, as consecutive writes to the same sectors of the disk do not dirty the bitmap again.
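
To make that set-once-per-backup-cycle behaviour concrete, here is a toy sketch in Python (an illustration only, not QEMU's actual dirty bitmap code; the names are made up):

class DirtyBitmap:
    """Toy model: one bit per granularity-sized chunk; a bit stays set
    until the next backup clears it, so rewriting the same chunk
    dirties the bitmap at most once per backup cycle."""

    def __init__(self, disk_bytes, granularity=64 * 1024):
        self.granularity = granularity
        nbits = -(-disk_bytes // granularity)
        self.bits = bytearray(-(-nbits // 8))

    def mark_write(self, offset, length):
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        for chunk in range(first, last + 1):
            self.bits[chunk >> 3] |= 1 << (chunk & 7)

    def clear(self):
        """Called when a (full or incremental) backup completes."""
        for i in range(len(self.bits)):
            self.bits[i] = 0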



What I'd like from you: a broad acknowledgment of whether or not you
feel the Postcopy solution here is tenable, so we know which solution to
pursue. If we can get an ACK to one or the other method, we can
exhaustively review it from our end before handing it back to you for a
comprehensive migration review. We would like to see this feature hit
2.6 if possible as the designs have been on-list for quite some time.
To make a good evaluation, we need:
- how big those bitmaps normally are
- what a typical/worst-case dirty rate is

I guess we can go from there.

And you say that you don't care a lot about losing the bitmap. What does "not a big deal" mean here?

Loss of the dirty bitmap means that the next backup will be a full backup instead of an incremental one, i.e. it will take more time. So "not a big deal" means a loss of time, not a loss of data.


Size here is also important: normally we have around 1-2MB of data for
the last stage. If the size is much bigger than that amount, we will
really, really want it to be sent "live".


Wondering about the second approach, would it be possible for you to:

- do a normal migration
- run on the destination with a new empty bitmap
- transfer the bitmap now
- xor the two bitmaps

Yes, this is exactly what my series does.
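
For illustration, the merge step could look roughly like this; I write it as a bitwise OR (the union of pre-pivot and post-pivot dirt), since a literal XOR would drop chunks written on both sides of the pivot:

def merge_bitmaps(transferred, post_pivot):
    """Combine the bitmap sent from the source (dirt accumulated up to
    the pivot) with the fresh bitmap the destination has been filling
    since it started running: a chunk is dirty if it was written on
    either side of the pivot."""
    assert len(transferred) == len(post_pivot)
    return bytes(a | b for a, b in zip(transferred, post_pivot))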


Or is this exactly what you are proposing in the second example? If it is, what is the error recovery if we lose the connection during the transfer of the bitmap? Can you recover? (I guess this is the "not a big deal" part.)

In principle, the dirty bitmap could be recovered by comparing the backup version of the disk with the current one, but as connection loss is a rare case (I hope), falling back to a full backup is an appropriate solution too.
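
As a sketch of that recovery-by-comparison idea (raw image files and a fixed 64KB granularity are assumed here purely for illustration; a real recovery would go through the block layer, not plain files):

import os

def rebuild_dirty_chunks(backup_path, current_path, granularity=64 * 1024):
    """Diff the last backup against the live image: any chunk whose
    contents differ must have been written since that backup, so it is
    dirty.  Returns the set of dirty chunk indices, i.e. the set bits
    of the reconstructed bitmap."""
    dirty = set()
    size = os.path.getsize(current_path)
    with open(backup_path, "rb") as old, open(current_path, "rb") as new:
        for chunk in range(-(-size // granularity)):
            if old.read(granularity) != new.read(granularity):
                dirty.add(chunk)
    return dirty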


Note: for now we consider only backup bitmaps, but bitmaps may theoretically be used for other things and by other mechanisms (external ones, for example), and for those cases we can say nothing.
Note 2: other current uses of bitmaps in QEMU:
1. The anonymous child of the backup bitmap: if we someday implement migration of the backup process, we will need to migrate it too, but loss of its data will lead to the same full backup.
2. Block-migration dirty tracking: I think we will never implement migration of migration :)
3. Mirror: I don't know... the dirty bitmap could be recovered by comparing destination and target, but I'm not sure that would be appropriate. However, for now we are not considering mirror migration.
Please correct me if I'm wrong.


Does this make any sense?

Later, Juan.

PS.  It is clear by now that I don't understand how you do the backup!

in short (a toy model in code follows after the lists):

full backup:
    - back up the whole disk
    - clear the dirty bitmap

incremental backup:
    - back up only the sectors which are dirty (i.e., sectors corresponding to set bits in the dirty bitmap)
    - clear the dirty bitmap

write on disk:
    - set the corresponding bit(s) in the dirty bitmap
    - write on disk
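
And a toy model of that cycle in Python, just to tie the three operations together (an illustration of the bookkeeping, not QEMU code):

class ToyBackupCycle:
    """A write sets the corresponding bit, a full backup copies every
    chunk, an incremental backup copies only the chunks whose bits are
    set, and both kinds of backup clear the bitmap afterwards."""

    def __init__(self, nchunks):
        self.disk = [b"\0"] * nchunks   # chunk payloads
        self.dirty = set()              # indices of set bits in the dirty bitmap

    def write(self, chunk, data):
        self.dirty.add(chunk)           # set the corresponding bit
        self.disk[chunk] = data         # then write on disk

    def full_backup(self):
        copy = list(self.disk)          # back up the whole disk
        self.dirty.clear()              # clear the dirty bitmap
        return copy

    def incremental_backup(self, previous_backup):
        copy = list(previous_backup)
        for chunk in self.dirty:        # back up only the dirty chunks
            copy[chunk] = self.disk[chunk]
        self.dirty.clear()              # clear the dirty bitmap
        return copy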



--
Best regards,
Vladimir



