[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [PATCH v10 00/13] block: Incremental backup series (RFC)
From: |
John Snow |
Subject: |
[Qemu-devel] [PATCH v10 00/13] block: Incremental backup series (RFC) |
Date: |
Mon, 22 Dec 2014 20:12:09 -0500 |
This is v10 of the in-memory part of the incremental backup
feature.
There are some remaining issues for which I am requesting
commentary, please see below under the "RFC" section for the
lingering questions I am aware of as of this revision, the
broad overview below, and the detailed changelog.
Enough changes have been made that most Reviewed-By lines from
previous iterations have been removed. (Sorry!)
This series was originally authored by Fam Zheng;
his cover letter is included below.
~John Snow
=================================================================
This is the in memory part of the incremental backup feature.
With the added commands, we can create a bitmap on a block
backend, from which point of time all the writes are tracked by
the bitmap, marking sectors as dirty. Later, we call drive-backup
and pass the bitmap to it, to do an incremental backup.
See the last patch which adds some tests for this use case.
Fam
=================================================================
For convenience, this patchset is available on github:
https://github.com/jnsnow/qemu/commits/dbm-backup
v10 (Overview):
- I've included one of Vladimir's bitmap fixes as patch #1.
- QMP commands and transactions are now protected via
aio_context functions.
- QMP commands use "node-ref" as a parameter name now. Reasoning
is thus: Either a "device name" or a "node name" can be used to
reference a BDS, which is the key item we are actually acting
on with these bitmap commands. Thus, I refer to this unified
parameter as a "Node Reference," or "node-ref."
We could also argue for "backend-ref" or "device-ref" for the
reverse semantics: where we accept a unified parameter, but we
intend to resolve it to the BlockBackend instead of resolving
the parameter given to the BlockDriverState.
Or, we could use "reference" for both cases, and to match the
existing BlockdevRef command.
- All QMP commands added are now per-node via a unified parameter
name. Though backup only operates on "devices," you are free to
create bitmaps for any arbitrary node you wish. It is obviously
only useful for the root node, currently.
- Bitmap Sync modes (CONSUME, RESET) are removed. See below
(changelog, RFC questions) for more details.
- Code to recover the bitmap after a failure has been added,
but I have some major questions about drive_backup guarantees.
See RFC questions.
v10 (Detailed Changelog):
(1/13) New Patch:
- Included Vladimir Sementsov-Ogievskiy's patch that clarifies
the semantics of the bitmap primitives.
(3/13):
- Edited function names for consistency (Stefanha)
- Removed compile-time constants (Stefanha)
- Acquire aio_context for bitmap add/remove (Stefanha)
- Documented optional granularity for bitmap-add in
qmp-commands.hx file (Eric Blake)
- block_dirty_bitmap_lookup moved forward to this patch to allow
more re-use. (Stefanha)
- Fixed a problem where the block_dirty_bitmap_lookup didn't
always set an error if it returned NULL.
- Added an optional return BDS lookup parameter to
bdrv_dirty_bitmap_lookup.
- Renamed "device" to "node-ref" for block_dirty_bitmap_lookup,
adjusted calls to bdrv_lookup_bs() to reflect unified
parameter usage.
- qmp_block_dirty_bitmap_{add,remove} now reference arbitrary
node names via @node-ref.
(4/13):
- Default granularity and granularity getters are both
now uint64_t (Stefanha)
(5/13):
- Added documentation to warn about the necessity of updating
the hbitmap deep copy. (Stefanha)
(6/13)
- Renamed bdrv_reset_dirty_bitmap to bdrv_clear_dirty_bitmap
to be consistent with Vladimir's patches.
- Removed const qualifier for bdrv_copy_dirty_bitmap,
to accommodate patch 8.
(7/13) New Patch:
- Added an hbitmap_merge operation to join two bitmaps together.
(8/13) New Patch:
- Added bdrv_reclaim_dirty_bitmap() to allow us to "roll back" a
bitmap into the bitmap that it was spawned from. This will let
us maintain an accurate account of dirty sectors even after a
failure.
- This adds an "originator" pointer to the BdrvDirtyBitmap and
is why "const" was removed for copy.
(9/13):
- QMP semantics changes as outlined for Patch #3.
- Enable/Disable now protected by aio_context (Stefanha)
(10/13):
- Add coroutine_fn annotation to block backup helper. (Stefanha)
- Removed sync_bitmap_gran from BackupBlockJob, just use the
getter on the BdrvDirtyBitmap instead. (Stefanha)
- bdrv_dirty_iter_set was renamed to bdrv_set_dirty_iter.
- Calls to bdrv_reset_dirty_bitmap are modified to
bdrv_clear_dirty_bitmap to reflect the rename in patch #6.
- Bitmap usage modes (RESET vs. CONSUME) has been deleted, for
the reason of targeting a simpler core usage first before
targeting optimizations. CONSUME is particularly problematic
in the case of errors; so this mode is omitted for now.
- Adjusted error message to use MirrorSyncMode enum, not
(incorrectly) the BitmapSyncMode enum.
- In the event of a failure, the sync_bitmap is now merged back
into the original bitmap so that we do not lose any dirty
bitmap information needlessly.
(11/13):
- Changed block_dirty_bitmap_add_abort to use
qmp_block_dirty_bitmap_remove:
- Old code used bdrv_lookup_bs to acquire bitmap and bs
anyway, and ignored the failure case.
- New code relies on the same information, and with a NULL
errp pointer we will similarly ignore the failure case.
prepare() should have properly vetted this information
before this point anyway.
This new code is also now protected via aio_context through
the use of the QMP commands.
- More code re-use of block_dirty_bitmap_lookup to get both the
bs and bitmap pointers.
- aio_context protection and cleanup in new abort methods.
(13/13):
- Modified test for new parameter names.
v9:
- Edited commit message, for English embetterment (02/10)
- Rebased on top of stefanha/block-next (06,08/10)
- Adjusted error message and line length (07/10)
v8:
- Changed int64_t return for bdrv_dbm_calc_def_granularity to uint64_t (2/10)
- Updated commit message (2/10)
- Removed redundant check for null in device parameter (2/10)
- Removed comment cruft. (2/10)
- Removed redundant local_err propagation (several)
- Updated commit message (3/10)
- Fix HBitmap copy loop index (4/10)
- Remove redundant ternary (5/10)
- Shift up the block_dirty_bitmap_lookup function (6/10)
- Error messages cleanup (7/10)
- Add an assertion to explain the re-use of .prepare() for two transactions.
(8/10)
- Removed BDS argument from bitmap enable/disable helper; it was unused. (8/10)
v7: (highlights)
- First version being authored by jsnow
- Addressed most list feedback from V6, many small changes.
All feedback was either addressed on-list (as a wontfix) or patched.
- Replaced all error_set with error_setg
- Replaced all bdrv_find with bdrv_lookup_bs()
- Adjusted the max granularity to share a common function with
backup/mirror that attempts to "guess" a reasonable default.
It clamps between [4K,64K] currently.
- The BdrvDirtyBitmap object now counts granularity exclusively in
bytes to match its interface.
It leaves the sector granularity concerns to HBitmap.
- Reworked the backup loop to utilize the hbitmap iterator.
There are some extra concerns to handle arrhythmic cases where the
granularity of the bitmap does not match the backup cluster size.
This iteration works best when it does match, but it's not a
deal-breaker if it doesn't -- it just gets less efficient.
- Reworked the transactional functions so that abort() wouldn't "undo"
a redundant command. They now have been split into a prepare and a
commit function (with state) and do not provide an abort command.
- Added a block_dirty_bitmap_lookup(device, name, errp) function to
shorten a few of the commands added in this series, particularly
qmp_enable, qmp_disable, and the transaction preparations.
v6: Re-send of v5.
v5: Rebase to master.
v4: Last version tailored by Fam Zheng.
=================================================================
RFC Questions:
(1) Return values for QMP documentation.
Versions 1-9 indicate that a QMP command will return
"DeviceNotFound," but this seems to be related to the old
error_set implementation instead of the newer(?) error_setg.
How should docs/writing-qmp-commands.txt and this
documentation be updated accordingly?
(2) qmp_drive_backup coherency
Does qmp_drive_backup promise a /coherent/ backup?
This has implications for how we handle the failure mode
for sync bitmaps in block/backup; those implications are
examined below.
== MODE: CONSUME ==
The "CONSUME" BitmapUseMode would take the BdrvDirtyBitmap
and copy the sectors indicated by the bitmap, then discard
the bitmap. This means that from the time of the start of the
backup, nobody is keeping track of new writes.
In the failure case, this means that either:
(A) Backups are coherent: We cannot retry the operation,
because we simply do not know which bits may have been
altered from the time we started the backup.
The usefulness of the bitmap is lost on errors.
(B) Backups are not necessarily coherent: We can retry the
operation by just keeping this bitmap around and retrying
the operation. Additional bits may have been changed
since, but we are only interested in producing some
differential, without regard to coherency. The usefulness
of this is uncertain, because we do not have the needed
information to flush the remaining differential to
another image, so we can never produce a differential that
forms a coherent backing chain.
For the above reasons, I have omitted the CONSUME mode
from v10 of the patchset entirely.
== MODE: RESET ==
The "RESET" BitmapUseMode, now the only mode, makes a copy of
the BdrvDirtyBitmap for use during backup and clears the
original bitmap. This way we are tracking new writes during
backup.
In the failure case, this means that either for coherent or
incoherent snapshots, we should be able to take the list of
sectors we originally meant to copy and merge them back into
the list of sectors that have become modified since we began
copying them. In either the coherent or incoherent case, this
should be sufficient for maintaining data integrity and
allowing for retries on backup failure.
In the incoherent case, this will allow us to flush remaining
dirty clusters to disk to produce a coherent incremental
backup chain when we go to close the drive.
In this mode, a backup retry will then attempt to copy all
clusters modified since the last successful backup, because
modified sectors will accumulate in the bitmap between each
failure.
== EFFICIENCY ==
If backups are not guaranteed or expected to be coherent; we
may leave too many bits set. For example, say we copy
clusters [0 - 100]. Let's say that cluster 75 is one that is
marked dirty.
As we begin our backup, we make a copy of the original bitmap
and clear all bits in the original.
During our backup, as we copy cluster 25, a write occurs that
marks cluster 75 dirty in the original bitmap. Now it is
marked as dirty in both bitmaps.
If drive backups are not coherent, by the time we arrive at
cluster 75, it includes the newest data that marked the bit
dirty, but no bitmaps have cluster 75 marked clean.
Assuming cluster 75 is not written to again before the next
incremental backup is requested, when we go to do our backup
we will copy cluster 75 needlessly.
This scenario only happens if incremental backups are NOT
guaranteed to be self-consistent. If they are, there is no
problem. If they are not, the existing code can be optimized
by marking sectors as clean in the original bitmap as they
are copied out.
== What the heck? ==
The big RFC question here is "Are backups expected to be
self-consistent?" because depending on the answer, we may
need to treat bitmaps different during backup or upon error
after a backup.
Fam Zheng (8):
qapi: Add optional field "name" to block dirty bitmap
qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove
block: Introduce bdrv_dirty_bitmap_granularity()
hbitmap: Add hbitmap_copy
qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable
qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}
qmp: Add dirty bitmap 'enabled' field in query-block
qemu-iotests: Add tests for drive-backup sync=dirty-bitmap
John Snow (4):
block: Add bdrv_copy_dirty_bitmap and bdrv_clear_dirty_bitmap
hbitmap: add hbitmap_merge
block: add bdrv_reclaim_dirty_bitmap
qmp: Add support of "dirty-bitmap" sync mode for drive-backup
Vladimir Sementsov-Ogievskiy (1):
block: fix spoiling all dirty bitmaps by mirror and migration
block.c | 157 ++++++++++++++++++++++++--
block/backup.c | 119 ++++++++++++++++----
block/mirror.c | 27 +++--
blockdev.c | 254 +++++++++++++++++++++++++++++++++++++++++-
hmp.c | 3 +-
include/block/block.h | 25 ++++-
include/block/block_int.h | 2 +
include/qemu/hbitmap.h | 19 ++++
migration/block.c | 7 +-
qapi-schema.json | 5 +-
qapi/block-core.json | 101 ++++++++++++++++-
qmp-commands.hx | 68 ++++++++++-
tests/qemu-iotests/056 | 33 +++++-
tests/qemu-iotests/056.out | 4 +-
tests/qemu-iotests/iotests.py | 8 ++
util/hbitmap.c | 48 ++++++++
16 files changed, 814 insertions(+), 66 deletions(-)
--
1.9.3
- [Qemu-devel] [PATCH v10 00/13] block: Incremental backup series (RFC),
John Snow <=
- [Qemu-devel] [PATCH v10 01/13] block: fix spoiling all dirty bitmaps by mirror and migration, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 02/13] qapi: Add optional field "name" to block dirty bitmap, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 03/13] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 04/13] block: Introduce bdrv_dirty_bitmap_granularity(), John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 05/13] hbitmap: Add hbitmap_copy, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 07/13] hbitmap: add hbitmap_merge, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 06/13] block: Add bdrv_copy_dirty_bitmap and bdrv_clear_dirty_bitmap, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 10/13] qmp: Add support of "dirty-bitmap" sync mode for drive-backup, John Snow, 2014/12/22
- [Qemu-devel] [PATCH v10 08/13] block: add bdrv_reclaim_dirty_bitmap, John Snow, 2014/12/22