qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [PATCH v10 00/13] block: Incremental backup series (RFC)


From: John Snow
Subject: [Qemu-devel] [PATCH v10 00/13] block: Incremental backup series (RFC)
Date: Mon, 22 Dec 2014 20:12:09 -0500

This is v10 of the in-memory part of the incremental backup
feature.

There are some remaining issues for which I am requesting
commentary, please see below under the "RFC" section for the
lingering questions I am aware of as of this revision, the
broad overview below, and the detailed changelog.

Enough changes have been made that most Reviewed-By lines from
previous iterations have been removed. (Sorry!)

This series was originally authored by Fam Zheng;
his cover letter is included below.

~John Snow

=================================================================

This is the in memory part of the incremental backup feature.

With the added commands, we can create a bitmap on a block
backend, from which point of time all the writes are tracked by
the bitmap, marking sectors as dirty. Later, we call drive-backup
and pass the bitmap to it, to do an incremental backup.

See the last patch which adds some tests for this use case.

Fam

=================================================================

For convenience, this patchset is available on github:
                https://github.com/jnsnow/qemu/commits/dbm-backup

v10 (Overview):

 - I've included one of Vladimir's bitmap fixes as patch #1.

 - QMP commands and transactions are now protected via
   aio_context functions.

 - QMP commands use "node-ref" as a parameter name now. Reasoning
   is thus: Either a "device name" or a "node name" can be used to
   reference a BDS, which is the key item we are actually acting
   on with these bitmap commands. Thus, I refer to this unified
   parameter as a "Node Reference," or "node-ref."

   We could also argue for "backend-ref" or "device-ref" for the
   reverse semantics: where we accept a unified parameter, but we
   intend to resolve it to the BlockBackend instead of resolving
   the parameter given to the BlockDriverState.

   Or, we could use "reference" for both cases, and to match the
   existing BlockdevRef command.

 - All QMP commands added are now per-node via a unified parameter
   name. Though backup only operates on "devices," you are free to
   create bitmaps for any arbitrary node you wish. It is obviously
   only useful for the root node, currently.

 - Bitmap Sync modes (CONSUME, RESET) are removed. See below
   (changelog, RFC questions) for more details.

 - Code to recover the bitmap after a failure has been added,
   but I have some major questions about drive_backup guarantees.
   See RFC questions.

v10 (Detailed Changelog):

   (1/13) New Patch:
 - Included Vladimir Sementsov-Ogievskiy's patch that clarifies
   the semantics of the bitmap primitives.

   (3/13):
 - Edited function names for consistency (Stefanha)
 - Removed compile-time constants (Stefanha)
 - Acquire aio_context for bitmap add/remove (Stefanha)
 - Documented optional granularity for bitmap-add in
   qmp-commands.hx file (Eric Blake)
 - block_dirty_bitmap_lookup moved forward to this patch to allow
   more re-use. (Stefanha)
 - Fixed a problem where the block_dirty_bitmap_lookup didn't
   always set an error if it returned NULL.
 - Added an optional return BDS lookup parameter to
   bdrv_dirty_bitmap_lookup.
 - Renamed "device" to "node-ref" for block_dirty_bitmap_lookup,
   adjusted calls to bdrv_lookup_bs() to reflect unified
   parameter usage.
 - qmp_block_dirty_bitmap_{add,remove} now reference arbitrary
   node names via @node-ref.

   (4/13):
 - Default granularity and granularity getters are both
   now uint64_t (Stefanha)

   (5/13):
 - Added documentation to warn about the necessity of updating
   the hbitmap deep copy. (Stefanha)

   (6/13)
 - Renamed bdrv_reset_dirty_bitmap to bdrv_clear_dirty_bitmap
   to be consistent with Vladimir's patches.
 - Removed const qualifier for bdrv_copy_dirty_bitmap,
   to accommodate patch 8.

   (7/13) New Patch:
 - Added an hbitmap_merge operation to join two bitmaps together.

   (8/13) New Patch:
 - Added bdrv_reclaim_dirty_bitmap() to allow us to "roll back" a
   bitmap into the bitmap that it was spawned from. This will let
   us maintain an accurate account of dirty sectors even after a
   failure.
 - This adds an "originator" pointer to the BdrvDirtyBitmap and
   is why "const" was removed for copy.

   (9/13):
 - QMP semantics changes as outlined for Patch #3.
 - Enable/Disable now protected by aio_context (Stefanha)

   (10/13):
 - Add coroutine_fn annotation to block backup helper. (Stefanha)
 - Removed sync_bitmap_gran from BackupBlockJob, just use the
   getter on the BdrvDirtyBitmap instead. (Stefanha)
 - bdrv_dirty_iter_set was renamed to bdrv_set_dirty_iter.
 - Calls to bdrv_reset_dirty_bitmap are modified to
   bdrv_clear_dirty_bitmap to reflect the rename in patch #6.
 - Bitmap usage modes (RESET vs. CONSUME) has been deleted, for
   the reason of targeting a simpler core usage first before
   targeting optimizations. CONSUME is particularly problematic
   in the case of errors; so this mode is omitted for now.
 - Adjusted error message to use MirrorSyncMode enum, not
   (incorrectly) the BitmapSyncMode enum.
 - In the event of a failure, the sync_bitmap is now merged back
   into the original bitmap so that we do not lose any dirty
   bitmap information needlessly.

   (11/13):
 - Changed block_dirty_bitmap_add_abort to use
   qmp_block_dirty_bitmap_remove:
    - Old code used bdrv_lookup_bs to acquire bitmap and bs
      anyway, and ignored the failure case.
    - New code relies on the same information, and with a NULL
      errp pointer we will similarly ignore the failure case.
      prepare() should have properly vetted this information
      before this point anyway.
      This new code is also now protected via aio_context through
      the use of the QMP commands.
 - More code re-use of block_dirty_bitmap_lookup to get both the
   bs and bitmap pointers.
 - aio_context protection and cleanup in new abort methods.

   (13/13):
 - Modified test for new parameter names.

v9:
 - Edited commit message, for English embetterment (02/10)
 - Rebased on top of stefanha/block-next (06,08/10)
 - Adjusted error message and line length (07/10)

v8:
 - Changed int64_t return for bdrv_dbm_calc_def_granularity to uint64_t (2/10)
 - Updated commit message (2/10)
 - Removed redundant check for null in device parameter (2/10)
 - Removed comment cruft. (2/10)
 - Removed redundant local_err propagation (several)
 - Updated commit message (3/10)
 - Fix HBitmap copy loop index (4/10)
 - Remove redundant ternary (5/10)
 - Shift up the block_dirty_bitmap_lookup function (6/10)
 - Error messages cleanup (7/10)
 - Add an assertion to explain the re-use of .prepare() for two transactions.
   (8/10)
 - Removed BDS argument from bitmap enable/disable helper; it was unused. (8/10)

v7: (highlights)
 - First version being authored by jsnow
 - Addressed most list feedback from V6, many small changes.
   All feedback was either addressed on-list (as a wontfix) or patched.
 - Replaced all error_set with error_setg
 - Replaced all bdrv_find with bdrv_lookup_bs()
 - Adjusted the max granularity to share a common function with
   backup/mirror that attempts to "guess" a reasonable default.
   It clamps between [4K,64K] currently.
 - The BdrvDirtyBitmap object now counts granularity exclusively in
   bytes to match its interface.
   It leaves the sector granularity concerns to HBitmap.
 - Reworked the backup loop to utilize the hbitmap iterator.
   There are some extra concerns to handle arrhythmic cases where the
   granularity of the bitmap does not match the backup cluster size.
   This iteration works best when it does match, but it's not a
   deal-breaker if it doesn't -- it just gets less efficient.
 - Reworked the transactional functions so that abort() wouldn't "undo"
   a redundant command. They now have been split into a prepare and a
   commit function (with state) and do not provide an abort command.
 - Added a block_dirty_bitmap_lookup(device, name, errp) function to
   shorten a few of the commands added in this series, particularly
   qmp_enable, qmp_disable, and the transaction preparations.

v6: Re-send of v5.

v5: Rebase to master.

v4: Last version tailored by Fam Zheng.

=================================================================

RFC Questions:

(1) Return values for QMP documentation.

    Versions 1-9 indicate that a QMP command will return
    "DeviceNotFound," but this seems to be related to the old
    error_set implementation instead of the newer(?) error_setg.
    How should docs/writing-qmp-commands.txt and this
    documentation be updated accordingly?

(2) qmp_drive_backup coherency

    Does qmp_drive_backup promise a /coherent/ backup?
    This has implications for how we handle the failure mode
    for sync bitmaps in block/backup; those implications are
    examined below.

    == MODE: CONSUME ==

    The "CONSUME" BitmapUseMode would take the BdrvDirtyBitmap
    and copy the sectors indicated by the bitmap, then discard
    the bitmap. This means that from the time of the start of the
    backup, nobody is keeping track of new writes.

    In the failure case, this means that either:
    (A) Backups are coherent: We cannot retry the operation,
        because we simply do not know which bits may have been
        altered from the time we started the backup.
        The usefulness of the bitmap is lost on errors.
    (B) Backups are not necessarily coherent: We can retry the
        operation by just keeping this bitmap around and retrying
        the operation. Additional bits may have been changed
        since, but we are only interested in producing some
        differential, without regard to coherency. The usefulness
        of this is uncertain, because we do not have the needed
        information to flush the remaining differential to
        another image, so we can never produce a differential that
        forms a coherent backing chain.

    For the above reasons, I have omitted the CONSUME mode
    from v10 of the patchset entirely.

    == MODE: RESET ==

    The "RESET" BitmapUseMode, now the only mode, makes a copy of
    the BdrvDirtyBitmap for use during backup and clears the
    original bitmap. This way we are tracking new writes during
    backup.

    In the failure case, this means that either for coherent or
    incoherent snapshots, we should be able to take the list of
    sectors we originally meant to copy and merge them back into
    the list of sectors that have become modified since we began
    copying them. In either the coherent or incoherent case, this
    should be sufficient for maintaining data integrity and
    allowing for retries on backup failure.

    In the incoherent case, this will allow us to flush remaining
    dirty clusters to disk to produce a coherent incremental
    backup chain when we go to close the drive.

    In this mode, a backup retry will then attempt to copy all
    clusters modified since the last successful backup, because
    modified sectors will accumulate in the bitmap between each
    failure.

    == EFFICIENCY ==

    If backups are not guaranteed or expected to be coherent; we
    may leave too many bits set. For example, say we copy
    clusters [0 - 100]. Let's say that cluster 75 is one that is
    marked dirty.

    As we begin our backup, we make a copy of the original bitmap
    and clear all bits in the original.

    During our backup, as we copy cluster 25, a write occurs that
    marks cluster 75 dirty in the original bitmap. Now it is
    marked as dirty in both bitmaps.

    If drive backups are not coherent, by the time we arrive at
    cluster 75, it includes the newest data that marked the bit
    dirty, but no bitmaps have cluster 75 marked clean.

    Assuming cluster 75 is not written to again before the next
    incremental backup is requested, when we go to do our backup
    we will copy cluster 75 needlessly.

    This scenario only happens if incremental backups are NOT
    guaranteed to be self-consistent. If they are, there is no
    problem. If they are not, the existing code can be optimized
    by marking sectors as clean in the original bitmap as they
    are copied out.

    == What the heck? ==

    The big RFC question here is "Are backups expected to be
    self-consistent?" because depending on the answer, we may
    need to treat bitmaps different during backup or upon error
    after a backup.

Fam Zheng (8):
  qapi: Add optional field "name" to block dirty bitmap
  qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove
  block: Introduce bdrv_dirty_bitmap_granularity()
  hbitmap: Add hbitmap_copy
  qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable
  qapi: Add transaction support to block-dirty-bitmap-{add, enable,
    disable}
  qmp: Add dirty bitmap 'enabled' field in query-block
  qemu-iotests: Add tests for drive-backup sync=dirty-bitmap

John Snow (4):
  block: Add bdrv_copy_dirty_bitmap and bdrv_clear_dirty_bitmap
  hbitmap: add hbitmap_merge
  block: add bdrv_reclaim_dirty_bitmap
  qmp: Add support of "dirty-bitmap" sync mode for drive-backup

Vladimir Sementsov-Ogievskiy (1):
  block: fix spoiling all dirty bitmaps by mirror and migration

 block.c                       | 157 ++++++++++++++++++++++++--
 block/backup.c                | 119 ++++++++++++++++----
 block/mirror.c                |  27 +++--
 blockdev.c                    | 254 +++++++++++++++++++++++++++++++++++++++++-
 hmp.c                         |   3 +-
 include/block/block.h         |  25 ++++-
 include/block/block_int.h     |   2 +
 include/qemu/hbitmap.h        |  19 ++++
 migration/block.c             |   7 +-
 qapi-schema.json              |   5 +-
 qapi/block-core.json          | 101 ++++++++++++++++-
 qmp-commands.hx               |  68 ++++++++++-
 tests/qemu-iotests/056        |  33 +++++-
 tests/qemu-iotests/056.out    |   4 +-
 tests/qemu-iotests/iotests.py |   8 ++
 util/hbitmap.c                |  48 ++++++++
 16 files changed, 814 insertions(+), 66 deletions(-)

-- 
1.9.3




reply via email to

[Prev in Thread] Current Thread [Next in Thread]