qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/2] qapi: Introduce blockdev-group-snapshot-


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH v2 1/2] qapi: Introduce blockdev-group-snapshot-sync command
Date: Mon, 27 Feb 2012 16:46:51 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20120209 Thunderbird/10.0.1

Am 26.02.2012 20:31, schrieb Jeff Cody:
> This is a QAPI/QMP only command to take a snapshot of a group of
> devices. This is similar to the blockdev-snapshot-sync command, except
> blockdev-group-snapshot-sync accepts a list devices, filenames, and
> formats.
> 
> It is attempted to keep the snapshot of the group atomic; if the
> creation or open of any of the new snapshots fails, then all of
> the new snapshots are abandoned, and the name of the snapshot image
> that failed is returned.  The failure case should not interrupt
> any operations.
> 
> Rather than use bdrv_close() along with a subsequent bdrv_open() to
> perform the pivot, the original image is never closed and the new
> image is placed 'in front' of the original image via manipulation
> of the BlockDriverState fields.  Thus, once the new snapshot image
> has been successfully created, there are no more failure points
> before pivoting to the new snapshot.
> 
> This allows the group of disks to remain consistent with each other,
> even across snapshot failures.
> 
> Signed-off-by: Jeff Cody <address@hidden>
> ---
>  block.c          |   47 ++++++++++++++++++++
>  block.h          |    1 +
>  blockdev.c       |  128 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  qapi-schema.json |   38 ++++++++++++++++
>  4 files changed, 214 insertions(+), 0 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 3621d11..0045ab1 100644
> --- a/block.c
> +++ b/block.c
> @@ -880,6 +880,53 @@ void bdrv_make_anon(BlockDriverState *bs)
>      bs->device_name[0] = '\0';
>  }
>  
> +/*
> + * Add new bs contents at the top of an image chain while the chain is live,
> + * while keeping required fields on the top layer.
> + *
> + * It is assumed that bs_new already points to an existing image,
> + * with the correct backing filename of top->backing_file
> + */
> +void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
> +{
> +    BlockDriverState tmp;
> +
> +    /* the new bs must not be in bdrv_states */
> +    bdrv_make_anon(bs_new);
> +
> +    tmp = *bs_new;
> +    tmp.backing_hd = bs_new;
> +
> +    /* there are some fields that need to stay on the top layer: */
> +
> +    /* dev info */
> +    tmp.dev_ops          = bs_top->dev_ops;
> +    tmp.dev_opaque       = bs_top->dev_opaque;
> +    tmp.dev              = bs_top->dev;
> +    tmp.buffer_alignment = bs_top->buffer_alignment;
> +    tmp.copy_on_read     = bs_top->copy_on_read;
> +
> +    /* i/o timing parameters */
> +    tmp.slice_time        = bs_top->slice_time;
> +    tmp.slice_start       = bs_top->slice_start;
> +    tmp.slice_end         = bs_top->slice_end;
> +    tmp.io_limits         = bs_top->io_limits;
> +    tmp.io_base           = bs_top->io_base;
> +    tmp.throttled_reqs    = bs_top->throttled_reqs;
> +    tmp.block_timer       = bs_top->block_timer;
> +    tmp.io_limits_enabled = bs_top->io_limits_enabled;

What about iostatus_enabled/iostatus? Possibly on_read/write_error and
geometry, too. (Maybe the correct answer is that you're doing the right
thing, but I wanted to bring it up)

> +
> +    /* keep the same entry in bdrv_states */
> +    pstrcpy(tmp.device_name, sizeof(tmp.device_name), bs_top->device_name);
> +    tmp.list = bs_top->list;
> +
> +    /* swap contents of the fixed new bs and the current top */
> +    *bs_new = *bs_top;
> +    *bs_top = tmp;
> +
> +    bdrv_detach_dev(bs_new, bs_new->dev);
> +}

The last line would actually deserve a comment /* clear the copied
fields in the new backing file */, which makes clear that there are
probably some more fields missing in this section.

> +
>  void bdrv_delete(BlockDriverState *bs)
>  {
>      assert(!bs->dev);
> diff --git a/block.h b/block.h
> index cae289b..190a780 100644
> --- a/block.h
> +++ b/block.h
> @@ -114,6 +114,7 @@ int bdrv_create(BlockDriver *drv, const char* filename,
>  int bdrv_create_file(const char* filename, QEMUOptionParameter *options);
>  BlockDriverState *bdrv_new(const char *device_name);
>  void bdrv_make_anon(BlockDriverState *bs);
> +void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top);
>  void bdrv_delete(BlockDriverState *bs);
>  int bdrv_parse_cache_flags(const char *mode, int *flags);
>  int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags);
> diff --git a/blockdev.c b/blockdev.c
> index 05e7c5e..560f7e8 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -714,6 +714,134 @@ void qmp_blockdev_snapshot_sync(const char *device, 
> const char *snapshot_file,
>      }
>  }
>  
> +
> +/* New and old BlockDriverState structs for group snapshots */
> +typedef struct BlkGroupSnapshotStates {
> +    BlockDriverState *old_bs;
> +    BlockDriverState *new_bs;
> +    bool is_open;
> +    QSIMPLEQ_ENTRY(BlkGroupSnapshotStates) entry;
> +} BlkGroupSnapshotStates;
> +
> +/*
> + * 'Atomic' group snapshots.  The snapshots are taken as a set, and if any 
> fail
> + *  then we do not pivot any of the devices in the group, and abandon the
> + *  snapshots
> + */
> +void qmp_blockdev_group_snapshot_sync(SnapshotDevList *dev_list,
> +                                                   Error **errp)

Too much indentation?

> +{
> +    int ret = 0;
> +    SnapshotDevList *dev_entry = dev_list;
> +    SnapshotDev *dev_info = NULL;
> +    BlkGroupSnapshotStates *states;
> +    BlockDriver *proto_drv;
> +    BlockDriver *drv;
> +    int flags;
> +    const char *format;
> +    const char *snapshot_file;
> +
> +    QSIMPLEQ_HEAD(snap_bdrv_states, BlkGroupSnapshotStates) snap_bdrv_states;
> +    QSIMPLEQ_INIT(&snap_bdrv_states);
> +
> +    /* We don't do anything in this loop that commits us to the snapshot */
> +    while (NULL != dev_entry) {
> +        dev_info = dev_entry->value;
> +        dev_entry = dev_entry->next;
> +
> +        states = g_malloc0(sizeof(BlkGroupSnapshotStates));
> +        states->is_open = false;
> +
> +        states->old_bs = bdrv_find(dev_info->device);
> +
> +        if (!states->old_bs) {
> +            error_set(errp, QERR_DEVICE_NOT_FOUND, dev_info->device);
> +            goto fail;
> +        }
> +
> +        if (bdrv_in_use(states->old_bs)) {
> +            error_set(errp, QERR_DEVICE_IN_USE, dev_info->device);
> +            goto fail;
> +        }
> +
> +        snapshot_file = dev_info->snapshot_file;
> +
> +        flags = states->old_bs->open_flags;
> +
> +        if (!dev_info->has_format) {
> +            format = "qcow2";
> +        } else {
> +            format = dev_info->format;
> +        }
> +
> +        drv = bdrv_find_format(format);
> +        if (!drv) {
> +            error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
> +            goto fail;
> +        }
> +
> +        proto_drv = bdrv_find_protocol(snapshot_file);
> +        if (!proto_drv) {
> +            error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
> +            goto fail;
> +        }
> +
> +        /* create new image w/backing file */
> +        ret = bdrv_img_create(snapshot_file, format,
> +                              states->old_bs->filename,
> +                              drv->format_name, NULL, -1, flags);
> +        if (ret) {
> +            error_set(errp, QERR_OPEN_FILE_FAILED, snapshot_file);
> +            goto fail;
> +        }
> +
> +        /* We will manually add the backing_hd field to the bs later */
> +        states->new_bs = bdrv_new("");
> +        ret = bdrv_open(states->new_bs, snapshot_file,
> +                        flags | BDRV_O_NO_BACKING, drv);
> +        states->is_open = true;
> +
> +        if (ret != 0) {
> +            error_set(errp, QERR_OPEN_FILE_FAILED, snapshot_file);
> +            goto close_and_fail;
> +        }
> +        QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, states, entry);

Inserting it only here means that all the error cases above leak states.

> +    }
> +
> +    /*
> +     * Now we will drain, flush, & pivot everything - we are committed at 
> this
> +     * point.
> +     */
> +    bdrv_drain_all();

I would feel more comfortable if we could do the bdrv_drain_all() at the
very beginning of the function. Not that I know of a specific scenario
that would go wrong, but you have a nested main loop here that could do
more or less anything.

> +    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
> +        bdrv_flush(states->old_bs);

This can return an error which must be checked. And of course, we must
do it before committing to the snapshot (but after bdrv_drain_all).

> +        /* This removes our old bs from the bdrv_states, and adds the new bs 
> */
> +        bdrv_append(states->new_bs, states->old_bs);
> +    }
> +
> +    /* success */
> +    goto exit;
> +
> +close_and_fail:
> +    /*
> +    * failure, and it is all-or-none; abandon each new bs, and keep using
> +    * the original bs for all images
> +    */
> +    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
> +        if (states->is_open) {
> +             bdrv_close(states->new_bs);

bdrv_delete, as Stefan already mentioned.

> +        }
> +    }
> +fail:
> +exit:
> +    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
> +        g_free(states);
> +    }
> +    return;
> +}

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]