qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss


From: Kevin Wolf
Subject: Re: [Qemu-block] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error
Date: Tue, 19 Apr 2016 14:27:56 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

Am 19.04.2016 um 14:07 hat Jeff Cody geschrieben:
> Upon receiving an I/O error after an fsync, by default gluster will
> dump its cache.  However, QEMU will retry the fsync, which is especially
> useful when encountering errors such as ENOSPC when using the werror=stop
> option.  When using caching with gluster, however, the last written data
> will be lost upon encountering ENOSPC.  Using the write-behind-cache
> xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
> to retain the cached data after a failed fsync, so that ENOSPC and other
> transient errors are recoverable.
> 
> Unfortunately, we have no way of knowing if the
> 'resync-failed-syncs-after-fsync' xlator option is supported, so for now
> close the fd and set the BDS driver to NULL upon fsync error.
> 
> Signed-off-by: Jeff Cody <address@hidden>
> ---
>  block/gluster.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  configure       |  8 ++++++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index d9aace6..ba33488 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  
> QDict *options,
>          goto out;
>      }
>  
> +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> +    /* Without this, if fsync fails for a recoverable reason (for instance,
> +     * ENOSPC), gluster will dump its cache, preventing retries.  This means
> +     * almost certain data loss.  Not all gluster versions support the
> +     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> +     * discover during runtime if it is supported (this api returns success 
> for
> +     * unknown key/value pairs) */
> +    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
> +                                          "resync-failed-syncs-after-fsync",
> +                                          "on");
> +    if (ret < 0) {
> +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> +        ret = -errno;
> +        goto out;
> +    }
> +#endif
> +
>      qemu_gluster_parse_flags(bdrv_flags, &open_flags);
>  
>      s->fd = glfs_open(s->glfs, gconf->image, open_flags);
> @@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
> *state,
>          goto exit;
>      }
>  
> +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> +    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
> +                                 "resync-failed-syncs-after-fsync", "on");
> +    if (ret < 0) {
> +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> +        ret = -errno;
> +        goto exit;
> +    }
> +#endif
> +
>      reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
>      if (reop_s->fd == NULL) {
>          /* reops->glfs will be cleaned up in _abort */
> @@ -613,6 +640,21 @@ static coroutine_fn int 
> qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
>  
>      ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
>      if (ret < 0) {
> +        /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain its
> +         * cache after a fsync failure, so we have no way of allowing the 
> guest
> +         * to safely continue.  Gluster versions prior to 3.5.6 don't retain
> +         * the cache either, but will invalidate the fd on error, so this is
> +         * again our only option.
> +         *
> +         * The 'resync-failed-syncs-after-fsync' xlator option for the
> +         * write-behind cache will cause later gluster versions to retain
> +         * its cache after error, so long as the fd remains open.  However,
> +         * we currently have no way of knowing if this option is supported.
> +         *
> +         * TODO: Once gluster provides a way for us to determine if the 
> option
> +         *       is supported, bypass the closure and setting drv to NULL.  
> */
> +        qemu_gluster_close(bs);
> +        bs->drv = NULL;
>          return -errno;
>      }

More context:

        qemu_coroutine_yield();
        return acb.ret;
    }

I would guess that acb.ret containing an error is the more common case.
We should probably invalidate the BDS in both cases (immediate failure
and callback with error code).

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]