qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCHv4] block: optimize zero writes with bdrv_write_z


From: Peter Lieven
Subject: Re: [Qemu-devel] [PATCHv4] block: optimize zero writes with bdrv_write_zeroes
Date: Thu, 15 May 2014 23:20:25 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

Am 15.05.2014 11:54, schrieb Kevin Wolf:
> Am 15.05.2014 um 07:16 hat Peter Lieven geschrieben:
>> Am 14.05.2014 13:41, schrieb Kevin Wolf:
>>> Am 08.05.2014 um 18:22 hat Peter Lieven geschrieben:
>>>> this patch tries to optimize zero write requests
>>>> by automatically using bdrv_write_zeroes if it is
>>>> supported by the format.
>>>>
>>>> This significantly speeds up file system initialization and
>>>> should speed zero write test used to test backend storage
>>>> performance.
>>>>
>>>> I ran the following 2 tests on my internal SSD with a
>>>> 50G QCOW2 container and on an attached iSCSI storage.
>>>>
>>>> a) mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vdX
>>>>
>>>> QCOW2         [off]     [on]     [unmap]
>>>> -----
>>>> runtime:       14secs    1.1secs  1.1secs
>>>> filesize:      937M      18M      18M
>>>>
>>>> iSCSI         [off]     [on]     [unmap]
>>>> ----
>>>> runtime:       9.3s      0.9s     0.9s
>>>>
>>>> b) dd if=/dev/zero of=/dev/vdX bs=1M oflag=direct
>>>>
>>>> QCOW2         [off]     [on]     [unmap]
>>>> -----
>>>> runtime:       246secs   18secs   18secs
>>>> filesize:      51G       192K     192K
>>>> throughput:    203M/s    2.3G/s   2.3G/s
>>>>
>>>> iSCSI*        [off]     [on]     [unmap]
>>>> ----
>>>> runtime:       8mins     45secs   33secs
>>>> throughput:    106M/s    1.2G/s   1.6G/s
>>>> allocated:     100%      100%     0%
>>>>
>>>> * The storage was connected via an 1Gbit interface.
>>>>   It seems to internally handle writing zeroes
>>>>   via WRITESAME16 very fast.
>>>>
>>>> Signed-off-by: Peter Lieven <address@hidden>
>>>> ---
>>>> v3->v4: - use QAPI generated enum and lookup table [Kevin]
>>>>         - added more details about the options in the comments
>>>>           of the qapi-schema [Eric]
>>>>         - changed the type of detect_zeroes from str to
>>>>           BlockdevDetectZeroesOptions. I left the name
>>>>           as is because it is consistent with e.g.
>>>>           BlockdevDiscardOptions or BlockdevAioOptions [Eric]
>>>>         - changed the parse function in blockdev_init to
>>>>           be generic usable for other enum parameters
>>>>         
>>>> v2->v3: - moved parameter parsing to blockdev_init
>>>>         - added per device detect_zeroes status to 
>>>>           hmp (info block -v) and qmp (query-block) [Eric]
>>>>         - added support to enable detect-zeroes also
>>>>           for hot added devices [Eric]
>>>>         - added missing entry to qemu_common_drive_opts
>>>>         - fixed description of qemu_iovec_is_zero [Fam]
>>>>
>>>> v1->v2: - added tests to commit message (Markus)
>>>> RFCv2->v1: - fixed paramter parsing strncmp -> strcmp (Eric)
>>>>            - fixed typo (choosen->chosen) (Eric)
>>>>            - added second example to commit msg
>>>>
>>>> RFCv1->RFCv2: - add detect-zeroes=off|on|unmap knob to drive cmdline 
>>>> parameter
>>>>               - call zero detection only for format (bs->file != NULL)
>>>>
>>>>  block.c                   |   11 ++++++++++
>>>>  block/qapi.c              |    1 +
>>>>  blockdev.c                |   34 +++++++++++++++++++++++++++++
>>>>  hmp.c                     |    5 +++++
>>>>  include/block/block_int.h |    1 +
>>>>  include/qemu-common.h     |    1 +
>>>>  qapi-schema.json          |   52 
>>>> ++++++++++++++++++++++++++++++++-------------
>>>>  qemu-options.hx           |    6 ++++++
>>>>  qmp-commands.hx           |    3 +++
>>>>  util/iov.c                |   21 ++++++++++++++++++
>>>>  10 files changed, 120 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index b749d31..aea4c77 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -3244,6 +3244,17 @@ static int coroutine_fn 
>>>> bdrv_aligned_pwritev(BlockDriverState *bs,
>>>>  
>>>>      ret = notifier_with_return_list_notify(&bs->before_write_notifiers, 
>>>> req);
>>>>  
>>>> +    if (!ret && bs->detect_zeroes != BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF &&
>>>> +        !(flags & BDRV_REQ_ZERO_WRITE) && bs->file &&
>>>> +        drv->bdrv_co_write_zeroes && qemu_iovec_is_zero(qiov)) {
>>> Pretty long condition. :-)
>>>
>>> Looks like most is obviously needed, but I wonder what the bs->file part
>>> is good for? That looks rather arbitrary.
>> What I wanted to achieve is that this condition is only true if we handle
>> the format (e.g. raw, qcow2, vmdk etc.). If e.g. qcow2 then sends a
>> zero write this should always end on the disk and should not be optimizable.
> But why? This means setting an arbitrary policy for no good reason. You
> already have an option, and it already defaults to off, so unless
> someone specifically enables it for bs->file, we don't do the
> optimisation. But if someone wants to have it on bs->file, what reason
> is there to ignore that request?

I was erroneously thinking that the setting would be inherited to bs->file.
I will remove the bs->file from the condition.
>
>>>> +        flags |= BDRV_REQ_ZERO_WRITE;
>>>> +        /* if the device was not opened with discard=on the below flag
>>>> +         * is immediately cleared again in bdrv_co_do_write_zeroes */
>>> Is it? I only see it being cleared in bdrv_co_write_zeroes(), but that's
>>> not a function that seems to be called from here.
>> You are right. Question, do we want to support detect_zeroes = unmap
>> if discard = ignore? If yes, I have to update the docs. Otherwise
>> I have to check for BDRV_O_DISCARD before setting BDRV_REQ_MAY_UNMAP.
> I think it would be reasonable enough to just error out when you try to
> open an image with detect_zeroes=unmap,discard=ignore.
>
> Can these flags be changed during runtime? If so, we need to check there
> as well.

No, but you can specify them during hot add. Same as with discard.

Peter

>
>>>> +        if (bs->detect_zeroes == BLOCKDEV_DETECT_ZEROES_OPTIONS_UNMAP) {
>>>> +            flags |= BDRV_REQ_MAY_UNMAP;
>>>> +        }
>>>> +    }
>>>> +
>>>>      if (ret < 0) {
>>>>          /* Do nothing, write notifier decided to fail this request */
>>>>      } else if (flags & BDRV_REQ_ZERO_WRITE) {
> Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]