qemu-devel

Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster


From: Alberto Garcia
Subject: Re: [PATCH 0/1] qcow2: Skip copy-on-write when allocating a zero cluster
Date: Fri, 21 Aug 2020 13:42:52 +0200
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)

On Fri 21 Aug 2020 01:05:06 PM CEST, Brian Foster <bfoster@redhat.com> wrote:
>> > 1) off: for every write request QEMU initializes the cluster (64KB)
>> >         with fallocate(ZERO_RANGE) and then writes the 4KB of data.
>> > 
>> > 2) off w/o ZERO_RANGE: QEMU writes the 4KB of data and fills the rest
>> >         of the cluster with zeroes.
>> > 
>> > 3) metadata: all clusters were allocated when the image was created
>> >         but they are sparse, QEMU only writes the 4KB of data.
>> > 
>> > 4) falloc: all clusters were allocated with fallocate() when the image
>> >         was created, QEMU only writes 4KB of data.
>> > 
>> > 5) full: all clusters were allocated by writing zeroes to all of them
>> >         when the image was created, QEMU only writes 4KB of data.
>> > 
>> > As I said in a previous message I'm not familiar with xfs, but the
>> > parts that I don't understand are
>> > 
>> >    - Why is (4) slower than (1)?
>> 
>> Because fallocate() is a full IO serialisation barrier at the
>> filesystem level. If you do:
>> 
>> fallocate(whole file)
>> <IO>
>> <IO>
>> <IO>
>> .....
>> 
>> The IO can run concurrently and does not serialise against anything in
>> the filesystem except unwritten extent conversions at IO completion
>> (see answer to next question!)
>> 
>> However, if you just use (4) you get:
>> 
>> falloc(64k)
>>   <wait for inflight IO to complete>
>>   <allocates 64k as unwritten>
>> <4k io>
>>   ....
>> falloc(64k)
>>   <wait for inflight IO to complete>
>>   ....
>>   <4k IO completes, converts 4k to written>
>>   <allocates 64k as unwritten>
>> <4k io>
>> falloc(64k)
>>   <wait for inflight IO to complete>
>>   ....
>>   <4k IO completes, converts 4k to written>
>>   <allocates 64k as unwritten>
>> <4k io>
>>   ....
>> 
>
> Option 4 is described above as initial file preallocation whereas
> option 1 is per 64k cluster prealloc. Prealloc mode mixup aside, Berto
> is reporting that the initial file preallocation mode is slower than
> the per cluster prealloc mode. Berto, am I following that right?

Option (1) means that no qcow2 cluster is allocated at the beginning of
the test so, apart from updating the relevant qcow2 metadata, each write
request clears the cluster first (with fallocate(ZERO_RANGE)) and then
writes the requested 4KB of data. Further writes to the same cluster
don't need changes to the qcow2 metadata, so they go directly to the
area that was cleared with fallocate().
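
To make the sequence in (1) concrete, here is a rough sketch of the
syscall pattern for the first write to a cluster. It is not QEMU's
code: write_into_new_cluster() is a hypothetical helper, the 64KB
cluster size is taken from the description of (1) above, and the
fallback branch (used when the filesystem rejects ZERO_RANGE) is only
my approximation of option (2).

```c
#define _GNU_SOURCE            /* exposes fallocate() in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>      /* FALLOC_FL_ZERO_RANGE */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define CLUSTER_SIZE ((size_t)64 * 1024)   /* 64KB cluster, as in option (1) */

/*
 * Hypothetical helper (not a QEMU function): first write to a cluster.
 * Returns 1 if the cluster was cleared with fallocate(ZERO_RANGE),
 * 0 if it fell back to writing the zeroes explicitly, -1 on error.
 */
int write_into_new_cluster(int fd, off_t cluster_start,
                           size_t offset_in_cluster,
                           const void *data, size_t len)
{
    /* The data must fit inside one cluster. */
    assert(offset_in_cluster <= CLUSTER_SIZE &&
           len <= CLUSTER_SIZE - offset_in_cluster);

    /* Option (1): zero the whole cluster, then write only the data. */
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, cluster_start,
                  (off_t)CLUSTER_SIZE) == 0) {
        ssize_t n = pwrite(fd, data, len,
                           cluster_start + (off_t)offset_in_cluster);
        return n == (ssize_t)len ? 1 : -1;
    }
    if (errno != EOPNOTSUPP && errno != ENOSYS) {
        return -1;
    }

    /* Roughly option (2): write the data padded with zeroes so that
     * the whole cluster is covered by one regular write. */
    unsigned char *buf = calloc(1, CLUSTER_SIZE);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf + offset_in_cluster, data, len);
    ssize_t n = pwrite(fd, buf, CLUSTER_SIZE, cluster_start);
    free(buf);
    return n == (ssize_t)CLUSTER_SIZE ? 0 : -1;
}
```

Later writes to the same cluster would skip this helper and be a plain
pwrite() into the already cleared area.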

Option (4) means that all clusters are allocated when the image is
created and they are initialized with fallocate() (actually with
posix_fallocate() now that I read the code, I suppose it's the same for
xfs?). Only after that does the test start. All write requests are
simply forwarded to the disk; there is no need to touch any qcow2
metadata or do anything else.
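
For comparison, the pattern in (4) looks roughly like the sketch below:
one posix_fallocate() over the whole file at creation time, then plain
writes during the test. Both function names are hypothetical and the
qcow2 metadata is left out completely, so take it as an illustration
of the I/O pattern only.

```c
#define _POSIX_C_SOURCE 200809L   /* posix_fallocate(), pwrite() */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: done once, when the image is created.
 * posix_fallocate() returns 0 on success or an error number.
 */
int preallocate_image(int fd, off_t image_size)
{
    assert(image_size > 0);
    return posix_fallocate(fd, 0, image_size);
}

/*
 * Hypothetical helper: done for every request during the test.
 * No fallocate() call here, the write goes straight to the file.
 * Returns 0 on success, -1 on error or short write.
 */
int write_guest_data(int fd, off_t offset, const void *data, size_t len)
{
    return pwrite(fd, data, len, offset) == (ssize_t)len ? 0 : -1;
}
```

The difference from (1) is that no fallocate() call is ever issued
between the write requests.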

And yes, (4) is a bit slower than (1) in my tests. On ext4 I get 10%
more IOPS.

I just ran the tests with aio=native and with a raw image instead of
qcow2; here are the results (in IOPS):

qcow2:
|----------------------+-------------+------------|
| preallocation        | aio=threads | aio=native |
|----------------------+-------------+------------|
| off                  |        8139 |       7649 |
| off (w/o ZERO_RANGE) |        2965 |       2779 |
| metadata             |        7768 |       8265 |
| falloc               |        7742 |       7956 |
| full                 |       41389 |      56668 |
|----------------------+-------------+------------|

raw:
|---------------+-------------+------------|
| preallocation | aio=threads | aio=native |
|---------------+-------------+------------|
| off           |        7647 |       7928 |
| falloc        |        7662 |       7856 |
| full          |       45224 |      58627 |
|---------------+-------------+------------|

A qcow2 file with preallocation=metadata is more or less similar to a
sparse raw file (and the numbers are indeed similar).

preallocation=off on qcow2 does not have an equivalent on raw files.

Berto


