qemu-block

Re: [Qemu-block] [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster


From: Roman Kagan
Subject: Re: [Qemu-block] [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation
Date: Fri, 14 Apr 2017 10:40:41 +0300
User-agent: Mutt/1.8.0 (2017-02-23)

On Thu, Apr 13, 2017 at 09:06:19PM -0400, John Snow wrote:
> So if we have a 1MB cluster with 64k subclusters as a hypothetical, if
> we write just the first subcluster, we'll have a map like:
> 
> X---------------
> 
> Whatever actually happens to exist in this space, whether it be a hole
> we punched via fallocate or literal zeroes, this space is known to the
> filesystem to be contiguous.
> 
> If we write to the last subcluster, we'll get:
> 
> X--------------X
> 
> And again, maybe the dashes are a fallocate hole, maybe they're zeroes,
> but the last subcluster is located virtually exactly 15 subclusters
> behind the first; they're not physically contiguous. We've saved the
> space between them. Future out-of-order writes won't contribute to any
> fragmentation, at least at this level.
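
To make the maps above concrete, here is a minimal sketch that renders the same picture for the hypothetical 1MB cluster with 64k subclusters. It is not qcow2 code: the function name `subcluster_map` and the single-cluster model are made up for illustration only.

```python
CLUSTER_SIZE = 1024 * 1024      # 1 MB cluster, as in the hypothetical above
SUBCLUSTER_SIZE = 64 * 1024     # 64k subclusters, i.e. 16 per cluster


def subcluster_map(writes, cluster_size=CLUSTER_SIZE, sub_size=SUBCLUSTER_SIZE):
    """Return 'X' for every subcluster touched by a write and '-' otherwise.

    `writes` is an iterable of (offset, length) pairs, with offsets taken
    relative to the start of one cluster (illustrative model only).
    """
    n = cluster_size // sub_size
    allocated = [False] * n
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // sub_size
        last = (offset + length - 1) // sub_size
        # Clamp to the cluster so an oversized write cannot index past the end.
        for i in range(first, min(last, n - 1) + 1):
            allocated[i] = True
    return "".join("X" if a else "-" for a in allocated)


# Write only the first subcluster, then also the last one.
print(subcluster_map([(0, SUBCLUSTER_SIZE)]))
print(subcluster_map([(0, SUBCLUSTER_SIZE),
                      (15 * SUBCLUSTER_SIZE, SUBCLUSTER_SIZE)]))
```

The map only describes the virtual layout inside the cluster; it says nothing about where the filesystem puts the blocks on the media.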

Yeah, I think this is where the confusion lies.  You apparently assume
that the filesystem is smart enough to compensate for the subclusters
being sparse within a cluster, and will eventually make them contiguous
on the *media* once they are all written.  Denis is claiming the
opposite.  I posted a simple experiment with a 64kB sparse file written
out of order, which ended up as 16 disparate blocks on the platters
(ext4; with xfs this may be different), and this is obviously
detrimental to performance on rotating disks.
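
For anyone who wants to try something similar, here is a rough sketch of that kind of test. It is not the exact script from the earlier mail: the 4 KiB block size, the reverse write order, the fsync after every block and the file name are all assumptions made for illustration.

```python
import os
import tempfile

BLOCK = 4096      # assumed filesystem block size
NBLOCKS = 16      # 16 * 4 KiB = 64 KiB


def write_out_of_order(path, order=None):
    """Write NBLOCKS blocks to `path` in the given block order.

    Returns the resulting file size in bytes.
    """
    if order is None:
        order = list(range(NBLOCKS - 1, -1, -1))   # last block first
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for i in order:
            # Each block is filled with its own index so it can be identified.
            os.pwrite(fd, bytes([i]) * BLOCK, i * BLOCK)
            # Assumption: flushing after each block keeps the filesystem from
            # batching the allocations together.
            os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "sparse-test.img")
    print(write_out_of_order(path))
```

The resulting physical layout can then be inspected with `filefrag -v` on the file.  What it shows will depend on the filesystem and its mount options.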

Note also that if the filesystem actually is smart enough to keep the
subclusters contiguous even when they are written out of order, apparently
by not allowing blocks from other files to take the still-unused space
between sparse subclusters, the disk space saving becomes less obvious.

> You might be able to reduce COW from 5 IOPs to 3 IOPs, but if we tune
> the subclusters right, we'll have *zero*, won't we?

Right, this is an attractive advantage.  We need to test whether later
access to such interleaved clusters is degraded, though.
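
As a back-of-the-envelope illustration of why COW can drop to zero, here is a rough model.  It assumes that every allocation unit touched by a guest write has to be filled completely, with the rest copied from the backing image.  The helper `cow_bytes` is hypothetical and counts bytes, not I/O operations.

```python
KiB = 1024
MiB = 1024 * KiB


def cow_bytes(offset, length, unit):
    """Bytes to copy from the backing image in addition to the guest data,
    assuming each touched allocation unit must be filled completely."""
    if length <= 0:
        return 0
    start = (offset // unit) * unit               # round down to a unit boundary
    end = -(-(offset + length) // unit) * unit    # round up to a unit boundary
    return (end - start) - length


# 64k write at the start of a 1MB cluster, allocation unit = whole cluster
print(cow_bytes(0, 64 * KiB, MiB))            # 983040
# the same write with 64k subclusters as the allocation unit
print(cow_bytes(0, 64 * KiB, 64 * KiB))       # 0
# a 4k write that does not cover a whole 64k subcluster
print(cow_bytes(4 * KiB, 4 * KiB, 64 * KiB))  # 61440
```

In this model the extra copying disappears only when the writes are aligned to the subcluster size, which is the "tune the subclusters right" case.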

Roman.


