From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v2 21/21] iotests: Add test for different refcount widths
Date: Thu, 20 Nov 2014 14:48:03 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 2014-11-18 at 21:26, Eric Blake wrote:
On 11/17/2014 05:06 AM, Max Reitz wrote:

Umm, that sounds backwards from what you document.  It's a good test of
the _new_ reftable needing a second round of allocations.  So keep it
with corrected comments.  But I think you _intended_ to write a test
that starts with a refcount_width=64 image and resizes to
refcount_width=1, where the _old_ reftable then suffers a reallocation
as part of allocating refblocks for the new table.  It may even help if
you add a tracepoint for every iteration through the walk function
callback, to prove we are indeed executing it 3 times instead of the
usual 2, for these test cases.
I'm currently thinking about a way to test the old reftable reallocation
issue, and I can't find one. So, for the old reftable to require a
reallocation, it must grow. For it to grow, we need some allocation beyond
what it can currently represent. For this to happen during the refblock
allocation walk, this allocation must be the allocation of a new refblock.

If the refblock is allocated beyond the current reftable's limit, this
means that all clusters between free_cluster_index and that point are
already taken. If the reftable is then reallocated, it will
therefore *always* be allocated behind that refblock, which is beyond
its old limit. Therefore, that walk through the old reftable will never
miss that new allocation.

So the issue can only occur if the old reftable is resized after the
walk through it, that is, when allocating the new reftable. That is
indeed an issue, but I think it manifests itself basically like the issue
I'm testing here: there is now an area in the old refcount structures
which was free before but is used now, and the allocation causing
that was the allocation of the new reftable. The only difference is
whether it's the old or the new reftable that resides in the
previously free area. Thus, I think I'll leave it at this test – but if
you can describe to me how to create an image for a different "rewalk"
path, I'm all ears.
=====
The test you wrote does:

original image, pre-walk:
reftable is one cluster, with one refblock entry and 63 zero entries;
that refblock holds 4096 width-1 refcounts; of those, the first 63 are
non-zero, the remaining are zero. Image is 32256 bytes long.
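
(For reference, roughly how such a starting image might be put together and
converted. This is only a sketch: the refcount_width option name is the one
from this series, and the exact amount of guest data needed to end up with
63 used 512-byte clusters is a guess, not taken from the actual test.)

  $ qemu-img create -f qcow2 -o cluster_size=512,refcount_width=1 test.qcow2 64k
  $ qemu-io -c 'write 0 24k' test.qcow2
  $ qemu-img amend -o refcount_width=64 test.qcow2
  $ qemu-img check test.qcow2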

During the first walk, we call operation() 64 times - the first time
with refblock_empty false, the remaining 63 times with refblock_empty true.

After the first walk but before the reftable allocation, we have allocated one
refblock that holds 64 width-64 refcounts (all zero, because we don't
populate them until the final walk); and the old table now has 64
refcounts populated. Image is 32768 bytes long.

Then we allocate a new reftable; so far, we only created one refblock
for it to hold, so one cluster is sufficient. The allocation causes the
old table to now have 65 refcounts populated. Image is now 33280 bytes long.

On the second pass, we call operation() 64 times; now the first two
calls have refblock_empty as false, which means we allocate a new
refblock.  This allocation causes the old table to now have 66 refcounts
populated. Image is now 33792 bytes long.

So we free our first attempt at a new reftable, and allocate another (a
single cluster is still sufficient to hold two refblocks); I'm not sure
whether this free/realloc will reuse cluster 65 or if it will pick up
cluster 67 and leave a hole in 65.  [I guess it depends on whether
cluster allocation is done by first-fit analysis or whether it blindly
favors allocating at the end of the image].

There is a free_cluster_index to speed up finding the first fit. It's reset when freeing clusters before that index; therefore, cluster 65 should be reused.

Either way, we have to do a
third iteration, because the second iteration allocated a refblock and
"reallocated" a reftable.

On the third pass, operation() is still called 64 times, but because the
only two calls with refblock_empty as false already have an allocated
refblock, no further allocations are needed, and we are done with the do
loop; the fourth walk can set refcounts.
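
(Just to double-check the byte counts in this walkthrough, with 512-byte
clusters:)

  $ for clusters in 63 64 65 66; do echo "$clusters * 512 = $((clusters * 512))"; done
  63 * 512 = 32256
  64 * 512 = 32768
  65 * 512 = 33280
  66 * 512 = 33792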

=====
The test I thought you were writing would start with:

original image, pre-walk:
reftable is one cluster, with one refblock entry and 63 zero entries;
that refblock holds 64 width-64 refcounts; of those, the first 63 are
non-zero, the remaining are zero. Image is 32256 bytes long.

During the first walk, we call operation() 1 time, with refblock_empty
false.

After the first walk but before the reftable allocation, we have allocated one
refblock that holds 4096 width-1 refcounts (all zero, because we don't
populate them until the final walk); and the old table now has 64
refcounts populated. Image is 32768 bytes long.

Then we allocate a new reftable; so far, we only created one refblock
for it to hold, so one cluster is sufficient. The allocation causes the
old table to now have 66 refcounts populated (one for the new refblock,
but also one for an additional refblock in the old table because the
first refblock was full). Image is now 33792 bytes long.

On the second pass, we call operation() 1 time with refblock_empty as
false, so we don't need any allocation.

Which means the test you wrote is correct, while my idea does NOT
trigger the third walk, at least not for the initial file size of 32256.
You've been vindicated, you did it correctly :)


=====
Now, in response to your question about some other 3-pass inducing
pattern, let's think back to v1, where you questioned what would happen
if a hole in the reftable gets turned into data due to a later
allocation.  Let's see if I can come up with a scenario for that...

Let's stick with a cluster size of 512, and use 32-bit and 64-bit widths
as our two sizes.  If we downsize from 64 to 32 bits, then every two
refblock clusters in the old table results in one call to operation()
for the new table; conversely, if we upsize, then every refblock cluster
in the old table gives two calls to operation() in the new table.  The
trick at hand is to come up with some image where we punch a hole so
that on the first pass, we call operation() with refblock_empty true for
one iteration (necessarily a call later than the first, since the image
header guarantees the first refblock is not empty), but where we have
data after the hole, and it is that later data which triggers the
allocation that will finally start to fill the hole.
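
(For concreteness, the per-refblock capacities behind these ratios; each
refblock is one cluster, so it holds cluster_size * 8 / refcount_width
entries:)

  $ cluster_size=512
  $ for width in 1 32 64; do echo "width $width: $((cluster_size * 8 / width)) refcounts per refblock"; done
  width 1: 4096 refcounts per refblock
  width 32: 128 refcounts per refblock
  width 64: 64 refcounts per refblock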

How about starting with an image that occupies between 1.5 and 2
refblocks worth of 32-width clusters (so an image anywhere between 193
and 256 clusters, or between 98816 and 131072 bytes).  You should be
able to figure out how many clusters this consumes for L1, L2, plus 1
for header, reftable, and 2 for refblocks, in order to figure out how
many remaining clusters are dedicated to data; ideally, the data
clusters are contiguous, and occupy a swath that covers at least
clusters 126 through 192.  Widening to 64-bit width will require 4
refblocks instead of 2, if all refblocks are needed.  But the whole idea
of punching a hole is that we don't need a refblock if it will be
all-zero entries.  So take this original image, and discard the data
clusters from physical index 126 through 192 (this is NOT the data
visible at guest offset 31744, but whatever actual offset of guest data
that maps to physical offset 31744).  The old reftable now looks like {
refblock_o1 [0-125 occupied, 126 and 127 empty], refblock_o2 [128-191
empty, 192-whatever occupied, tail empty] }.  With no allocations
required, this would in turn map to the following new refblocks: {
refblock_n1 [0-64 occupied], refblock_n2 [65-125 occupied, 126-127
empty], NULL, refblock_n4 [192-whatever occupied] }.  Note that we do
not need to allocate refblock_n3 because of the hole in the old
refblock; we DO end up allocating three refblocks, but in the sequence
of things, refblock_n1 and refblock_n2 are allocated while we are
visiting refblock_o1 and still fit in refblock_o1, while refblock_n4 is
not allocated until after we have already passed over the first half of
refblock_o2.
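
(A very rough sketch of that setup, purely illustrative: the refcount_width
option name is the one from this series, the sizes and the discard range are
guesses rather than computed from the real physical layout, it assumes
qemu-io's discard actually deallocates the affected clusters, and, as noted
in the reply below, refblock self-references make the exact layout harder to
arrange than this suggests.)

  $ qemu-img create -f qcow2 -o cluster_size=512,refcount_width=32 test.qcow2 128k
  $ qemu-io -c 'write 0 100k' test.qcow2
  $ qemu-io -c 'discard 60k 30k' test.qcow2
  $ qemu-img amend -o refcount_width=64 test.qcow2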

Thus, the second walk over the image will see that we need to allocate
refblock_n3 because it now contains entries (in particular, the entry
for refblock_n4, but also the 1-cluster entry for the proposed reftable
that is allocated between the walks).  So, while your test used the
allocation of the reftable as the spillover point, my scenario here uses
the allocation of later refblocks as the spillover point that got missed
during the first iteration.

Sounds good; the only problem is that I'd have to hand-craft the image myself, because qemu generally uses self-references for refblocks (when allocating new refblocks, they will contain their own refcount).

I think this would already be too much effort (I'll reply to your second email right away ;-)). There is no fundamental difference in how new allocations for the new reftable and for the new refblocks are treated: if there's a new allocation, respin. If that works for the new reftable, that's enough to convince me it will work for the new refblocks as well.

Max


