qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCH] qcow2/virtio corruption: Don't allocate the sam


From: Avi Kivity
Subject: [Qemu-devel] Re: [PATCH] qcow2/virtio corruption: Don't allocate the same cluster twice
Date: Wed, 06 May 2009 19:54:30 +0300
User-agent: Thunderbird 2.0.0.21 (X11/20090320)

Kevin Wolf wrote:
A write on a qcow2 image that results in the allocation of a new cluster can be
divided into two parts: There is a part which happens before the actual data is
written, this is about allocating clusters in the image file. The second part
happens in the AIO callback handler and is about making L2 entries for the
newly allocated clusters.

When two AIO requests happen to touch the same free cluster, there is a chance
that the second request still sees the cluster as free because the callback of
the first request has not yet run. This means that it reserves another cluster
for the same virtual hard disk offset and hooks it up in its own callback,
overwriting what the first callback has done. Long story cut short: Bad Things
happen.

This patch maintains a list of in-flight requests that have allocated new
clusters. A second request touching the same cluster doesn't find an entry yet
in the L2 table, but can look it up in the list now. The second request is
limited so that it either doesn't touch the allocation of the first request
(so it can have a non-overlapping allocation) or that it doesn't exceed the
end of the allocation of the first request (so it can reuse this allocation
and doesn't need to do anything itself).



+        } else {
+            /* Reuse the previously allocated clusters */
+            if (end_offset > old_end_offset) {
+                nb_clusters = (old_end_offset - offset) >> s->cluster_bits;
+            }
+            cluster_offset = old_alloc->cluster_offset + (offset - old_offset);
+            m->nb_clusters = 0;
+            goto out;
+        }
+    }
+
+    LIST_INSERT_HEAD(&s->cluster_allocs, m, next);


What happens if the second request completes before the first? Then, when the first request completes, alloc_cluster_link_l2() will call copy_clusters() and overwrite the second request.

Also, the second request now depends on the first to update its metadata. But if the first request fail, it will not update its metadata, and the second request will complete without error and also without updating its metadata.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]