[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache |
Date: |
Mon, 24 Jan 2011 15:26:48 +0000 |
On Mon, Jan 24, 2011 at 2:54 PM, Kevin Wolf <address@hidden> wrote:
> [ Re-adding qemu-devel to CC ]
>
> Am 24.01.2011 15:34, schrieb Stefan Hajnoczi:
>> On Thu, Jan 20, 2011 at 5:10 PM, Kevin Wolf <address@hidden> wrote:
>>> @@ -702,17 +622,30 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs,
>>> QCowL2Meta *m)
>>>
>>> if (m->nb_available & (s->cluster_sectors - 1)) {
>>> uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors -
>>> 1);
>>> + cow = true;
>>> ret = copy_sectors(bs, start_sect + end, cluster_offset + (end <<
>>> 9),
>>> m->nb_available - end, s->cluster_sectors);
>>> if (ret < 0)
>>> goto err;
>>> }
>>>
>>> - /* update L2 table */
>>> + /*
>>> + * Update L2 table.
>>> + *
>>> + * Before we update the L2 table to actually point to the new cluster,
>>> we
>>> + * need to be sure that the refcounts have been increased and COW was
>>> + * handled.
>>> + */
>>> + if (cow) {
>>> + bdrv_flush(bs->file);
>>
>> Just bdrv_flush(bs->file) and not a refcounts cache flush?
>
> Refcounts and data need not to be ordered against each other. They only
> must both be on disk when we write the L2 table.
Have I missed where refcounts actually get flushed from the cache out
to the disk because bdrv_flush(bs->file) only syncs the file but
doesn't write out dirty data held in cache.
>>> + }
>>> +
>>> + qcow2_cache_set_dependency(bs, s->l2_table_cache,
>>> s->refcount_block_cache);
>>> ret = get_cluster_table(bs, m->offset, &l2_table, &l2_offset,
>>> &l2_index);
>>> if (ret < 0) {
>>> goto err;
>>> }
>>> + qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
>>>
>>> for (i = 0; i < m->nb_clusters; i++) {
>>> /* if two concurrent writes happen to the same unallocated cluster
>>> @@ -728,16 +661,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs,
>>> QCowL2Meta *m)
>>> (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
>>> }
>>>
>>> - /*
>>> - * Before we update the L2 table to actually point to the new cluster,
>>> we
>>> - * need to be sure that the refcounts have been increased and COW was
>>> - * handled.
>>> - */
>>> - bdrv_flush(bs->file);
>>>
>>> - ret = write_l2_entries(bs, l2_table, l2_offset, l2_index,
>>> m->nb_clusters);
>>> + ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>>> if (ret < 0) {
>>> - qcow2_l2_cache_reset(bs);
>>> goto err;
>>> }
>>>
>>
>> The function continues like this:
>>
>> /*
>> * If this was a COW, we need to decrease the refcount of the old cluster.
>> * Also flush bs->file to get the right order for L2 and refcount update.
>> */
>> if (j != 0) {
>> bdrv_flush(bs->file);
>> for (i = 0; i < j; i++) {
>> qcow2_free_any_clusters(bs,
>> be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
>> }
>> }
>>
>> Does bdrv_flush(bs->file) "get the right order for L2 and refcount
>> update"? We've just put an L2 table, should this be an L2 table
>> flush?
>
> I agree, this looks wrong. What we need is a dependency to ensure that
> first L2 is flushed and then the refcount block.
> qcow2_free_any_clusters() calls update_refcount() indirectly, which
> takes care of setting this dependency.
>
> So in the end I think it's just an unnecessary bdrv_flush. Makes sense?
I don't understand this fully. I've noticed that the cache isn't the
only mechanism for making changes to tables - there are functions like
write_l2_entries() that directly write out parts of tables without
honoring dependencies or using the dirty bit on the cache entry. I
probably need to look at this more carefully to fully understand
whether or not it is correct.
Stefan