
From: Max Reitz
Subject: Re: [Qemu-devel] [question] is it possible that big-endian l1 tableoffset referenced by other I/O while updating l1 table offset in qcow2_update_snapshot_refcount?
Date: Mon, 13 Oct 2014 08:40:49 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.0

On 13.10.2014 at 05:17, Zhang Haoyu wrote:
Hi,
I encounter a problem: after deleting a snapshot, the qcow2 image size displayed
by the ls command is much larger than it should be,
but the virtual disk size shown by qemu-img info is okay.
I suspect that while the l1 table offset is being updated, some other I/O job
references the big-endian l1 table offset (a very large value),
so the file is truncated to a very large size.
Not quite.  Rather, all the data that the snapshot used to occupy now
shows up as holes in the file; the maximum offset of the file is
still unchanged, even though the file no longer uses as many referenced
clusters.  Recent changes have gone in to sparsify the file when
possible (punching holes if your kernel and file system are new enough to
support that), so the file does not actually consume the amount of disk space
that a mere ls reports.  But if what you are asking for is a way to compact
the file back down, then you'll need to submit a patch.  The idea of
having an online defragmenter for qcow2 files has been kicked around
before, but it is complex enough that no one has attempted a patch yet.
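
(Illustration added for clarity, not part of the original mail: a minimal
sketch of what "punching holes" means on Linux, using fallocate(2). The file
name and offsets are hypothetical; this is not QEMU code.)

    /* Sketch: free the blocks backing part of a file while keeping its
     * apparent size unchanged.  Requires a kernel and file system that
     * support FALLOC_FL_PUNCH_HOLE (e.g. ext4, XFS). */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("test.qcow2", O_RDWR);    /* hypothetical file */
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* Punch a 1 MiB hole at offset 1 MiB: "ls -l" still shows the old
         * apparent size, but "du" reports less allocated space. */
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      1024 * 1024, 1024 * 1024) < 0) {
            perror("fallocate");
        }
        close(fd);
        return 0;
    }

This is exactly the ls/du discrepancy described above: the apparent size stays
large while the actually allocated space shrinks.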
Sorry, I didn't describe the problem clearly.
In qcow2_update_snapshot_refcount(), see the code below:
      /* Update L1 only if it isn't deleted anyway (addend = -1) */
      if (ret == 0 && addend >= 0 && l1_modified) {
          for (i = 0; i < l1_size; i++) {
              cpu_to_be64s(&l1_table[i]);
          }

          ret = bdrv_pwrite_sync(bs->file, l1_table_offset, l1_table, l1_size2);

          for (i = 0; i < l1_size; i++) {
              be64_to_cpus(&l1_table[i]);
          }
      }
Between cpu_to_be64s(&l1_table[i]); and be64_to_cpus(&l1_table[i]);,
is it possible that other I/O references this interim l1 table, whose
entries contain the big-endian l2 table offsets?
A big-endian l2 table offset may be a very large value (hundreds of TB is
possible), so the qcow2 file would be truncated to far larger than its normal size.
We would then see the huge size of the qcow2 file with ls -hl, while the size
displayed by qemu-img info is still normal.
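
(Illustration added for clarity, not part of the original mail: the magnitude
a byte-swapped offset can reach on a little-endian host, using bswap_64() in
place of cpu_to_be64s().)

    /* Illustration only: an l2 table offset of a few MiB, byte-swapped to
     * big endian and then misread in CPU order, becomes tens of TiB. */
    #include <byteswap.h>
    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t l2_offset = 0x200000;            /* 2 MiB, a plausible offset */
        uint64_t swapped   = bswap_64(l2_offset); /* 0x0000200000000000 = 32 TiB */

        printf("cpu order:                 0x%" PRIx64 "\n", l2_offset);
        printf("big endian read as cpu:    0x%" PRIx64 " (~%" PRIu64 " TiB)\n",
               swapped, swapped >> 40);
        return 0;
    }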

If the possibility mentioned above exists, the rough code below may fix it:
      if (ret == 0 && addend >= 0 && l1_modified) {
          uint64_t *tmp_l1_table;

          tmp_l1_table = g_malloc0(l1_size * sizeof(uint64_t));
          memcpy(tmp_l1_table, l1_table, l1_size * sizeof(uint64_t));
          for (i = 0; i < l1_size; i++) {
              cpu_to_be64s(&tmp_l1_table[i]);
          }

          ret = bdrv_pwrite_sync(bs->file, l1_table_offset, tmp_l1_table,
                                 l1_size2);

          g_free(tmp_l1_table);
      }
l1_table is already a local variable (local to
qcow2_update_snapshot_refcount()), so I can't really imagine how
introducing another local buffer should mitigate the problem, if there
is any.

l1_table is not necessarily local to qcow2_update_snapshot_refcount();
that depends on the condition "if (l1_table_offset != s->l1_table_offset)".
If the condition is not true, l1_table = s->l1_table.
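
(Context added for clarity, not quoted in the original mail: a simplified
paraphrase of the setup earlier in qcow2_update_snapshot_refcount(); the exact
upstream code differs in details.)

    /* Paraphrased sketch: when the L1 table being processed is the active
     * one, l1_table simply aliases the cached s->l1_table, so the in-place
     * cpu_to_be64s() above temporarily turns the cache big-endian. */
    if (l1_table_offset != s->l1_table_offset) {
        /* inactive L1 table: read a private copy from the image file */
        l1_table = g_malloc0(align_offset(l1_size2, 512));
        l1_allocated = 1;
        ret = bdrv_pread(bs->file, l1_table_offset, l1_table, l1_size2);
        for (i = 0; i < l1_size; i++) {
            be64_to_cpus(&l1_table[i]);
        }
    } else {
        /* active L1 table: alias the in-memory cache */
        l1_table = s->l1_table;
        l1_allocated = 0;
    }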

Oh, yes, you're right. Okay, so in theory nothing should happen anyway, because qcow2 does not have to be reentrant (so s->l1_table will not be accessed while it's big endian and therefore possibly not in CPU order). But I find it rather ugly to convert the cached L1 table to big endian, so I'd be fine with the patch you proposed.

Max


