


From: Anthony Liguori
Subject: [Qemu-devel] Re: [RFC][STABLE 0.13] Revert "qcow2: Use bdrv_(p)write_sync for metadata writes"
Date: Wed, 25 Aug 2010 07:46:57 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.11) Gecko/20100713 Lightning/1.0b1 Thunderbird/3.0.6

On 08/25/2010 02:14 AM, Avi Kivity wrote:
If (c) happens before (b), then we've created an extent that's attached to a table with a zero reference count. This is a corrupt image.



If the only issue is new block allocation, it can be easily solved.

Technically, I believe there are similar issues around creating snapshots but I don't think we care.

Instead of allocating exactly the needed amount of blocks, allocate a large extent and hold them in memory.

So you're suggesting that we allocate a bunch of blocks and update the ref count table so that they are seen as allocated, even though they aren't attached to an L1 table?

The next allocation can then be filled from memory, so the allocation sync is amortized over many blocks. A power fail will leak the preallocated blocks, losing some megabytes of address space, but not real disk space.

It's a clever idea, but it would lose real disk space, which is probably not a huge issue.
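To make the amortization concrete, here is a rough Python sketch of the preallocation idea. It is only a toy model, not qcow2 code: the class name `PreallocPool`, the `batch_size` parameter, and the `sync_count` counter are all made up for illustration, and a "sync" is just a counter increment standing in for the ref count table update plus flush.

```python
class PreallocPool:
    """Toy model: hand out cluster indices from a batch reserved with one sync."""

    def __init__(self, batch_size=16):
        self.batch_size = batch_size  # hypothetical tuning knob
        self.next_free = 0            # first cluster index never reserved on disk
        self.pool = []                # reserved on disk, not yet attached to any table
        self.sync_count = 0           # number of simulated metadata syncs

    def _refill(self):
        # Mark a whole batch as allocated in the ref count table, then sync once.
        self.pool = list(range(self.next_free, self.next_free + self.batch_size))
        self.next_free += self.batch_size
        self.sync_count += 1

    def allocate(self):
        # Most allocations are served from memory and cost no sync at all.
        if not self.pool:
            self._refill()
        return self.pool.pop(0)

    def leaked_on_crash(self):
        # Clusters reserved on disk but never handed out are leaked by a power failure.
        return len(self.pool)
```

With a batch of 16, twenty allocations cost two syncs instead of twenty, and a crash at that point would leak the 12 clusters still sitting in the pool.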

Let's consider what happens if we eliminate the reference count table, which means eliminating internal snapshots.

1) guest submits write request
2) allocate extent
3) write data to disk (a)
4) write (a) completes
5) write extent table (c)
6) write (c) completes
7) complete guest write request

If this all happens in order and we lose power, we just leak a block. It means we need a periodic fsck.
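A small Python sketch of the in-order case, assuming a toy image model: a dict with a cluster count, a data map, and an extent table. The names `new_image`, `guest_write`, `fsck_leaked`, and the `crash_after` parameter are hypothetical; this models the numbered steps above, not any real format.

```python
def new_image():
    # file_clusters: how many extents the file holds; table: guest cluster -> extent
    return {"file_clusters": 0, "data": {}, "table": {}}

def guest_write(image, guest_cluster, payload, crash_after=None):
    """Run the steps in order; stop after step `crash_after` to simulate power loss.

    Returns True only if the write would have been acknowledged to the guest.
    """
    # Step 2: allocate an extent by growing the file.
    extent = image["file_clusters"]
    image["file_clusters"] += 1
    if crash_after == 2:
        return False
    # Steps 3-4: write the data (a) and wait for completion.
    image["data"][extent] = payload
    if crash_after == 4:
        return False
    # Steps 5-6: write the extent table entry (c) and wait for completion.
    image["table"][guest_cluster] = extent
    if crash_after == 6:
        return False
    # Step 7: complete the guest request.
    return True

def fsck_leaked(image):
    # An extent is leaked if it exists in the file but no table entry points to it.
    referenced = set(image["table"].values())
    return [c for c in range(image["file_clusters"]) if c not in referenced]
```

In this model, a crash after step 4 leaves an extent that holds data but is unreachable from the table, which is exactly what the periodic fsck would reclaim.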

If (c) completes before (a), then it means that the image is not corrupted but data gets lost. This is okay based on the guest contract.
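The reordering case can be checked the same way. The sketch below enumerates which of the two writes, (a) and (c), reached disk before a crash and tests a simple structural invariant. It assumes the same kind of toy model; `replay`, `is_consistent`, `read_guest`, and the `b"<stale>"` placeholder are invented for illustration.

```python
def replay(steps_completed):
    """Image state after a crash where only the writes in `steps_completed` hit disk."""
    image = {"file_clusters": 1, "data": {}, "table": {}}  # extent 0 already allocated
    if "a" in steps_completed:
        image["data"][0] = b"new guest data"
    if "c" in steps_completed:
        image["table"][7] = 0  # guest cluster 7 -> extent 0
    return image

def is_consistent(image):
    # Every table entry must point inside the file, and no extent may be shared.
    extents = list(image["table"].values())
    in_range = all(0 <= e < image["file_clusters"] for e in extents)
    unique = len(extents) == len(set(extents))
    return in_range and unique

def read_guest(image, guest_cluster):
    extent = image["table"].get(guest_cluster)
    if extent is None:
        return None  # unallocated: the read would fall through to the backing file
    # If (a) never landed, the extent holds whatever was there before.
    return image["data"].get(extent, b"<stale>")
```

When only (c) lands, the guest reads stale contents for a write that was never acknowledged, but the table itself remains well formed.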

And that's it.  There is no scenario where the disk is corrupted.

_if_ that's the only failure mode.

If we had another disk format that only supported growth and metadata for a backing file, can you think of another failure scenario?

Regards,

Anthony Liguori




