From: Benoît Canet
Subject: [Qemu-devel] [RFC V4 01/30] qcow2: Add deduplication to the qcow2 specification.
Date: Wed, 2 Jan 2013 17:16:04 +0100

Signed-off-by: Benoit Canet <address@hidden>
---
docs/specs/qcow2.txt | 100 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 99 insertions(+), 1 deletion(-)
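
As a reading aid for the byte layouts specified in the diff below, here is a minimal Python sketch that decodes the 14-byte deduplication header extension payload and one 40-byte L2 deduplication table entry. It assumes the fields are packed without padding and stored big-endian like the other qcow2 numbers, and it reads entries 0 and 1 of the strategies bitmap as bit 0 and bit 1. The names parse_dedup_extension and parse_dedup_entry are hypothetical and do not come from the patch series.

```python
import struct

# Assumed packed, big-endian layouts taken from the byte ranges in the spec text.
DEDUP_EXT = struct.Struct(">QIBB")    # table offset, L1 size, hash algo, strategies
DEDUP_ENTRY = struct.Struct(">32sQ")  # 32-byte hash, 64-bit logical offset

# Enum values as listed in the "Hash algo enum field" description.
HASH_ALGOS = {0: "SHA-256", 1: "SHA3", 2: "SKEIN-256"}


def parse_dedup_extension(data):
    """Decode the deduplication header extension payload (hypothetical helper)."""
    offset, l1_size, algo, strategies = DEDUP_EXT.unpack_from(data)
    return {
        "table_offset": offset,
        "l1_size": l1_size,  # number of 64-bit L1 pointers
        "hash_algo": HASH_ALGOS.get(algo, "unknown"),
        # Assumption: bitmap entries 0 and 1 are bit positions.
        "ram_lookup": bool(strategies & 0x1),
        "disk_lookup": bool(strategies & 0x2),
    }


def parse_dedup_entry(data):
    """Decode one L2 deduplication table entry into (hash, logical offset)."""
    digest, logical_offset = DEDUP_ENTRY.unpack_from(data)
    return digest, logical_offset
```

Both struct sizes (14 and 40 bytes) follow directly from the byte ranges given in the spec text, which is a quick way to check that a layout change stays consistent with the entry size used in the table arithmetic.
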
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 36a559d..c9c0d47 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -80,7 +80,12 @@ in the description of a field.
tables to repair refcounts before accessing the
image.
- Bits 1-63: Reserved (set to 0)
+ Bit 1: Deduplication bit. If this bit is set then
+ deduplication is used on this image.
+ When it is set, the L2 table size (64KB) differs
+ from the cluster size (4KB).
+
+ Bits 2-63: Reserved (set to 0)
80 - 87: compatible_features
Bitmask of compatible features. An implementation can
@@ -116,6 +121,7 @@ be stored. Each extension has a structure like the
following:
0x00000000 - End of the header extension area
0xE2792ACA - Backing file format name
0x6803f857 - Feature name table
+ 0xCD8E819B - Deduplication
other - Unknown header extension, can be safely
ignored
@@ -159,6 +165,98 @@ the header extension data. Each entry look like this:
terminated if it has full length)
+== Deduplication ==
+
+The deduplication extension holds the deduplication settings and the location
+of the deduplication table.
+
+ Byte 0 - 7: Offset of the RAM deduplication table
+
+ 8 - 11: Size of the RAM deduplication table = number of L1 64-bit
+ pointers
+
+ 12: Hash algo enum field
+ 0: SHA-256
+ 1: SHA3
+ 2: SKEIN-256
+
+ 13: Dedup strategies bitmap
+ 0: RAM based hash lookup
+ 1: Disk based hash lookup
+
+The disk based lookup structure will be described in a future qcow2 specification.
+
+== Deduplication table (RAM method) ==
+
+The deduplication table maps a physical offset to a data hash and a
+logical offset. It permanently stores the information required to
+do the deduplication. It is loaded at startup into a RAM based representation
+used to do the lookups.
+
+The deduplication table contains 64-bit offsets to the level 2 deduplication
+table blocks.
+Each entry of these blocks contains a 32-byte SHA256 hash followed by the
+64-bit logical offset of the first encountered cluster having this hash.
+
+== Deduplication table schematic (RAM method) ==
+
+0 l1_dedup_index Size
+ |
+|--------------------------------------------------------------------|
+| | |
+| | L1 Deduplication table |
+| | |
+|--------------------------------------------------------------------|
+ |
+ |
+ |
+0 | l2_dedup_block_entries
+ |
+|---------------------------------|
+| |
+| L2 deduplication block |
+| |
+| l2_dedup_index |
+|---------------------------------|
+ |
+ 0 | 40
+ |
+ |-------------------------------|
+ | |
+ | Deduplication table entry |
+ | |
+ |-------------------------------|
+
+
+== Deduplication table entry description (RAM method) ==
+
+Each L2 deduplication table entry has the following structure:
+
+ Byte 0 - 31: hash of data cluster
+
+ 32 - 39: Logical offset of the first encountered cluster having
+ this hash
+
+== Deduplication table arithmetic (RAM method) ==
+
+Entries in the deduplication table are ordered by physical cluster index.
+
+The number of entries in an l2 deduplication table block is:
+l2_dedup_block_entries = dedup_block_size / (32 + 8)
+
+The index in the level 1 deduplication table is:
+l1_dedup_index = physical_cluster_index / l2_dedup_block_entries
+
+The index in the level 2 deduplication table is:
+l2_dedup_index = physical_cluster_index % l2_dedup_block_entries
+
+cluster_size = 4096
+dedup_block_size = 65536
+l2_size = 65536
+
+The 16 remaining bytes in each l2 deduplication block (65536 - 1638 * 40) are
+set to zero and reserved for future use.
+
== Host cluster management ==
qcow2 manages the allocation of host clusters by maintaining a reference count
--
1.7.10.4
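
The deduplication table arithmetic in the patch above can be checked with a short Python sketch. It uses the constants given in the spec text (4096-byte clusters, 65536-byte deduplication blocks, 40-byte entries) and assumes that physical_cluster_index is the physical byte offset divided by the cluster size, which the spec text does not state explicitly. It also takes the per-block entry count as the divisor in both index formulas. The function name dedup_indexes is hypothetical.

```python
CLUSTER_SIZE = 4096
DEDUP_BLOCK_SIZE = 65536
ENTRY_SIZE = 32 + 8  # 32-byte hash + 64-bit logical offset

# Integer division: a partial entry never spans two blocks.
L2_DEDUP_BLOCK_ENTRIES = DEDUP_BLOCK_SIZE // ENTRY_SIZE
# Bytes left over at the end of each block (zeroed and reserved).
RESERVED_TAIL_BYTES = DEDUP_BLOCK_SIZE % ENTRY_SIZE


def dedup_indexes(physical_offset):
    """Return (l1_dedup_index, l2_dedup_index) for a physical byte offset.

    Assumption: the physical cluster index is offset // cluster size.
    """
    physical_cluster_index = physical_offset // CLUSTER_SIZE
    l1_dedup_index = physical_cluster_index // L2_DEDUP_BLOCK_ENTRIES
    l2_dedup_index = physical_cluster_index % L2_DEDUP_BLOCK_ENTRIES
    return l1_dedup_index, l2_dedup_index


if __name__ == "__main__":
    print(L2_DEDUP_BLOCK_ENTRIES, RESERVED_TAIL_BYTES)
    print(dedup_indexes(1638 * CLUSTER_SIZE))
```

With these constants each block holds 1638 entries and leaves 16 bytes unused, matching the reserved tail described in the spec text. The first cluster that no longer fits in L2 block 0 (cluster index 1638) lands at entry 0 of L2 block 1.
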