[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH V13 1/6] docs: document for add-cow file format
From: |
Dong Xu Wang |
Subject: |
Re: [Qemu-devel] [PATCH V13 1/6] docs: document for add-cow file format |
Date: |
Fri, 19 Oct 2012 10:14:36 +0800 |
User-agent: |
Mozilla/5.0 (Windows NT 5.1; rv:15.0) Gecko/20120907 Thunderbird/15.0.1 |
δΊ 10/19/2012 12:10 AM, Eric Blake ει:
> On 10/18/2012 03:51 AM, Dong Xu Wang wrote:
>> Document for add-cow format, the usage and spec of add-cow are introduced.
>>
>> Signed-off-by: Dong Xu Wang <address@hidden>
>> ---
>> docs/specs/add-cow.txt | 139
>> ++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 139 insertions(+), 0 deletions(-)
>> create mode 100644 docs/specs/add-cow.txt
>>
>> diff --git a/docs/specs/add-cow.txt b/docs/specs/add-cow.txt
>> new file mode 100644
>> index 0000000..dc1e107
>> --- /dev/null
>> +++ b/docs/specs/add-cow.txt
>> @@ -0,0 +1,139 @@
>> +== General ==
>> +
>> +The raw file format does not support backing files or copy on write feature.
>> +The add-cow image format makes it possible to use backing files with raw
>
> s/with raw/with a raw/
>
Okay.
>> +image by keeping a separate .add-cow metadata file. Once all sectors
>> +have been written into the raw image it is safe to discard the .add-cow
>> +and backing files, then we can use the raw image directly.
>> +
>> +An example usage of add-cow would look like::
>> +(ubuntu.img is a disk image which has been installed OS.)
>
> s/has been installed/has an installed/
Okay.
>
>> + 1) Create a raw image with the same size of ubuntu.img
>> + qemu-img create -f raw test.raw 8G
>> + 2) Create an add-cow image which will store dirty bitmap
>> + qemu-img create -f add-cow test.add-cow \
>> + -o backing_file=ubuntu.img,image_file=test.raw
>> + 3) Run qemu with add-cow image
>> + qemu -drive if=virtio,file=test.add-cow
>> +
>> +test.raw may be larger than ubuntu.img, in that case, the size of
>> test.add-cow
>> +will be calculated from the size of test.raw.
>> +
>> +=Specification=
>> +
>> +The file format looks like this:
>> +
>> + +---------------+-------------+-----------------+
>> + | Header | Reserved | COW bitmap |
>> + +---------------+-------------+-----------------+
>
> Since you call out what all 4096 bytes must be in the Header section
> (all bytes not occupied by a backing file name or format must be NUL),
> do your really need Reserved in this section, or can you just claim that
> the 4096-byte header is directly followed by the COW bitmap?
>
Okay, I think Header + COW would be enough.
>> +
>> +All numbers in add-cow are stored in Little Endian byte order.
>> +
>> +== Header ==
>> +
>> +The Header is included in the first bytes:
>> +(#define HEADER_SIZE (4096 * header_size))
>> + Byte 0 - 7: magic
>> + add-cow magic string ("ADD_COW\xff").
>> +
>> + 8 - 11: version
>> + Version number (only valid value is 1 now).
>> +
>> + 12 - 15: backing file name offset
>> + Offset in the add-cow file at which the backing file
>> + name is stored (NB: The string is not
>> nul-terminated).
>> + If backing file name does NOT exist, this field
>> will be
>> + 0. Must be between 80 and [HEADER_SIZE - 2](a file
>> name
>> + must be at least 1 byte).
>> +
>> + 16 - 19: backing file name size
>> + Length of the backing file name in bytes. It will
>> be 0
>> + if the backing file name offset is 0. If backing
>> file
>> + name offset is non-zero, then it must be non-zero.
>> Must
>> + be less than [HEADER_SIZE - 80] to fit in the
>> reserved
>> + part of the header.
>
> More specifically, it must be small enough so that offset+size <=
> HEADER_SIZE.
Okay.
>
>> +
>> + 20 - 23: image file name offset
>> + Offset in the add-cow file at which the image file
>> name
>> + is stored (NB: The string is not null terminated).
>> It
>> + must be between 80 and [HEADER_SIZE - 2].
>> +
>> + 24 - 27: image file name size
>> + Length of the image file name in bytes.
>> + Must be less than [HEADER_SIZE - 80] to fit in the
>> reserved
>> + part of the header.
>
> More specifically, it must be small enough so that offset+size <=
> HEADER_SIZE.
>
Okay.
>> +
>> + 28 - 31: cluster bits
>> + Number of bits that are used for addressing an
>> offset
>> + within a cluster (1 << cluster_bits is the cluster
>> size).
>> + Must not be less than 9 (i.e. 512 byte clusters).
>> +
>> + Note: qemu as of today has an implementation limit
>> of 2 MB
>> + as the maximum cluster size and won't be able to
>> open images
>> + with larger cluster sizes.
>> +
>> + 32 - 39: features
>> + Bitmask of features. An implementation can safely
>> ignore
>> + any unknown bits that are set.
>
> Really? That sounds more like optional features, if an implementation
> can ignore a set bit. You really want to require that implementations
> reject operations on a file with a feature bit set that they don't
> recognize.
>
Yep, I should reject the file if un-recognized bits are set.
>> +
>> + Bit 0: All allocated bit. If this bit is set
>> then
>> + backing file and COW bitmap will not be
>> used,
>> + and can read from or write to image
>> file directly.
>
> And this particular bit sounds like an optional feature - setting the
> bit is an optimization in speed, but leaving the bit clear or ignoring
> the bit when it is set does not change correctness.
>
Okay, but if this bit is categorized to optional feature, I will be no
features bit then...
>> +
>> + Bits 1-63: Reserved (set to 0)
>> +
>> + 40 - 47: optional features
>> + Not used now. Reserved for future use. It must be
>> set to 0.
>> + And must be ignored while reading.
>
> s/. And must be ignored/, and ignored/
>
Okay.
>> +
>> + 48 - 51: header size
>> + The header field is variable-sized. This field
>> indicates
>> + how many 4096 bytes will be used to store add-cow
>> header.
>> + In add-cow v1, it is fixed to 1, so the header size
>> will
>> + be 4096 * 1 = 4096 bytes.
>
> So is the value '1' or '4096' in this field? The wording isn't quite
> clear. But reading elsewhere, it looks like this should always be '1'
> in version one add-cow.
>
I mean '1' in this field, I will describe more clearly in future patch.
>> +
>> + 52 - 67: backing file format
>> + Format of backing file. It will be filled with 0 if
>> + backing file name offset is 0. If backing file name
>> + offset is non-empty, it must be non-empty.
>
> Are you going to enforce this? Normally, if backing file format is
> omitted, then qemu knows how to probe backing file format (with the
> caveat that it is a security risk if the probe returns non-raw but the
> backing file really was raw).
>
To avoid the security risk I add this field in header, can I do anything
to enforce this more?
>> It is coded
>> + in free-form ASCII, and is not NUL-terminated. Zero
>> + padded on the right.
>> +
>> + 68 - 83: image file format
>> + Format of image file. It must be non-empty. It is
>> coded
>> + in free-form ASCII, and is not NUL-terminated. Zero
>> + padded on the right.
>
> Why do we need this field? Isn't the image file ALWAYS raw?
>
In v1, it only supports raw as a image_file, but in future it might
support other formats which do not support COW, so I add image file
format here.
>> +
>> + 84 - [HEADER_SIZE - 1]:
>
> Elsewhere in your spec, you use 80 rather than 84 for the starting point
> of valid offsets. Which is it?
Sorry, I add "cluster bits", which is 4 bytes, and I forgot changing 80
to 84. Will correct.
>
>> + It is used to make sure COW bitmap field starts at
>> the
>> + HEADER_SIZE byte, backing file name and image file
>> name
>> + will be stored here. The bytes that is not pointing
>> to
>> + backing file and image file names must be set to 0.
>
> Not just file names, but also backing file format.
>
Okay.
>> +
>> +== COW bitmap ==
>> +
>> +The "COW bitmap" field starts at offset HEADER_SIZE, stores a bitmap
>> related to
>
> Shouldn't that be HEADER_SIZE * number of header pages, since you
> dedicated a field in the header for that purpose?
>
There is one line in this doc:
(#define HEADER_SIZE (4096 * header_size))
HEADER_SIZE and header_size are not good names, will correct.
>> +backing file and image file. The bitmap will track whether the sector in
>> +backing file is dirty or not.
>> +
>> +Each bit in the bitmap tracks one cluster's status. For example, if cluster
>> +bit is 16, then each bit tracks one cluster, (1 >> 16) = 65536 bytes. The
>> size
>
> s/>>/<</
>
Okay.
>> +of bitmap is calculated according to virtual size of image file, and it must
>> +be multiple of 65536, the bits not used will be set to 0. Within each byte,
>
> What must be a multiple of 65536, the image file, or the size of the
> bitmap in the add-cow file? I think what you want is:
>
I mean bitmap must be multiple for 65536. Will describe more clearly.
> The image size is rounded up to cluster size (where any bytes in the
> last cluster that do not fit in the image are ignored), then if the
> number of clusters is not a multiple of 8, then remaining bits in the
> bitmap will be set to 0.
>
> Or do you really want to require that the bitmap is a multiple of 64k
> bytes (at 8 bits per byte, that means the bitmap covers a multiple of
> 512k clusters, and at 512 bytes as the minimum cluster size, that the
> add-cow file format manages a minimum of 256M)? That is, are you
> requiring that the bitmap end on an aligned boundary, to make the bitmap
> easier to use without having to special case a short-read on the last
> page of the bitmap?
>
I think my description was wrong. add-cow is using block-cache.c, which
is from qcow2-cache.c. I was not to change qcow2-cache.c to a great
extent, so I use it directly, and it uses cluster_size as writing unit:
ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
s->cluster_size);
So the size of bitmap would be multiple of cluster_size of add-cow, do
you think if it is acceptable?
>> +the least significant bit covers the first cluster. Bit orders in one
>> +byte look like:
>> + +----+----+----+----+----+----+----+----+
>> + | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 |
>> + +----+----+----+----+----+----+----+----+
>> +
>> +If the bit is 0, indicates the sector has not been allocated in image file,
>> data
>> +should be loaded from backing file while reading; if the bit is 1,
>> indicates the
>
> s/indicates/it indicates/ (twice)
>
Okay.
> If there is no backing file, or if the image file is larger than the
> backing file and the offset is beyond the end of the backing file, then
> the data should be read as all zero bytes instead.
>
Okay.
>> +related sector has been dirty, should be loaded from image file while
>> reading.
>> +Writing to a sector causes the corresponding bit to be set to 1.
>> +
>> +If raw image is not an even multiple of cluster bytes, bits that correspond
>> to
>> +bytes beyond the raw file size in add-cow must be written as 0 and must be
>> +ignored when reading.
>> +
>> +Image file name and backing file name must NOT be the same, we prevent this
>> +while creating add-cow files.
>
> You prevent it when creating add-cow files via qemu-img, but that
> doesn't stop malicious users from creating a file with those properties
> and then trying to get you to parse it as add-cow. I think this needs
> to instead be a requirement on the consumer of a potentially bad file,
> and not a requirement on the producer to avoid bad files, since you
> can't control all producers, as in:
> If image file name and backing file resolve to the same file, the
> add-cow image must be treated as invalid.
Okay, I will check it while opening add-cow file.
>
Thank you Eric.
- [Qemu-devel] [PATCH V13 0/6] add-cow file format, Dong Xu Wang, 2012/10/18
- [Qemu-devel] [PATCH V13 2/6] make path_has_protocol non static, Dong Xu Wang, 2012/10/18
- [Qemu-devel] [PATCH V13 1/6] docs: document for add-cow file format, Dong Xu Wang, 2012/10/18
- [Qemu-devel] [PATCH V13 3/6] qed_read_string to bdrv_read_string, Dong Xu Wang, 2012/10/18
- [Qemu-devel] [PATCH V13 6/6] qemu-iotests: add add-cow iotests support., Dong Xu Wang, 2012/10/18
- [Qemu-devel] [PATCH V13 4/6] rename qcow2-cache.c to block-cache.c, Dong Xu Wang, 2012/10/18
- [Qemu-devel] [PATCH V13 5/6] add-cow file format core code., Dong Xu Wang, 2012/10/18