qemu-devel

Re: [Qemu-devel] [Bug 1399191] [NEW] Large VHDX image size


From: Jeff Cody
Subject: Re: [Qemu-devel] [Bug 1399191] [NEW] Large VHDX image size
Date: Thu, 22 Jan 2015 17:39:07 -0000

On Thu, Jan 22, 2015 at 04:24:45PM +0000, Lokesha, Amulya wrote:
> 
> -----Original Message-----
> From: Jeff Cody [mailto:address@hidden 
> Sent: Thursday, January 22, 2015 9:33 PM
> To: Lokesha, Amulya
> Cc: Bug 1399191; Geoffroy, Daniel
> Subject: Re: [Qemu-devel] [Bug 1399191] [NEW] Large VHDX image size
> 
> On Thu, Jan 22, 2015 at 05:20:20AM +0000, Lokesha, Amulya wrote:
> >     
> >
> 
> [...]
>  
>>>     
>>>>   Yes, I have sent a patch that I believe fixes the issue (I cc'ed you on
>>>>   the patch).  If you wouldn't mind testing, and verifying that it fixes
>>>>   your particular issue, that would be great.  I tested on Windows Server
>>>>   2012 w/Hyper-V, and I was able to verify the original problem and the 
>>>> fix.
>>>>    
>>>>   On what the issue was:
>>>>    
>>>>   The v1.0.0 spec for VHDX denotes that the FileOffsetMB portion of the
>>>>   block state field is 'reserved' for blocks in the state
>>>>   PAYLOAD_BLOCK_ZERO.  Before, we went ahead and wrote the file offset
>>>>   value for that block into the field, and simply ignored it during reads.
>>>>    
>>>>   If we force the FileOffsetMB field to be zero, then Hyper-V is able to
>>>>   read the images.  Furthermore, inspecting images converted by Hyper-V
>>>>   shows that Hyper-V writes '0' for the FileOffsetMB field in BAT entries
>>>>   that are in the PAYLOAD_BLOCK_ZERO state.
>>>>    
>>>>   The patch I sent mimics that behavior, and forces any BAT writes of
>>>>   PAYLOAD_BLOCK_ZERO state to have a FileOffsetMB value of 0.
>>>     
>>>     
>>>    Hi Jeff,
>>>     
>>>    Thanks a lot for the fix. We tested VHDX creation with the new patch
>>>    and were able to deploy the resulting image successfully to our
>>>    Windows Hyper-V Server 2012.
>>>
>>
>>You're welcome!
>
>>>    However, we have one more issue and were hoping we could get some help.
>>>    We have a VMDK image (13 MB file size) with a 500 GB virtual disk size.
>>>    After converting the VMDK image to vhdx format, the image size is 126 GB.
>>>     
>>>    # qemu-img info Test.vmdk
>>>    image: Test.vmdk
>>>    file format: vmdk
>>>    virtual size: 500G (536870912000 bytes)
>>>    disk size: 13M
>>>    cluster_size: 65536
>>>    Format specific information:
>>>        cid: 4124953209
>>>        parent cid: 4294967295
>>>        create type: streamOptimized
>>>        extents:
>>>            [0]:
>>>                compressed: true
>>>                virtual size: 536870912000
>>>                filename: Test.vmdk
>>>                cluster size: 65536
>>>                format:
>>>     
>>>    The qemu conversion to vhdx was done as below:
>>>    # qemu-img convert -p -o subformat=dynamic -f vmdk -O vhdx Test.vmdk
>>>    Test.vhdx
>>>        (100.00/100%)
>>>     
>>>    # ls -ltrh
>>>    total 69M
>>>    -rw-r--r-- 1 root root  13M Jan 21 21:42 Test.vmdk
>>>    -rw-r--r-- 1 root root 126G Jan 22 05:03 Test.vhdx
>>>     
>>>    # qemu-img info Test.vhdx
>>>    image: Test.vhdx
>>>    file format: vhdx
>>>    virtual size: 500G (536870912000 bytes)
>>>    disk size: 56M
>>>    cluster_size: 33554432
>>>     
>>>     
>>>    When we tried to copy this to the SCVMM manager server (Windows OS), it
>>>    failed due to a disk space limitation. Windows treats it as a 126 GB
>>>    file. Can anything be done to fix this issue?
>>>     
>>>    Thanks,
>>>    Amulya
>>
>> It sounds like a 126G sparse image was created (reported size 126G,
>> on-disk size ~56M), and QEMU still recognizes it as a 500G virtual
>> disk (the reported file size sounds large, but so far nothing
>> indicates the vhdx image is invalid).
>>
>>
>> I tried to reproduce it here, and could not:
>>
>> # ./qemu-img create -f vmdk test-500G.vmdk 500G
>>
>> # ./qemu-img convert -p -o subformat=dynamic -f vmdk -O vhdx
>> test-500G.vmdk test-500G.vhdx
>>
>> # ls -lhs test-500G* 
>> 1.3M -rw-r--r--. 1 jcody jcody 8.0M Jan 22 10:25 test-500G.vhdx
>> 136K -rw-r--r--. 1 jcody jcody  63M Jan 22 10:25 test-500G.vmdk
>>
>> I was also able to inspect test.vhdx on Windows Server 2012, using
>> Hyper-V Manager, and it reported that it was a vhdx dynamic disk,
>> with a maximum size of 500G.
>>
>> Can you place your source vmdk image someplace for me to download
>> and test with, if it does not contain sensitive data?  You can send
>> me the link offlist, if you like.
>>
>
> Hi Jeff,
>
> The disk limitation I mentioned above was just about the free space
> available on the Windows Hyper-V server.  I am sending the sample
> Test.vmdk that we used for converting to vhdx, attached in zip format.
> Please let me know if you are able to download this zip file.
>
> Thanks, Amulya
>
>

Hi Amulya,

(I added the bug tracker back on the cc list)

I was able to download the VMDK image and test with it.  What is
going on here comes from the nature of the VHDX image format,
specifically its relatively large block size (1 MB minimum, 256 MB
maximum).

Brief background:

In the VHDX image format, data is broken up into blocks.  Each block
is in one of several states, such as 'ZERO' or 'PRESENT'.

When an entire block is zeroes, the block stays in the 'ZERO' state,
which is now the default state for blocks in new VHDX dynamic image
files.  No data is written to disk for these zeroes, so the file size
stays small.

However, if we write even 1 byte to a block that is in the ZERO state,
it moves to the 'PRESENT' state, and that block's data must be written
to disk (complete with the surrounding zeroes in that block).

Block sizes for VHDX images range from 1 MB to 256 MB.  The larger the
block size, the more disk space a partially filled block can consume,
and so the larger the image may be on disk.

The qemu vhdx driver does take advantage of underlying filesystem
sparseness, however: it just extends the file and writes only the
requested sectors.
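
To tie this back to the BAT discussion earlier in the thread: each
block has a BAT (Block Allocation Table) entry holding the state and
the block's file offset.  Roughly, in C terms (a sketch of the layout
per my reading of the v1.0 spec, not the literal QEMU code):

/* A BAT entry is a 64-bit little-endian word:
 *   bits  0-2  : block state (e.g. PAYLOAD_BLOCK_ZERO,
 *                PAYLOAD_BLOCK_FULLY_PRESENT)
 *   bits  3-19 : reserved
 *   bits 20-63 : FileOffsetMB -- the block's offset in the file,
 *                in 1 MB units
 */
#include <stdint.h>

#define BAT_STATE_MASK        0x7ULL
#define BAT_FILE_OFFSET_MASK  0xFFFFFFFFFFF00000ULL  /* bits 20-63 */

#define PAYLOAD_BLOCK_ZERO           2
#define PAYLOAD_BLOCK_FULLY_PRESENT  6

static uint64_t make_bat_entry(uint64_t file_offset, uint64_t state)
{
    uint64_t entry = state & BAT_STATE_MASK;

    /* Blocks are MB-aligned in the file, so the byte offset's low 20
     * bits are zero and it can be OR'd straight into the entry.  The
     * spec marks FileOffsetMB as 'reserved' for ZERO-state blocks,
     * and Hyper-V fails to read images where it is nonzero there, so
     * only store an offset for blocks that actually have data on
     * disk -- that is what the posted patch enforces. */
    if (state != PAYLOAD_BLOCK_ZERO) {
        entry |= file_offset & BAT_FILE_OFFSET_MASK;
    }
    return entry;
}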


What is going on in your VMDK image:

Your VMDK image has a lot of (repetitive) data spread far enough apart
that it touches a lot of different blocks [1].  If you run hexdump on
the resulting vhdx image, this becomes apparent (hexdump elides runs
of duplicate data, so the output is manageable).

With the default block size, the resulting image then becomes ~126G,
but as it is quite sparse, the on-disk usage is only 41M.  If you play
with the block size parameter during conversion, you can get the file
size down to around 4G (with a block_size of 1 MB).
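
For example, something along these lines (assuming a qemu-img new
enough to support the vhdx 'block_size' creation option):

# qemu-img convert -p -o subformat=dynamic,block_size=1048576 \
      -f vmdk -O vhdx Test.vmdk Test-1M.vhdx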

I am not sure of the best way to copy a sparse file over to Windows
Server 2012, but it would be good to preserve the sparseness if
possible.
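
(On the Linux side, something like 'rsync --sparse' or GNU 'tar -S'
can at least preserve the holes while the file is in transit; whether
the sparseness survives landing on the Windows filesystem is another
question.)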

I think we can close the BZ, as it seems the posted patch solves
the original issue.


[1] Example of data from the converted VHDX image:

7f000000 ffff ffff ffff ffff ffff ffff ffff ffff
*
7f000040 0003 0000 0000 0000 0000 0000 0000 0000
7f000050 0000 0000 0000 0000 0000 0000 0000 0000
*
7f001400 ffff ffff ffff ffff ffff ffff ffff ffff
*
7f002000 0000 0000 0000 0000 0000 0000 0000 0000
*
7f100000 ffff ffff ffff ffff ffff ffff ffff ffff
*
7f100040 0003 0000 0000 0000 0000 0000 0000 0000
7f100050 0000 0000 0000 0000 0000 0000 0000 0000
*
7f101400 ffff ffff ffff ffff ffff ffff ffff ffff
*
7f102000 0000 0000 0000 0000 0000 0000 0000 0000


The default block size for a 500G vhdx image created by QEMU is 32M.
In the output above, you can see that most of the data is 0, yet it
still falls inside an allocated block-sized chunk, so the whole block
counts toward the image size.
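
To put rough numbers on it: a 126G reported size at 32M per block
means about 126G / 32M ≈ 4000 allocated blocks.  The same ~4000
touched blocks at 1M apiece comes to roughly 4G, which matches what
the block_size=1M conversion produces.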

(I am assuming that the data was converted correctly, of course).

When I tested with qcow2 as the output format, using a 1 MB cluster
size, I got roughly the same results as with a 1M block size in vhdx
(Test.qcow2 and Test.vhdx were then both ~4G).
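
For reference, a qcow2 conversion with a 1 MB cluster size looks
something like:

# qemu-img convert -p -f vmdk -O qcow2 -o cluster_size=1048576 \
      Test.vmdk Test.qcow2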

-Jeff


-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1399191

Title:
  Large VHDX image size

Status in QEMU:
  New

Bug description:
  We are trying to convert a VMDK image to a VHDX image for deployment to a
  Hyper-V server (SCVMM 2012 SP1) using qemu-img.
  We tried converting the image using both the 'fixed' and 'dynamic'
  subformats, and found that both disks occupy the same 50 GB. When the same
  is done with a VHD image, the dynamic disk is much smaller (5 GB) than the
  fixed disk (50 GB).

  Why does VHDX generate large images for both subformats?

  The following commands were used to convert the vmdk image to VHDX
  format

  1. qemu-img convert -p -o subformat=fixed  -f vmdk -O vhdx Test.vmdk
  Test-fixed.vhdx

  qemu-img info Test-fixed.vhdx
  image: Test-fixed.vhdx
  file format: vhdx
  virtual size: 50G (53687091200 bytes)
  disk size: 50G
  cluster_size: 16777216


  
  2. qemu-img convert -p -o subformat=dynamic  -f vmdk -O vhdx Test.vmdk 
Test-dynamic.vhdx

  qemu-img info Test-dynamic.vhdx
  image: Test-dynamic.vhdx
  file format: vhdx
  virtual size: 50G (53687091200 bytes)
  disk size: 50G
  cluster_size: 16777216

  
  We tried this with the following version of qemu
  1. qemu-2.0.0
  2. qemu-2.1.2
  3. qemu-2.2.0-rc4

  
  Please let us know how to create compact VHDX images using qemu-img.
  Thank you



