From: Kevin Wolf
Subject: Re: [Qemu-devel] [BUG] qemu-1.1.2 [FIXED-BY] qcow2: Fix avail_sectors in cluster allocation code
Date: Wed, 12 Dec 2012 14:41:49 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

Hi Philipp,

On 12.12.2012 14:25, Philipp Hahn wrote:
> Hello Kevin, hello Michael, hello *,
> 
> we noticed a data corruption bug in qemu-1.1.2, which will be shipped by
> Debian and our own Debian-based distribution.
> The corruption mostly manifests while installing large Debian package files
> and seems to be related to memory pressure: As long as the file is still in
> the page cache, everything looks fine, but when the file is re-read from the
> virtual hard disk using a qcow2 file backed by another qcow2 file, the file
> is corrupted: dpkg complains that the .tar.gz file inside the Debian archive
> file is corrupted and the md5sum no longer matches.
> 
> I tracked this down using "git bisect" to your patch attached below, which 
> fixed this bug, so everything is fine with qemu-kvm-1.2.0.
> From my reading this seems to explain our problems, since during my own 
> testing during development I never used backing chains and the problem only 
> showed up when my colleagues started using qemu-kvm-1.1.2 with their VMs using
> backing chains.
> 
> @Kevin: Do you think that's a valid explanation and your patch should fix
> that problem?
> I'd like to get your expertise before filing a bug with Debian and asking 
> Michael to include that patch with his next stable update for 1.1.

As you can see in the commit message of that patch, I was convinced that
no bug existed in practice and that this was only dangerous with respect
to future changes. Therefore my first question is whether you're using an
unmodified upstream qemu or whether some backported patches are applied
to it. If it's indeed unmodified, we should probably review the code once
again to understand why it makes a difference.

In any case, this is the cluster allocation code. It's probably not
related to rereading things from disk, but rather to the writeout of the
page cache.
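
To see when the contents change, one could checksum a file inside the
guest while it is still cached and again after the page cache has been
written out and dropped. The following is only a rough sketch under
assumptions: the helper names are made up for illustration, it expects a
Linux guest, and dropping caches needs root. It cannot tell by itself
whether the write or the read path is at fault; it only shows whether the
data that comes back from the virtual disk differs from what was cached.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_page_cache():
    """Flush dirty pages and ask Linux to drop clean caches.

    Assumption: Linux guest, run as root. Returns False if that is not
    possible, so the caller knows the second read may still hit the cache.
    """
    try:
        os.sync()
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")
        return True
    except (OSError, AttributeError):
        return False


def check_after_cache_drop(path, drop=drop_page_cache):
    """Checksum a file, drop the page cache, and checksum it again.

    Returns (md5_while_cached, md5_after_drop, caches_were_dropped).
    """
    cached = md5_of(path)
    dropped = drop()
    reread = md5_of(path)
    return cached, reread, dropped
```

If the two digests differ and the third value is True, the file changed
between the cached copy and what the virtual disk returned afterwards.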

Kevin


