Re: [Qemu-devel] broken incoming migration
From: Paolo Bonzini
Subject: Re: [Qemu-devel] broken incoming migration
Date: Thu, 30 May 2013 15:00:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

On 30/05/2013 11:08, Peter Lieven wrote:
> On 30.05.2013 10:18, Alexey Kardashevskiy wrote:
>> On 05/30/2013 05:49 PM, Paolo Bonzini wrote:
>>> On 30/05/2013 09:44, Alexey Kardashevskiy wrote:
>>>> Hi!
>>>>
>>>> I found the migration broken on pseries platform, specifically, this patch
>>>> broke it:
>>>>
>>>> f1c72795af573b24a7da5eb52375c9aba8a37972
>>>> migration: do not sent zero pages in bulk stage
>>>>
>>>> The idea is not to send zero pages to the destination guest which is
>>>> expected to have 100% empty RAM.
>>>>
>>>> However, on the pseries platform the guest always has some stuff in the RAM as a
>>>> part of initialization (device tree, system firmware and rtas (?)) so it is
>>>> not completely empty. As the source guest cannot detect this, it skips some
>>>> pages during migration and we get a broken destination guest. Bug.
>>>>
>>>> While the idea is ok in general, I do not see any easy way to fix it as
>>>> neither QEMUMachine::init nor QEMUMachine::reset callbacks has information
>>>> about whether we are about to receive a migration or not (-incoming
>>>> parameter) and we cannot move device-tree and system firmware
>>>> initialization anywhere else.
>>>>
>>>> ram_bulk_stage is static and cannot be disabled from the platform
>>>> initialization code.
>>>>
>>>> So what would the community suggest?
>>> Revert the patch. :)
>> I'll wait for 24 hours (forgot to cc: the author) and then post a revert
>> patch :)
>>
> Does this problem only occur on pseries emulation?
Probably not. On a PC, it would occur if you had 4K of zeros in the
source BIOS but not in the destination BIOS. The zero page is never sent,
so the destination keeps its own nonzero data there, and when you reboot
the BIOS image is wrong.
> Not sending zero pages is not only a performance benefit, it also makes
> overcommitted memory usable. MADV_DONTNEED seems to kick in asynchronously,
> and the memory is not available immediately.
You could also scan the page for nonzero values before writing it.
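
As a rough illustration of that suggestion (a minimal sketch, not the actual
QEMU code): assume the source keeps sending zero pages, and the destination
only writes zeros when the target page holds nonzero data. The names
page_is_zero, load_zero_page and SKETCH_PAGE_SIZE are hypothetical and chosen
for this example only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 4K pages, matching the "4K of zeros" case discussed above. */
#define SKETCH_PAGE_SIZE 4096

/* Hypothetical helper: returns true if every byte in the page is zero. */
static bool page_is_zero(const unsigned char *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical destination-side handler for an incoming zero page.
 * The page is scanned first and written only if it contains nonzero
 * data (for example firmware or a device tree placed there by machine
 * init). Returns true if a write was needed.
 */
static bool load_zero_page(unsigned char *host_page, size_t len)
{
    if (page_is_zero(host_page, len)) {
        return false;   /* already zero: leave the page unwritten */
    }
    memset(host_page, 0, len);
    return true;
}
```

The point of scanning before writing is that a page which is already zero is
only read, never written, so an untouched page does not have to become dirty
just to receive zeros, while pages pre-filled by machine initialization are
still overwritten correctly. Whether a read-only scan avoids allocating
memory depends on the host kernel and how the guest RAM is mapped.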
Paolo