Re: [Qemu-ppc] Problem with "savevm" on ppc64
From: Thomas Huth
Subject: Re: [Qemu-ppc] Problem with "savevm" on ppc64
Date: Fri, 21 Oct 2016 08:45:21 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 21.10.2016 07:26, David Gibson wrote:
> On Thu, Oct 20, 2016 at 03:17:12PM +0200, Thomas Huth wrote:
>> Hi all,
>>
>> I'm currently facing a strange problem with the "savevm" HMP command on
>> ppc64 with TCG and the pseries machine. Steps for reproduction:
>>
>> 1) Create a disk image:
>> qemu-img create -f qcow2 /tmp/test.qcow2 1M
>>
>> 2) Start QEMU (in TCG mode):
>> qemu-system-ppc64 -nographic -vga none -m 256 -hda /tmp/test.qcow2
>>
>> 3) Hit "CTRL-a c" to enter the HMP monitor
>>
>> 4) Type "savevm test1" to save a snapshot
>>
>> The savevm command then hangs forever and the test.qcow2 image keeps
>> growing and growing.
>>
>> It seems like qemu_savevm_state_iterate() does not make any more
>> progress because ram_save_iterate() keeps returning 0 ... but why can
>> that happen?
>
> Hmm. You don't mention installing anything on the disk image, so I'm
> assuming the VM is just sitting in SLOF, unable to boot.
Right. This is basically what is currently happening with the failing
test tests/qemu-iotests/007 on ppc64.
[...]
> So, looking at this I think it's unsafe. htab_save_first_pass() never
> examines dirty bits, so we could get:
> htab_save_first_pass() called once, saves part of HPT
> guest dirties an HPTE in the already saved region
> enter migration completion stage
> htab_save_first_pass() saves the remainder of the HPT, returns 1
>
> That would trigger the code to think the HPT migration is complete,
> without ever saving the HPTE that got dirtied part way through.
There's still htab_save_complete(), which always seems to be called at
the end, and this function calls htab_save_later_pass() again to save
the remaining entries. But I am really no expert in this part of the
code, so I may well have missed something here.
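
To make the scenario concrete, here is a toy model in C. It is not the
spapr code: every toy_* name is hypothetical, and it assumes that guest
stores keep setting a per-entry dirty flag while the first pass is
running. Under that assumption, the entry dirtied behind the first-pass
cursor is stale after the first pass alone, and is picked up only if a
dirty-driven later pass runs at completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HPT_SIZE 8              /* hypothetical table size */

/* Toy stand-in for the hash page table and its migration state. */
struct toy_hpt {
    int    value[TOY_HPT_SIZE];     /* what the guest currently sees */
    bool   dirty[TOY_HPT_SIZE];     /* set on every guest store */
    int    saved[TOY_HPT_SIZE];     /* what has been written to the stream */
    size_t next;                    /* cursor of the first pass */
};

/* Guest modifies one entry; assumed to always mark it dirty. */
static void toy_guest_store(struct toy_hpt *t, size_t i, int v)
{
    assert(i < TOY_HPT_SIZE);
    t->value[i] = v;
    t->dirty[i] = true;
}

/*
 * First pass: save up to n entries starting at the cursor, without
 * consulting the dirty flags. Returns 1 once the cursor reaches the end.
 */
static int toy_first_pass(struct toy_hpt *t, size_t n)
{
    for (size_t k = 0; k < n && t->next < TOY_HPT_SIZE; k++) {
        t->saved[t->next] = t->value[t->next];
        t->dirty[t->next] = false;
        t->next++;
    }
    return t->next == TOY_HPT_SIZE;
}

/* Later pass: re-save every entry still flagged dirty; returns the count. */
static int toy_later_pass(struct toy_hpt *t)
{
    int n = 0;
    for (size_t i = 0; i < TOY_HPT_SIZE; i++) {
        if (t->dirty[i]) {
            t->saved[i] = t->value[i];
            t->dirty[i] = false;
            n++;
        }
    }
    return n;
}

/*
 * Replays the sequence from the quoted mail. Returns true if the saved
 * copy of the dirtied entry matches the guest's copy at the end.
 */
static bool toy_scenario(bool run_later_pass)
{
    struct toy_hpt t = {0};

    toy_first_pass(&t, 4);              /* saves entries 0..3 */
    toy_guest_store(&t, 1, 42);         /* dirties an already saved entry */
    toy_first_pass(&t, TOY_HPT_SIZE);   /* saves the rest, returns 1 */
    if (run_later_pass) {
        toy_later_pass(&t);             /* completion stage */
    }
    return t.saved[1] == t.value[1];
}
```

In this model toy_scenario(false) leaves entry 1 stale and
toy_scenario(true) does not. Whether the real code behaves like the
second case depends on the dirty tracking staying active during the
first pass, which the model simply assumes.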
> But.. then I looked further and got *really* confused.
>
> The comment above qemu_savevm_state_iterate() and the logic in
> qemu_savevm_state() say that the iterate function returns >0 to
> indicate that it is done and we can move onto the completion phase.
>
> But both ram_save_iterate() and block_save_iterate() seem to have that
> inverted: they return >0 if they actually saved something.
Yes, that confused me completely, too! Something is really fishy in the
logic here. I hope that one of the migration experts can explain to us
how it is really meant to work...
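
A small C sketch of the two return conventions side by side may help
the discussion. This is not the QEMU code: toy_handler, toy_iterate_saved,
toy_iterate_done and toy_drive are made-up names, and the driver loop is
only a guess at the shape described by the comment, with a round limit
standing in for "hangs forever".

```c
#include <assert.h>

/* Hypothetical save handler with some number of units left to send. */
struct toy_handler {
    int pending;
};

/* Convention A: return >0 if this call actually saved something. */
static int toy_iterate_saved(struct toy_handler *h)
{
    if (h->pending > 0) {
        h->pending--;
        return 1;
    }
    return 0;
}

/* Convention B: return >0 once there is nothing left to iterate over. */
static int toy_iterate_done(struct toy_handler *h)
{
    if (h->pending > 0) {
        h->pending--;
    }
    return h->pending == 0;
}

/*
 * Driver that reads ">0" as "done, move on to completion".
 * Returns the round in which the handler reported done, or -1 if it
 * never did within max_rounds.
 */
static int toy_drive(int (*iterate)(struct toy_handler *),
                     struct toy_handler *h, int max_rounds)
{
    assert(max_rounds > 0);
    for (int round = 1; round <= max_rounds; round++) {
        if (iterate(h) > 0) {
            return round;
        }
    }
    return -1;
}
```

With this driver, a convention-A handler that has nothing left to send
returns 0 on every call and the loop never finishes, while one with data
pending is declared done after its first unit. A convention-B handler
runs until pending reaches zero. That would be consistent with the hang
when ram_save_iterate() keeps returning 0, but it is only a model of the
mismatch, not a diagnosis.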
Thomas