From: Thomas Huth
Subject: Re: [Qemu-devel] [PATCH for-2.8] migration: Fix return code of ram_save_iterate()
Date: Wed, 9 Nov 2016 08:46:34 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
On 09.11.2016 08:18, Amit Shah wrote:
> On (Fri) 04 Nov 2016 [14:10:17], Thomas Huth wrote:
>> qemu_savevm_state_iterate() expects the iterators to return 1
>> when they are done, and 0 if there is still something left to do.
>> However, ram_save_iterate() does not obey this rule and returns
>> the number of saved pages instead. This causes a fatal hang with
>> ppc64 guests when you run QEMU like this (also works with TCG):
>
> "works with" -- does that mean reproduces with?
Yes, that's what I meant: you can reproduce it with TCG, too (e.g. when
running on an x86 system); there's no need for a real POWER machine with
KVM here.
>> qemu-img create -f qcow2 /tmp/test.qcow2 1M
>> qemu-system-ppc64 -nographic -nodefaults -m 256 \
>> -hda /tmp/test.qcow2 -serial mon:stdio
>>
>> ... then switch to the monitor by pressing CTRL-a c and try to
>> save a snapshot with "savevm test1" for example.
>>
>> After the first iteration, ram_save_iterate() always returns 0 here,
>> so that qemu_savevm_state_iterate() hangs in an endless loop and you
>> can only "kill -9" the QEMU process.
>> Fix it by using proper return values in ram_save_iterate().
>>
>> Signed-off-by: Thomas Huth <address@hidden>
>> ---
>> migration/ram.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index fb9252d..a1c8089 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1987,7 +1987,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>> int ret;
>> int i;
>> int64_t t0;
>> - int pages_sent = 0;
>> + int done = 0;
>>
>> rcu_read_lock();
>> if (ram_list.version != last_version) {
>> @@ -2007,9 +2007,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>> pages = ram_find_and_save_block(f, false, &bytes_transferred);
>> /* no more pages to sent */
>> if (pages == 0) {
>> + done = 1;
>> break;
>> }
>> - pages_sent += pages;
>> acct_info.iterations++;
>>
>> /* we want to check in the 1st loop, just in case it was the 1st time
>> @@ -2044,7 +2044,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>> return ret;
>> }
>>
>> - return pages_sent;
>> + return done;
>> }
>
> I agree with David, we can just remove the return value. The first
> patch of the series can do that; and this one could become the 2nd
> patch. Should be OK for the soft freeze.
Sorry, I still don't quite get it - if I changed the return type of
ram_save_iterate() and the other iterate functions to "void", how would
qemu_savevm_state_iterate() know whether all iterators are
done or not? Other iterators also use negative return values to
signal errors - should that then be handled via an "Error **" parameter
instead? My gut feeling still says that such a bigger rework (we'd
have to touch all iterators for this!) should not be done right in
the middle of the freeze period...
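
To illustrate the return-value contract in question, here is a minimal
stand-alone sketch. It is not QEMU code: FakeRam, both iterate functions,
drive() and the per-call budget of 4 pages are hypothetical names and
numbers, and the caller is only modelled on the convention described in
the patch (negative = error, 0 = call again, 1 = done). The round limit
stands in for the endless loop, so the sketch terminates where the real
caller would hang.

```c
#include <assert.h>

/* Hypothetical stand-in for the iterator state: pages still to be saved. */
typedef struct {
    int pages_left;
} FakeRam;

/* Models the old behaviour: return the number of pages saved in this call. */
int iterate_returns_pages(FakeRam *ram)
{
    int pages_sent = 0;

    while (ram->pages_left > 0 && pages_sent < 4) { /* 4 = made-up budget */
        ram->pages_left--;
        pages_sent++;
    }
    return pages_sent; /* 0 once nothing is left, which reads as "not done" */
}

/* Models the patched behaviour: return 1 when done, 0 if work remains. */
int iterate_returns_done(FakeRam *ram)
{
    int done = 0;

    for (int i = 0; i < 4; i++) {
        if (ram->pages_left == 0) {
            done = 1;
            break;
        }
        ram->pages_left--;
    }
    return done;
}

/*
 * Models a caller that expects: <0 error, 0 call again, >0 finished.
 * Returns the round in which the iterator reported completion, a negative
 * error code passed through from the iterator, or 0 if max_rounds was
 * exhausted (the case where an unbounded loop would spin forever).
 */
int drive(int (*iterate)(FakeRam *), FakeRam *ram, int max_rounds)
{
    assert(max_rounds > 0);

    for (int round = 1; round <= max_rounds; round++) {
        int ret = iterate(ram);

        if (ret < 0) {
            return ret;
        }
        if (ret > 0) {
            return round;
        }
    }
    return 0;
}
```

With no pages left, drive(iterate_returns_pages, ...) never sees a
non-zero value and runs out of rounds, while drive(iterate_returns_done, ...)
finishes in the first round. With pages left, the page-count variant is
misread as "done" on its first call. An int return also keeps room for the
negative error codes that a "void" signature would have to move elsewhere.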
Thomas