[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH COLO-Frame v12 10/38] COLO: Implement colo check
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] [PATCH COLO-Frame v12 10/38] COLO: Implement colo checkpoint protocol |
Date: |
Mon, 11 Jan 2016 13:47:05 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) |
Hailiang Zhang <address@hidden> writes:
> Hi Markus,
>
> On 2015/12/19 16:54, Markus Armbruster wrote:
>> Jumping in at v12 for a bit of QAPI review (and whatever else catched my
>> eye nearby), please pardon my ignorance of COLO in general, and previous
>> review of this series in particular.
>>
>
> Thanks all the same :)
[...]
>>> diff --git a/migration/colo.c b/migration/colo.c
>>> index 0ab9618..0ce2a6e 100644
>>> --- a/migration/colo.c
>>> +++ b/migration/colo.c
>>> @@ -10,10 +10,12 @@
>>> * later. See the COPYING file in the top-level directory.
>>> */
>>>
>>> +#include <unistd.h>
>>> #include "sysemu/sysemu.h"
>>> #include "migration/colo.h"
>>> #include "trace.h"
>>> #include "qemu/error-report.h"
>>> +#include "qemu/sockets.h"
>>>
>>> bool colo_supported(void)
>>> {
>>> @@ -34,6 +36,100 @@ bool migration_incoming_in_colo_state(void)
>>> return mis && (mis->state == MIGRATION_STATUS_COLO);
>>> }
>>>
>>> +static int colo_put_cmd(QEMUFile *f, uint32_t cmd)
>>> +{
>>> + int ret;
>>> +
>>> + if (cmd >= COLO_COMMAND_MAX) {
>>
>> Needs a trivial rebase due to commit 7fb1cf1.
>>
>
>>> + error_report("%s: Invalid cmd", __func__);
>>> + return -EINVAL;
>>
>> Can this run in a context with different error handling needs?
>>
>> Or asked differently: who may ultimately handle this error? Whoever
>> that may be, how does it need to report errors?
>>
>> Peeking ahead: the immediate callers don't handle this error, they just
>> pass it on their callers.
>>
>> I'm asking because I'm trying to understand whether error_report() is
>> appropriate here, or whether you need to use error_setg(), and leave the
>> actual reporting to the spot that ultimately handles this error.
>>
>
> Hmm, i know what you mean, we handled them all together after exit
> from the colo process loop,
> Use error_setg() seems to be a good idea, with this modification, we
> can also drop the return
> value. I will fix it in next version.
>
>
>>> + }
>>> + qemu_put_be32(f, cmd);
>>> + qemu_fflush(f);
>>> +
>>> + ret = qemu_file_get_error(f);
>>> + trace_colo_put_cmd(COLOCommand_lookup[cmd]);
>>> +
>>> + return ret;
>>> +}
>>
>> Looks like @cmd is a COLOCommand. Why is the parameter type uint32_t?
>>
>
> OK, i will change it to use enum COLOCommand.
>
>>> +
>>> +static int colo_get_cmd(QEMUFile *f, uint32_t *cmd)
>>> +{
>>> + int ret;
>>> +
>>> + *cmd = qemu_get_be32(f);
>>> + ret = qemu_file_get_error(f);
>>> + if (ret < 0) {
>>> + return ret;
>>> + }
>>> + if (*cmd >= COLO_COMMAND_MAX) {
>>> + error_report("%s: Invalid cmd", __func__);
>>> + return -EINVAL;
>>> + }
>>> + trace_colo_get_cmd(COLOCommand_lookup[*cmd]);
>>> + return 0;
>>> +}
>>
>> Same question.
>>
>> The "get" in the name suggests the function returns the value gotten,
>> like similarly named function elsewhere in migration/ do.
>>
> Do you mean it should return the cmd value directly, not though parameter way
> ?
> After we convert it to use error_setg() to indicate success or not, we
> can do like that.
> I will fix it.
Sounds good to me.
[...]
>>> diff --git a/qapi-schema.json b/qapi-schema.json
>>> index c9ff34e..85f7800 100644
>>> --- a/qapi-schema.json
>>> +++ b/qapi-schema.json
>>> @@ -720,6 +720,31 @@
>>> { 'command': 'migrate-start-postcopy' }
>>>
>>> ##
>>> +# @COLOCommand
>>> +#
>>> +# The commands for COLO fault tolerance
>>> +#
>>> +# @checkpoint-ready: SVM is ready for checkpointing
>>> +#
>>> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
>>> +#
>>> +# @checkpoint-reply: SVM gets PVM's checkpoint request
>>> +#
>>> +# @vmstate-send: VM's state will be sent by PVM.
>>> +#
>>> +# @vmstate-size: The total size of VMstate.
>>> +#
>>> +# @vmstate-received: VM's state has been received by SVM.
>>> +#
>>> +# @vmstate-loaded: VM's state has been loaded by SVM.
>>> +#
>>> +# Since: 2.6
>>> +##
>>> +{ 'enum': 'COLOCommand',
>>> + 'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
>>> + 'vmstate-send', 'vmstate-size','vmstate-received',
>>> + 'vmstate-loaded' ] }
>>> +
>>
>> Space after 'vmstate-size', please.
>>
>
>> 'vmstate-size' is not used in this patch. You may want to add it with
>> its first use instead.
>>
>
> OK, i will move it to the corresponding patch.
>
>> Should this enum really be named "COLOCommand"? 'checkpoint-ready',
>> 'checkpoint-request', 'vmstate-send' look like commands to me, but the
>> others look like replies.
>>
>
> Yes, COLOCommand is not so exact. what about name it COLOProtocol?
A protocol specifies valid sequences of messages, and what they mean.
This isn't a protocol, it's a message within a protocol. COLOMessage?
>>
>>> # @MouseInfo:
>>> #
>>> # Information about a mouse device.
>>> diff --git a/trace-events b/trace-events
>>> index 5565e79..39fdd8d 100644
>>> --- a/trace-events
>>> +++ b/trace-events
>>> @@ -1579,6 +1579,8 @@ postcopy_ram_incoming_cleanup_join(void) ""
>>>
>>> # migration/colo.c
>>> colo_vm_state_change(const char *old, const char *new) "Change
>>> '%s' => '%s'"
>>> +colo_put_cmd(const char *msg) "Send '%s' cmd"
>>> +colo_get_cmd(const char *msg) "Receive '%s' cmd"
>>>
>>> # kvm-all.c
>>> kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
>>
>> I like how this commit creates just the two state machines, and leaves
>> filling in their actions to later commits. Helps ignorant rewiewers
>> like me :)
>>
>>
>
> Do you mean i should split this patch ? Leave this patch with the
> simplest colo process,
> maybe just 'ready, request, reply', and add the other states in later patch?
No, I *like* how you split up the work.
- Re: [Qemu-devel] [PATCH COLO-Frame v12 10/38] COLO: Implement colo checkpoint protocol,
Markus Armbruster <=