[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v6 11/19] migration: Start of multiple fd work
From: |
Juan Quintela |
Subject: |
Re: [Qemu-devel] [PATCH v6 11/19] migration: Start of multiple fd work |
Date: |
Mon, 14 Aug 2017 15:43:58 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) |
"Daniel P. Berrange" <address@hidden> wrote:
> On Tue, Aug 08, 2017 at 06:26:21PM +0200, Juan Quintela wrote:
>> We create new channels for each new thread created. We send through
>> them a string containing <uuid> multifd <channel number> so we are
>> sure that we connect the right channels in both sides.
>>
>> Signed-off-by: Juan Quintela <address@hidden>
>>
>> +/* Default uuid for multifd when qemu is not started with uuid */
>> +static char multifd_uuid[] = "5c49fd7e-af88-4a07-b6e8-091fd696ad40";
>
> Why is this better than just using the qemu_uuid unconditionally.
> UUIC, it'll just be 00000000-0000-0000-0000-000000000000.
>
> Either way you've got a non-unique UUID if multiple QEMUs are
> started, so I dont see a benefit in inventing a new uuid here.
I hate a message full of zeros, it is the default value. If you have
more than one qemu and you don't set uuid, you are asking for trouble.
But if people preffer the 00000 uuid, it is also ok with me.
>
>> +/* strlen(multifd) + '-' + <channel id> + '-' + UUID_FMT + '\0' */
>> +#define MULTIFD_UUID_MSG (7 + 1 + 3 + 1 + UUID_FMT_LEN + 1)
>> +
>> static void *multifd_send_thread(void *opaque)
>> {
>> MultiFDSendParams *p = opaque;
>> + char *string;
>> + char *string_uuid;
>> +
>> + if (qemu_uuid_set) {
>> + string_uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
>> + } else {
>> + string_uuid = g_strdup(multifd_uuid);
>> + }
>> + string = g_strdup_printf("%s multifd %03d", string_uuid, p->id);
>> + g_free(string_uuid);
>> + qio_channel_write(p->c, string, MULTIFD_UUID_MSG, &error_abort);
>
> Must check return code here as it can do a short write, which won't
> trigger error_abort. Also you must not use error_abort at all. QEMU
> must not abort when migration hits an I/O error as that needlessly
> kills the user's VM.
>
> I also think it is not nice to be formatting a string with printf
> here, sending it and then using scanf to extract the data. If we
> need to send structured data, then we should define a proper binary
> format for it eg
>
> struct MigrateUUIDMsg {
> uint32_t chnanelid;
> QemuUUID uuid;
> } __attribute__((__packed__));
>
> and then just send the raw struct across.
Not that I believe that it still works (or that it worked ever), but
qemu migration "on stream" was supposed to be Endian safe .....
>> + g_free(string);
>>
>> while (true) {
>> qemu_mutex_lock(&p->mutex);
>> @@ -445,6 +467,12 @@ int multifd_save_setup(void)
>> qemu_sem_init(&p->sem, 0);
>> p->quit = false;
>> p->id = i;
>> + p->c = socket_send_channel_create();
>> + if (!p->c) {
>> + error_report("Error creating a send channel");
>> + multifd_save_cleanup();
>> + return -1;
>> + }
>
> We should pass an 'Error *' object into socket_send_channel_create()
> so that we can receive & report the actual error message, instead
> of a useless generic message.
Yeap, that is what I told was doing next on the Cover letter.
Just that it means changing lots of functions prototypes ....
>> + qio_channel_read(ioc, string, sizeof(string), &error_abort);
>
> Must not use error_abort which kills the whole VM ungracefully
Changing also that.
>
>> + sscanf(string, "%s multifd %03d", string_uuid, &id);
>> +
>> + if (qemu_uuid_set) {
>> + uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
>> + } else {
>> + uuid = g_strdup(multifd_uuid);
>> + }
>> + if (strcmp(string_uuid, uuid)) {
>> + error_report("multifd: received uuid '%s' and expected uuid '%s'"
>> + " for channel %d", string_uuid, uuid, id);
>> + migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>> + MIGRATION_STATUS_FAILED);
>> + terminate_multifd_recv_threads();
>> + return;
>> + }
>> + g_free(uuid);
>
> As mentioned above, we should receive a binary struct here instead
> of playing games with scanf.
>> @@ -534,22 +612,15 @@ int multifd_load_setup(void)
>> multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
>> multifd_recv_state->params = g_new0(MultiFDRecvParams *, thread_count);
>> multifd_recv_state->count = 0;
>> - for (i = 0; i < thread_count; i++) {
>> - char thread_name[16];
>> - MultiFDRecvParams *p = multifd_recv_state->params[i];
>> -
>> - qemu_mutex_init(&p->mutex);
>> - qemu_sem_init(&p->sem, 0);
>> - p->quit = false;
>> - p->id = i;
>> - snprintf(thread_name, sizeof(thread_name), "multifdrecv_%d", i);
>> - qemu_thread_create(&p->thread, thread_name, multifd_recv_thread, p,
>> - QEMU_THREAD_JOINABLE);
>> - multifd_recv_state->count++;
>> - }
>
> It us a little strange to be deleting a bunch of code you just added in the
> previous patch
We create the send/recv channels on previous patch. And then we switch
to create them as we receive the connections. I will try to see if
creating first the receiving threads it gets clearer.
>> static SocketAddress *tcp_build_address(const char *host_port, Error **errp)
>> {
>> @@ -96,6 +128,9 @@ static void
>> socket_start_outgoing_migration(MigrationState *s,
>> struct SocketConnectData *data = g_new0(struct SocketConnectData, 1);
>>
>> data->s = s;
>> + outgoing_args.saddr = saddr;
>> + outgoing_args.errp = errp;
>
> Taking a stack local 'errp' pointer and saving it for later use
> in a global variable is asking for trouble. There's no guarantee
> that stack frame will still be valid when 'outgoing_args.errp'
> is later used.
Yeap, changing to an Error * pointer by thread.
Thanks, Juan.
- Re: [Qemu-devel] [PATCH v6 07/19] migration: Create x-multifd-threads parameter, (continued)
[Qemu-devel] [PATCH v6 12/19] migration: Create ram_multifd_page, Juan Quintela, 2017/08/08
[Qemu-devel] [PATCH v6 14/19] migration: Send the fd number which we are going to use for this page, Juan Quintela, 2017/08/08
[Qemu-devel] [PATCH v6 13/19] migration: Really use multiple pages at a time, Juan Quintela, 2017/08/08
[Qemu-devel] [PATCH v6 15/19] migration: Create thread infrastructure for multifd recv side, Juan Quintela, 2017/08/08
[Qemu-devel] [PATCH v6 16/19] migration: Test new fd infrastructure, Juan Quintela, 2017/08/08