qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 09/17] migration/multifd: Device state transfer support -


From: Maciej S. Szmigiero
Subject: Re: [PATCH v2 09/17] migration/multifd: Device state transfer support - receive side
Date: Mon, 2 Sep 2024 22:12:01 +0200
User-agent: Mozilla Thunderbird

On 30.08.2024 22:22, Fabiano Rosas wrote:
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:

From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Add a basic support for receiving device state via multifd channels -
channels that are shared with RAM transfers.

To differentiate between a device state and a RAM packet the packet
header is read first.

Depending whether MULTIFD_FLAG_DEVICE_STATE flag is present or not in the
packet header either device state (MultiFDPacketDeviceState_t) or RAM
data (existing MultiFDPacket_t) is then read.

The received device state data is provided to
qemu_loadvm_load_state_buffer() function for processing in the
device's load_state_buffer handler.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
  migration/multifd.c | 127 +++++++++++++++++++++++++++++++++++++-------
  migration/multifd.h |  31 ++++++++++-
  2 files changed, 138 insertions(+), 20 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index b06a9fab500e..d5a8e5a9c9b5 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
(..)
      g_free(p->zero);
@@ -1126,8 +1159,13 @@ static void *multifd_recv_thread(void *opaque)
      rcu_register_thread();
while (true) {
+        MultiFDPacketHdr_t hdr;
          uint32_t flags = 0;
+        bool is_device_state = false;
          bool has_data = false;
+        uint8_t *pkt_buf;
+        size_t pkt_len;
+
          p->normal_num = 0;
if (use_packets) {
@@ -1135,8 +1173,28 @@ static void *multifd_recv_thread(void *opaque)
                  break;
              }
- ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
-                                           p->packet_len, &local_err);
+            ret = qio_channel_read_all_eof(p->c, (void *)&hdr,
+                                           sizeof(hdr), &local_err);
+            if (ret == 0 || ret == -1) {   /* 0: EOF  -1: Error */
+                break;
+            }
+
+            ret = multifd_recv_unfill_packet_header(p, &hdr, &local_err);
+            if (ret) {
+                break;
+            }
+
+            is_device_state = p->flags & MULTIFD_FLAG_DEVICE_STATE;
+            if (is_device_state) {
+                pkt_buf = (uint8_t *)p->packet_dev_state + sizeof(hdr);
+                pkt_len = sizeof(*p->packet_dev_state) - sizeof(hdr);
+            } else {
+                pkt_buf = (uint8_t *)p->packet + sizeof(hdr);
+                pkt_len = p->packet_len - sizeof(hdr);
+            }

Should we have made the packet an union as well? Would simplify these
sorts of operations. Not sure I want to start messing with that at this
point to be honest. But OTOH, look at this...

RAM packet length is not constant (at least from the viewpoint of the
migration code) so the union allocation would need some kind of a
"multifd_ram_packet_size()" runtime size determination.

Also, since RAM and device state packet body size is different then
for the extra complexity introduced by that union we'll just get rid of
that single pkt_buf assignment.

+
+            ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len,
+                                           &local_err);
              if (ret == 0 || ret == -1) {   /* 0: EOF  -1: Error */
                  break;
              }
@@ -1181,8 +1239,33 @@ static void *multifd_recv_thread(void *opaque)
              has_data = !!p->data->size;
          }
- if (has_data) {
-            ret = multifd_recv_state->ops->recv(p, &local_err);
+        if (!is_device_state) {
+            if (has_data) {
+                ret = multifd_recv_state->ops->recv(p, &local_err);
+                if (ret != 0) {
+                    break;
+                }
+            }
+        } else {
+            g_autofree char *idstr = NULL;
+            g_autofree char *dev_state_buf = NULL;
+
+            assert(use_packets);
+
+            if (p->next_packet_size > 0) {
+                dev_state_buf = g_malloc(p->next_packet_size);
+
+                ret = qio_channel_read_all(p->c, dev_state_buf, 
p->next_packet_size, &local_err);
+                if (ret != 0) {
+                    break;
+                }
+            }

What's the use case for !next_packet_size and still call
load_state_buffer below? I can't see it.

Currently, next_packet_size == 0 has not usage indeed - it is
a leftover from an early version of the patch set (not public)
that had device state packet (chunk) indexing done by
the common migration code, rather than by the VFIO consumer.

And then an empty packet could be used to mark the stream
boundary - like the max chunk number to expect.

...because I would suggest to set has_data up there with
p->next_packet_size:

if (use_packets) {
    ...
    has_data = p->next_packet_size || p->zero_num;
} else {
    ...
    has_data = !!p->data_size;
}

and this whole block would be:

if (has_data) {
    if (is_device_state) {
        multifd_device_state_recv(p, &local_err);
    } else {
        ret = multifd_recv_state->ops->recv(p, &local_err);
    }
}

The above block makes sense to me with two caveats:
1) If empty device state packets (next_packet_size == 0) were
to be unsupported they need to be rejected cleanly rather
than silently skipped,

2) has_data has to have its value computed depending on whether
this is a RAM or a device state packet since looking at
p->normal_num and p->zero_num makes no sense for a device state
packet while I am not sure that looking at p->next_packet_size
for a RAM packet won't introduce some subtle regression.

+
+            idstr = g_strndup(p->packet_dev_state->idstr, 
sizeof(p->packet_dev_state->idstr));
+            ret = qemu_loadvm_load_state_buffer(idstr,
+                                                
p->packet_dev_state->instance_id,
+                                                dev_state_buf, 
p->next_packet_size,
+                                                &local_err);
              if (ret != 0) {
                  break;
              }
@@ -1190,6 +1273,11 @@ static void *multifd_recv_thread(void *opaque)
if (use_packets) {
              if (flags & MULTIFD_FLAG_SYNC) {
+                if (is_device_state) {
+                    error_setg(&local_err, "multifd: received SYNC device state 
packet");
+                    break;
+                }

assert(!is_device_state) enough?

It's not bug in the receiver code but rather an issue with the
remote QEMU sending us wrong data if we get a SYNC device state
packet.

So I think returning an error is more appropriate than triggering
an assert() failure for that.

+
                  qemu_sem_post(&multifd_recv_state->sem_sync);
                  qemu_sem_wait(&p->sem_sync);
              }
@@ -1258,6 +1346,7 @@ int multifd_recv_setup(Error **errp)
              p->packet_len = sizeof(MultiFDPacket_t)
                  + sizeof(uint64_t) * page_count;
              p->packet = g_malloc0(p->packet_len);
+            p->packet_dev_state = g_malloc0(sizeof(*p->packet_dev_state));
          }
          p->name = g_strdup_printf("mig/dst/recv_%d", i);
          p->normal = g_new0(ram_addr_t, page_count);
diff --git a/migration/multifd.h b/migration/multifd.h
index a3e35196d179..a8f3e4838c01 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -45,6 +45,12 @@ MultiFDRecvData *multifd_get_recv_data(void);
  #define MULTIFD_FLAG_QPL (4 << 1)
  #define MULTIFD_FLAG_UADK (8 << 1)
+/*
+ * If set it means that this packet contains device state
+ * (MultiFDPacketDeviceState_t), not RAM data (MultiFDPacket_t).
+ */
+#define MULTIFD_FLAG_DEVICE_STATE (1 << 4)

Overlaps with UADK. I assume on purpose because device_state doesn't
support compression? Might be worth a comment.


Yes, the device state transfer bit stream does not support compression
so it is not a problem since these "compression type" flags will never
be set in such bit stream anyway.

Will add a relevant comment here.

Thanks,
Maciej




reply via email to

[Prev in Thread] Current Thread [Next in Thread]