From: Fabiano Rosas
Subject: Re: [PATCH v2 04/20] So we use multifd to transmit zero pages.
Date: Thu, 16 Nov 2023 12:14:04 -0300
Hao Xiang <hao.xiang@bytedance.com> writes:
> From: Juan Quintela <quintela@redhat.com>
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Reviewed-by: Leonardo Bras <leobras@redhat.com>
> ---
> migration/multifd.c | 7 ++++---
> migration/options.c | 13 +++++++------
> migration/ram.c | 45 ++++++++++++++++++++++++++++++++++++++-------
> qapi/migration.json | 1 -
> 4 files changed, 49 insertions(+), 17 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 1b994790d5..1198ffde9c 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -13,6 +13,7 @@
> #include "qemu/osdep.h"
> #include "qemu/cutils.h"
> #include "qemu/rcu.h"
> +#include "qemu/cutils.h"
> #include "exec/target_page.h"
> #include "sysemu/sysemu.h"
> #include "exec/ramblock.h"
> @@ -459,7 +460,6 @@ static int multifd_send_pages(QEMUFile *f)
> p->packet_num = multifd_send_state->packet_num++;
> multifd_send_state->pages = p->pages;
> p->pages = pages;
> -
> qemu_mutex_unlock(&p->mutex);
> qemu_sem_post(&p->sem);
>
> @@ -684,7 +684,7 @@ static void *multifd_send_thread(void *opaque)
> MigrationThread *thread = NULL;
> Error *local_err = NULL;
> /* qemu older than 8.2 don't understand zero page on multifd channel */
> - bool use_zero_page = !migrate_use_main_zero_page();
> + bool use_multifd_zero_page = !migrate_use_main_zero_page();
> int ret = 0;
> bool use_zero_copy_send = migrate_zero_copy_send();
>
> @@ -713,6 +713,7 @@ static void *multifd_send_thread(void *opaque)
> RAMBlock *rb = p->pages->block;
> uint64_t packet_num = p->packet_num;
> uint32_t flags;
> +
> p->normal_num = 0;
> p->zero_num = 0;
>
> @@ -724,7 +725,7 @@ static void *multifd_send_thread(void *opaque)
>
> for (int i = 0; i < p->pages->num; i++) {
> uint64_t offset = p->pages->offset[i];
> - if (use_zero_page &&
> + if (use_multifd_zero_page &&
We could have a new function in multifd_ops for zero page
handling. We're already considering an accelerator for the compression
method in the other series[1] and in this series we're adding an
accelerator for zero page checking. It's about time we make the
multifd_ops generic instead of only compression/no compression.
1- [PATCH v2 0/4] Live Migration Acceleration with IAA Compression
https://lore.kernel.org/r/20231109154638.488213-1-yuan1.liu@intel.com
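Roughly what I mean, as a sketch: the existing hooks below mirror
MultiFDMethods in migration/multifd.h, while the zero_page_detect
member is hypothetical:

    typedef struct {
        /* Existing hooks, today implemented only by the compression
         * and no-compression backends. */
        int (*send_setup)(MultiFDSendParams *p, Error **errp);
        void (*send_cleanup)(MultiFDSendParams *p, Error **errp);
        int (*send_prepare)(MultiFDSendParams *p, Error **errp);
        int (*recv_setup)(MultiFDRecvParams *p, Error **errp);
        void (*recv_cleanup)(MultiFDRecvParams *p);
        int (*recv_pages)(MultiFDRecvParams *p, Error **errp);
        /* Hypothetical new hook: classify p->pages into p->normal[]
         * and p->zero[], so a DSA/IAA backend could replace the
         * buffer_is_zero() loop in multifd_send_thread(). */
        void (*zero_page_detect)(MultiFDSendParams *p);
    } MultiFDMethods;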
> buffer_is_zero(rb->host + offset, p->page_size)) {
> p->zero[p->zero_num] = offset;
> p->zero_num++;
> diff --git a/migration/options.c b/migration/options.c
> index 00c0c4a0d6..97d121d4d7 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -195,6 +195,7 @@ Property migration_properties[] = {
> DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK),
> DEFINE_PROP_MIG_CAP("x-return-path", MIGRATION_CAPABILITY_RETURN_PATH),
> DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_MULTIFD),
> + DEFINE_PROP_MIG_CAP("x-main-zero-page", MIGRATION_CAPABILITY_MAIN_ZERO_PAGE),
> DEFINE_PROP_MIG_CAP("x-background-snapshot",
> MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT),
> #ifdef CONFIG_LINUX
> @@ -288,13 +289,9 @@ bool migrate_multifd(void)
>
> bool migrate_use_main_zero_page(void)
> {
> - //MigrationState *s;
> -
> - //s = migrate_get_current();
> + MigrationState *s = migrate_get_current();
>
> - // We will enable this when we add the right code.
> - // return s->enabled_capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE];
> - return true;
> + return s->capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE];
What happens if we disable main-zero-page while multifd is not enabled?
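In other words, should migrate_caps_check() reject that combination up
front? A hypothetical sketch:

    if (!new_caps[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE] &&
        !new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
        /* zero page detection can only move off the main thread
         * when multifd channels exist to do the work */
        error_setg(errp, "main-zero-page can only be disabled when "
                   "multifd is enabled");
        return false;
    }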
> }
>
> bool migrate_pause_before_switchover(void)
> @@ -457,6 +454,7 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
> MIGRATION_CAPABILITY_LATE_BLOCK_ACTIVATE,
> MIGRATION_CAPABILITY_RETURN_PATH,
> MIGRATION_CAPABILITY_MULTIFD,
> + MIGRATION_CAPABILITY_MAIN_ZERO_PAGE,
> MIGRATION_CAPABILITY_PAUSE_BEFORE_SWITCHOVER,
> MIGRATION_CAPABILITY_AUTO_CONVERGE,
> MIGRATION_CAPABILITY_RELEASE_RAM,
> @@ -534,6 +532,9 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
> error_setg(errp, "Postcopy is not yet compatible with multifd");
> return false;
> }
> + if (new_caps[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]) {
> + error_setg(errp, "Postcopy is not yet compatible with main zero copy");
> + }
Won't this break compatibility for postcopy? A command that used
to work will now have to disable main-zero-page first.
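A compatibility-preserving alternative (sketched here, not what the
patch does) would be to warn instead of failing, since postcopy never
uses multifd channels anyway:

    if (new_caps[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]) {
        /* postcopy doesn't use multifd, so the cap has no effect */
        warn_report("main-zero-page has no effect during postcopy");
    }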
> }
>
> if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
> diff --git a/migration/ram.c b/migration/ram.c
> index 8c7886ab79..f7a42feff2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2059,17 +2059,42 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
> if (save_zero_page(rs, pss, offset)) {
> return 1;
> }
> -
> /*
> - * Do not use multifd in postcopy as one whole host page should be
> - * placed. Meanwhile postcopy requires atomic update of pages, so even
> - * if host page size == guest page size the dest guest during run may
> - * still see partially copied pages which is data corruption.
> + * Do not use multifd for:
> + * 1. Compression as the first page in the new block should be posted out
> + * before sending the compressed page
> + * 2. In postcopy as one whole host page should be placed
> */
> - if (migrate_multifd() && !migration_in_postcopy()) {
> + if (!migrate_compress() && migrate_multifd() && !migration_in_postcopy()) {
> + return ram_save_multifd_page(pss->pss_channel, block, offset);
> + }
This could go into ram_save_target_page_multifd like so:

    if (!migrate_compress() && !migration_in_postcopy() &&
        !migrate_use_main_zero_page()) {
        return ram_save_multifd_page(pss->pss_channel, block, offset);
    } else {
        return ram_save_target_page_legacy(rs, pss);
    }
> +
> + return ram_save_page(rs, pss);
> +}
> +
> +/**
> + * ram_save_target_page_multifd: save one target page
> + *
> + * Returns the number of pages written
> + *
> + * @rs: current RAM state
> + * @pss: data about the page we want to send
> + */
> +static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss)
> +{
> + RAMBlock *block = pss->block;
> + ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
> + int res;
> +
> + if (!migration_in_postcopy()) {
> return ram_save_multifd_page(pss->pss_channel, block, offset);
> }
>
> + res = save_zero_page(rs, pss, offset);
> + if (res > 0) {
> + return res;
> + }
> +
> return ram_save_page(rs, pss);
> }
>
> @@ -2982,9 +3007,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> }
>
> migration_ops = g_malloc0(sizeof(MigrationOps));
> - migration_ops->ram_save_target_page = ram_save_target_page_legacy;
> +
> + if (migrate_multifd() && !migrate_use_main_zero_page()) {
> + migration_ops->ram_save_target_page = ram_save_target_page_multifd;
> + } else {
> + migration_ops->ram_save_target_page = ram_save_target_page_legacy;
> + }
This should not check main-zero-page. Just have multifd vs. legacy and
have the multifd function defer to _legacy if main-zero-page or
in_postcopy.
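Something like this (untested sketch, reusing the patch's names):

    static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss)
    {
        RAMBlock *block = pss->block;
        ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;

        /* Defer to the legacy path whenever multifd can't take the
         * page directly. */
        if (migrate_use_main_zero_page() || migration_in_postcopy()) {
            return ram_save_target_page_legacy(rs, pss);
        }
        return ram_save_multifd_page(pss->pss_channel, block, offset);
    }

and then ram_save_setup() only checks multifd:

    migration_ops->ram_save_target_page = migrate_multifd() ?
        ram_save_target_page_multifd : ram_save_target_page_legacy;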
>
> qemu_mutex_unlock_iothread();
> +
> ret = multifd_send_sync_main(f);
> qemu_mutex_lock_iothread();
> if (ret < 0) {
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 09e4393591..9783289bfc 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -531,7 +531,6 @@
> # and can result in more stable read performance. Requires KVM
> # with accelerator property "dirty-ring-size" set. (Since 8.1)
> #
> -#
> # @main-zero-page: If enabled, the detection of zero pages will be
> # done on the main thread. Otherwise it is done on
> # the multifd threads.