From: Markus Armbruster
Subject: Re: [PATCH v4] migration: Allow user to specify available switchover bandwidth
Date: Tue, 17 Oct 2023 16:12:40 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Peter Xu <peterx@redhat.com> writes:
> Migration bandwidth is a very important value to live migration. It's
> because it's one of the major factors that we'll make decision on when to
> switchover to destination in a precopy process.
>
> This value is currently estimated by QEMU during the whole live migration
> process by monitoring how fast we were sending the data. This can be the
> most accurate bandwidth if in the ideal world, where we're always feeding
> unlimited data to the migration channel, and then it'll be limited to the
> bandwidth that is available.
>
> However in reality it may be very different, e.g., over a 10Gbps network we
> can see query-migrate showing migration bandwidth of only a few tens of
> MB/s just because there are plenty of other things the migration thread
> might be doing. For example, the migration thread can be busy scanning
> zero pages, or it can be fetching dirty bitmap from other external dirty
> sources (like vhost or KVM). It means we may not be pushing data as much
> as possible to migration channel, so the bandwidth estimated from "how many
> data we sent in the channel" can be dramatically inaccurate sometimes.

Suggest "how much data we've sent to the channel".

>
> With that, the decision to switchover will be affected, by assuming that we
> may not be able to switchover at all with such a low bandwidth, but in
> reality we can.
>
> The migration may not even converge at all with the downtime specified,
> with that wrong estimation of bandwidth, keeping iterations forever with a

Suggest "iterating forever".

> low estimation of bandwidth.
>
> The issue is QEMU itself may not be able to avoid those uncertainties on
> measuring the real "available migration bandwidth". At least not something
> I can think of so far.
>
> One way to fix this is when the user is fully aware of the available
> bandwidth, then we can allow the user to help providing an accurate value.
>
> For example, if the user has a dedicated channel of 10Gbps for migration
> for this specific VM, the user can specify this bandwidth so QEMU can
> always do the calculation based on this fact, trusting the user as long as
> specified. It may not be the exact bandwidth when switching over (in which
> case qemu will push migration data as fast as possible), but much better
> than QEMU trying to wildly guess, especially when very wrong.
>
> A new parameter "avail-switchover-bandwidth" is introduced just for this.
> So when the user specified this parameter, instead of trusting the
> estimated value from QEMU itself (based on the QEMUFile send speed), it
> trusts the user more by using this value to decide when to switchover,
> assuming that we'll have such bandwidth available then.
>
> Note that specifying this value will not throttle the bandwidth for
> switchover yet, so QEMU will always use the full bandwidth possible for
> sending switchover data, assuming that should always be the most important
> way to use the network at that time.
>
> This can resolve issues like "unconvergence migration" which is caused by
> hilarious low "migration bandwidth" detected for whatever reason.

"unconvergence" isn't a word :)

Suggest "like migration not converging, because the automatically
detected migration bandwidth is hilariously low for whatever reason."

Appreciate the thorough explanation!
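
For readers following along, here is a minimal sketch in C of the
switchover decision as the commit message describes it. It is not the
patch's actual code: the helper names switchover_bandwidth() and
can_switchover() are hypothetical, and the threshold formula (bandwidth
times downtime limit) is an assumption about how the decision works.
The only behavior taken from the patch is that a user value of zero
means "fall back to QEMU's own estimate".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not QEMU code: choose the bandwidth (in bytes
 * per second) used for the switchover decision.  A user value of 0
 * means "not set", so the automatically estimated value is used.
 */
static uint64_t switchover_bandwidth(uint64_t user_bw, uint64_t estimated_bw)
{
    return user_bw ? user_bw : estimated_bw;
}

/*
 * Hypothetical helper, not QEMU code: decide whether the remaining
 * dirty data could be sent within the downtime limit at the given
 * bandwidth.  Assumes the rule "pending <= bandwidth * downtime".
 */
static bool can_switchover(uint64_t pending_bytes,
                           uint64_t bw_bytes_per_sec,
                           uint64_t downtime_limit_ms)
{
    /* Bytes expected to fit into the allowed downtime window.
     * Dividing first avoids overflow; it truncates below 1000 B/s. */
    uint64_t threshold = bw_bytes_per_sec / 1000 * downtime_limit_ms;

    return pending_bytes <= threshold;
}
```

With illustrative numbers (a 300 ms downtime limit and 200 MB still
pending), an estimate of 50 MB/s gives a 15 MB budget, so the check
fails and migration keeps iterating. A user-supplied 1.25 GB/s (10 Gbps)
gives a 375 MB budget, so the check passes.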
>
> Reported-by: Zhiyi Guo <zhguo@redhat.com>
> Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> v4:
> - Rebase to master, with duplicated documentations
> ---
> qapi/migration.json | 34 +++++++++++++++++++++++++++++++++-
> migration/migration.h | 2 +-
> migration/options.h | 1 +
> migration/migration-hmp-cmds.c | 14 ++++++++++++++
> migration/migration.c | 24 +++++++++++++++++++++---
> migration/options.c | 28 ++++++++++++++++++++++++++++
> migration/trace-events | 2 +-
> 7 files changed, 99 insertions(+), 6 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 8843e74b59..0c897a99b1 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -759,6 +759,16 @@
> # @max-bandwidth: to set maximum speed for migration. maximum speed
> # in bytes per second. (Since 2.8)
> #
> +# @avail-switchover-bandwidth: to set the available bandwidth that
> +# migration can use during switchover phase. NOTE! This does not
> +# limit the bandwidth during switchover, but only for calculations when
> +# making decisions to switchover. By default, this value is zero,
> +# which means QEMU will estimate the bandwidth automatically. This can
> +# be set when the estimated value is not accurate, while the user is
> +# able to guarantee such bandwidth is available when switching over.
> +# When specified correctly, this can make the switchover decision much
> +# more accurate. (Since 8.2)

We tend to eschew abbreviations in QAPI schema identifiers.
available-switchover-bandwidth is a mouthful, though.  What do you
think?

> +#
> # @downtime-limit: set maximum tolerated downtime for migration.
> # maximum downtime in milliseconds (Since 2.8)
> #
> @@ -840,7 +850,7 @@
> 'cpu-throttle-initial', 'cpu-throttle-increment',
> 'cpu-throttle-tailslow',
> 'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> - 'downtime-limit',
> + 'avail-switchover-bandwidth', 'downtime-limit',
> { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
> 'block-incremental',
> 'multifd-channels',
> @@ -925,6 +935,16 @@
> # @max-bandwidth: to set maximum speed for migration. maximum speed
> # in bytes per second. (Since 2.8)
> #
> +# @avail-switchover-bandwidth: to set the available bandwidth that
> +# migration can use during switchover phase. NOTE! This does not
> +# limit the bandwidth during switchover, but only for calculations when
> +# making decisions to switchover. By default, this value is zero,
> +# which means QEMU will estimate the bandwidth automatically. This can
> +# be set when the estimated value is not accurate, while the user is
> +# able to guarantee such bandwidth is available when switching over.
> +# When specified correctly, this can make the switchover decision much
> +# more accurate. (Since 8.2)
> +#
> # @downtime-limit: set maximum tolerated downtime for migration.
> # maximum downtime in milliseconds (Since 2.8)
> #
> @@ -1018,6 +1038,7 @@
> '*tls-hostname': 'StrOrNull',
> '*tls-authz': 'StrOrNull',
> '*max-bandwidth': 'size',
> + '*avail-switchover-bandwidth': 'size',
> '*downtime-limit': 'uint64',
> '*x-checkpoint-delay': { 'type': 'uint32',
> 'features': [ 'unstable' ] },
> @@ -1128,6 +1149,16 @@
> # @max-bandwidth: to set maximum speed for migration. maximum speed
> # in bytes per second. (Since 2.8)
> #
> +# @avail-switchover-bandwidth: to set the available bandwidth that
> +# migration can use during switchover phase. NOTE! This does not
> +# limit the bandwidth during switchover, but only for calculations when
> +# making decisions to switchover. By default, this value is zero,
> +# which means QEMU will estimate the bandwidth automatically. This can
> +# be set when the estimated value is not accurate, while the user is
> +# able to guarantee such bandwidth is available when switching over.
> +# When specified correctly, this can make the switchover decision much
> +# more accurate. (Since 8.2)
> +#
> # @downtime-limit: set maximum tolerated downtime for migration.
> # maximum downtime in milliseconds (Since 2.8)
> #
> @@ -1218,6 +1249,7 @@
> '*tls-hostname': 'str',
> '*tls-authz': 'str',
> '*max-bandwidth': 'size',
> + '*avail-switchover-bandwidth': 'size',
> '*downtime-limit': 'uint64',
> '*x-checkpoint-delay': { 'type': 'uint32',
> 'features': [ 'unstable' ] },
Regardless:
Acked-by: Markus Armbruster <armbru@redhat.com>
[...]