RE: [PATCH v6 1/5] docs/migration: add qatzip compression feature
From: Liu, Yuan1
Subject: RE: [PATCH v6 1/5] docs/migration: add qatzip compression feature
Date: Tue, 16 Jul 2024 02:34:07 +0000
> -----Original Message-----
> From: Yichen Wang <yichen.wang@bytedance.com>
> Sent: Tuesday, July 16, 2024 6:13 AM
> To: Peter Xu <peterx@redhat.com>; Fabiano Rosas <farosas@suse.de>; Paolo
> Bonzini <pbonzini@redhat.com>; Daniel P. Berrangé <berrange@redhat.com>;
> Eduardo Habkost <eduardo@habkost.net>; Marc-André Lureau
> <marcandre.lureau@redhat.com>; Thomas Huth <thuth@redhat.com>; Philippe
> Mathieu-Daudé <philmd@linaro.org>; Eric Blake <eblake@redhat.com>; Markus
> Armbruster <armbru@redhat.com>; Laurent Vivier <lvivier@redhat.com>;
> qemu-devel@nongnu.org
> Cc: Hao Xiang <hao.xiang@linux.dev>; Liu, Yuan1 <yuan1.liu@intel.com>;
> Zou, Nanhai <nanhai.zou@intel.com>; Ho-Ren (Jack) Chuang
> <horenchuang@bytedance.com>; Wang, Yichen <yichen.wang@bytedance.com>
> Subject: [PATCH v6 1/5] docs/migration: add qatzip compression feature
>
> From: Yuan Liu <yuan1.liu@intel.com>
>
> add Intel QATzip compression method introduction
>
> Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Reviewed-by: Yichen Wang <yichen.wang@bytedance.com>
> ---
> docs/devel/migration/features.rst | 1 +
> docs/devel/migration/qatzip-compression.rst | 251 ++++++++++++++++++++
> 2 files changed, 252 insertions(+)
> create mode 100644 docs/devel/migration/qatzip-compression.rst
>
> diff --git a/docs/devel/migration/features.rst b/docs/devel/migration/features.rst
> index 58f8fd9e16..8f431d52f9 100644
> --- a/docs/devel/migration/features.rst
> +++ b/docs/devel/migration/features.rst
> @@ -14,3 +14,4 @@ Migration has plenty of features to support different use cases.
> CPR
> qpl-compression
> uadk-compression
> + qatzip-compression
> diff --git a/docs/devel/migration/qatzip-compression.rst b/docs/devel/migration/qatzip-compression.rst
> new file mode 100644
> index 0000000000..72fa3e2826
> --- /dev/null
> +++ b/docs/devel/migration/qatzip-compression.rst
> @@ -0,0 +1,251 @@
> +==================
> +QATzip Compression
> +==================
> +In scenarios with limited network bandwidth, the ``QATzip`` solution can help
> +users save a lot of host CPU resources by accelerating compression and
> +decompression through the Intel QuickAssist Technology (``QAT``) hardware.
Hi Yichen,

Thanks for adding the Performance Testing with QATzip part. I wonder if we can
remove that part and add the following content directly instead.

Here, we use a typical limited-bandwidth example to illustrate the advantages
of QATzip. Users who are interested in QATzip still need to verify the
performance themselves.
+The following test was conducted using 8 multifd channels and 10Gbps network
+bandwidth. The results show that, compared to zstd, ``QATzip`` significantly
+saves CPU resources on the sender and reduces migration time. Compared to the
+uncompressed solution, ``QATzip`` greatly improves the dirty page processing
+capability, indicated by the Pages per Second metric, and also reduces the
+total migration time.
+
+::
+
+ VM Configuration: 16 vCPU and 64G memory
+ VM Workload: all vCPUs are idle and 54G memory is filled with Silesia data.
+ QAT Devices: 4
+ |-----------|--------|---------|----------|----------|------|------|
+ |8 Channels |Total |down |throughput|pages per | send | recv |
+ | |time(ms)|time(ms) |(mbps) |second | cpu %| cpu% |
+ |-----------|--------|---------|----------|----------|------|------|
+ |qatzip | 16630| 28| 10467| 2940235| 160| 360|
+ |-----------|--------|---------|----------|----------|------|------|
+ |zstd | 20165| 24| 8579| 2391465| 810| 340|
+ |-----------|--------|---------|----------|----------|------|------|
+ |none | 46063| 40| 10848| 330240| 45| 85|
+ |-----------|--------|---------|----------|----------|------|------|
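
To put rough numbers on the comparison in the table above, the ratios can be
worked out directly from its figures. The following is only a minimal sketch in
POSIX shell with awk; the helper names ``pct_less`` and ``ratio`` are
illustrative and not part of QEMU or QATzip.

```shell
#!/bin/sh
# Figures copied from the 8-channel table above.
qat_ms=16630;     zstd_ms=20165;     none_ms=46063
qat_send_cpu=160; zstd_send_cpu=810
qat_pps=2940235;  none_pps=330240

# Percentage by which "new" is lower than "base" (illustrative helper).
pct_less() {
    LC_ALL=C awk -v new="$1" -v base="$2" \
        'BEGIN { printf "%.1f", (base - new) * 100 / base }'
}

# How many times larger "a" is than "b" (illustrative helper).
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", a / b }'
}

echo "total time, qatzip vs zstd: $(pct_less "$qat_ms" "$zstd_ms")% shorter"
echo "total time, qatzip vs none: $(pct_less "$qat_ms" "$none_ms")% shorter"
echo "sender CPU, zstd / qatzip:  $(ratio "$zstd_send_cpu" "$qat_send_cpu")x"
echo "pages/sec, qatzip / none:   $(ratio "$qat_pps" "$none_pps")x"
```

With the figures above this works out to a total time about 17.5% shorter than
zstd and about 63.9% shorter than no compression, with zstd using about 5.1
times the sender CPU of qatzip, and qatzip handling about 8.9 times the pages
per second of the uncompressed case.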
> +``QATzip`` is a user space library which builds on top of the Intel QuickAssist
> +Technology user space library, to provide extended accelerated compression and
> +decompression services.
> +
> +For an introduction to ``QATzip``, please refer to `QATzip Introduction
> +<https://github.com/intel/QATzip?tab=readme-ov-file#introduction>`_
> +
> +QATzip Compression Framework
> +============================
> +
> +::
> +
> + +----------------+
> + | MultiFd Thread |
> + +-------+--------+
> + |
> + | compress/decompress
> + +-------+--------+
> + | QATzip library |
> + +-------+--------+
> + |
> + +-------+--------+
> + | QAT library |
> + +-------+--------+
> + | user space
> + --------+---------------------
> + | kernel space
> + +------+-------+
> + | QAT Driver |
> + +------+-------+
> + |
> + +------+-------+
> + | QAT Devices |
> + +--------------+
> +
> +
> +QATzip Installation
> +-------------------
> +
> +The ``QATzip`` installation package has been integrated into some Linux
> +distributions and can be installed directly. For example, on Ubuntu Server
> +24.04 LTS it can be installed using the commands below:
> +
> +.. code-block:: shell
> +
> + #apt search qatzip
> + libqatzip-dev/noble 1.2.0-0ubuntu3 amd64
> + Intel QuickAssist user space library development files
> +
> + libqatzip3/noble 1.2.0-0ubuntu3 amd64
> + Intel QuickAssist user space library
> +
> + qatzip/noble,now 1.2.0-0ubuntu3 amd64 [installed]
> + Compression user-space tool for Intel QuickAssist Technology
> +
> + #sudo apt install libqatzip-dev libqatzip3 qatzip
> +
> +If your system does not provide a ``QATzip`` installation package, you can
> +build and install it from source. Please refer to `QATzip source code installation
> +<https://github.com/intel/QATzip?tab=readme-ov-file#build-intel-quickassist-technology-driver>`_
> +
> +QAT Hardware Deployment
> +-----------------------
> +
> +``QAT`` supports physical functions (PFs) and virtual functions (VFs) for
> +deployment, and users can configure ``QAT`` resources for migration according
> +to actual needs. For more details about ``QAT`` deployment, please refer to
> +`Intel QuickAssist Technology Documentation
> +<https://intel.github.io/quickassist/index.html>`_
> +
> +For an introduction to the ``QAT`` hardware, please refer to `intel-quick-assist-technology-overview
> +<https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html>`_
> +
> +How To Use QATzip Compression
> +=============================
> +
> +1 - Install ``QATzip`` library
> +
> +2 - Build ``QEMU`` with ``--enable-qatzip`` parameter
> +
> + E.g. configure --target-list=x86_64-softmmu --enable-kvm ``--enable-qatzip``
> +
> +3 - Set ``migrate_set_parameter multifd-compression qatzip``
> +
> +4 - Set ``migrate_set_parameter multifd-qatzip-level comp_level``. The default
> +comp_level value is 1, and levels from 1 to 9 are supported.
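
As an illustration of steps 3 and 4, the monitor commands could be generated
by a small helper like the one below. This is only a sketch: the function name
``qatzip_monitor_cmds`` is hypothetical, and it just prints the commands already
listed in this document (including ``multifd on`` and ``multifd-channels`` from
the test setup) so that they can be pasted into the QEMU monitor. It rejects
compression levels outside the documented 1 to 9 range.

```shell
#!/bin/sh
# Print the monitor commands for a multifd migration using qatzip.
# Usage: qatzip_monitor_cmds CHANNELS [LEVEL]
# (hypothetical helper; it only prints text and does not talk to QEMU)
qatzip_monitor_cmds() {
    channels=$1
    level=${2:-1}    # the documented default comp_level is 1

    # The channel count must be a non-empty string of digits.
    case $channels in
        ''|*[!0-9]*)
            echo "channels must be a positive integer, got '$channels'" >&2
            return 1 ;;
    esac

    # Only a single digit from 1 to 9 is a valid level.
    case $level in
        [1-9]) ;;
        *)
            echo "qatzip level must be 1..9, got '$level'" >&2
            return 1 ;;
    esac

    printf '%s\n' \
        "migrate_set_capability multifd on" \
        "migrate_set_parameter multifd-channels $channels" \
        "migrate_set_parameter multifd-compression qatzip" \
        "migrate_set_parameter multifd-qatzip-level $level"
}

# Example: 8 channels at the default compression level.
qatzip_monitor_cmds 8
```

The same capability and compression settings are needed on both the sender and
the receiver, as in the test setup quoted below.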
> +
> +
> +Performance Testing with QATzip
> +===============================
> +
> +The testing environment is set up as below:
> +
> +VM configuration: 16 vCPU, 64G memory;
> +
> +VM Workload: all vCPUs are idle and 54G memory is filled with Silesia data;
> +
> +QAT Devices: 4;
> +
> +Sender migration parameters:
> +
> +.. code-block:: shell
> +
> + migrate_set_capability multifd on
> + migrate_set_parameter multifd-channels 2/4/8
> + migrate_set_parameter max-bandwidth 1G/10G
> + migrate_set_parameter multifd-compression qatzip/zstd
> +
> +Receiver migration parameters:
> +
> +.. code-block:: shell
> +
> + migrate_set_capability multifd on
> + migrate_set_parameter multifd-channels 2
> + migrate_set_parameter multifd-compression qatzip/zstd
> +
> +max-bandwidth: 1 GBps (Gbytes/sec)
> +
> +.. code-block:: text
> +
> + |-----------|--------|---------|----------|------|------|
> + |2 Channels |Total |down |throughput| send | recv |
> + | |time(ms)|time(ms) |(mbps) | cpu %| cpu% |
> + |-----------|--------|---------|----------|------|------|
> + |qatzip | 21607| 77| 8051| 88| 125|
> + |-----------|--------|---------|----------|------|------|
> + |zstd | 78351| 96| 2199| 204| 80|
> + |-----------|--------|---------|----------|------|------|
> +
> + |-----------|--------|---------|----------|------|------|
> + |4 Channels |Total |down |throughput| send | recv |
> + | |time(ms)|time(ms) |(mbps) | cpu %| cpu% |
> + |-----------|--------|---------|----------|------|------|
> + |qatzip | 20336| 25| 8557| 110| 190|
> + |-----------|--------|---------|----------|------|------|
> + |zstd | 39324| 31| 4389| 406| 160|
> + |-----------|--------|---------|----------|------|------|
> +
> + |-----------|--------|---------|----------|------|------|
> + |8 Channels |Total |down |throughput| send | recv |
> + | |time(ms)|time(ms) |(mbps) | cpu %| cpu% |
> + |-----------|--------|---------|----------|------|------|
> + |qatzip | 20208| 22| 8613| 125| 300|
> + |-----------|--------|---------|----------|------|------|
> + |zstd | 20515| 22| 8438| 800| 340|
> + |-----------|--------|---------|----------|------|------|
> +
> +max-bandwidth: 10 GBps (Gbytes/sec)
> +
> +.. code-block:: text
> +
> + |-----------|--------|---------|----------|------|------|
> + |2 Channels |Total |down |throughput| send | recv |
> + | |time(ms)|time(ms) |(mbps) | cpu %| cpu% |
> + |-----------|--------|---------|----------|------|------|
> + |qatzip | 22450| 77| 7748| 80| 125|
> + |-----------|--------|---------|----------|------|------|
> + |zstd | 78339| 76| 2199| 204| 80|
> + |-----------|--------|---------|----------|------|------|
> +
> + |-----------|--------|---------|----------|------|------|
> + |4 Channels |Total |down |throughput| send | recv |
> + | |time(ms)|time(ms) |(mbps) | cpu %| cpu% |
> + |-----------|--------|---------|----------|------|------|
> + |qatzip | 13017| 24| 13401| 180| 285|
> + |-----------|--------|---------|----------|------|------|
> + |zstd | 39466| 21| 4373| 406| 160|
> + |-----------|--------|---------|----------|------|------|
> +
> + |-----------|--------|---------|----------|------|------|
> + |8 Channels |Total |down |throughput| send | recv |
> + | |time(ms)|time(ms) |(mbps) | cpu %| cpu% |
> + |-----------|--------|---------|----------|------|------|
> + |qatzip | 10255| 22| 17037| 280| 590|
> + |-----------|--------|---------|----------|------|------|
> + |zstd | 20126| 77| 8595| 810| 340|
> + |-----------|--------|---------|----------|------|------|
> +
> +max-bandwidth: 1.25 GBps (Gbytes/sec)
> +
> +.. code-block:: text
> +
> + |-----------|--------|---------|----------|----------|------|------|
> + |8 Channels |Total |down |throughput|pages per | send | recv |
> + | |time(ms)|time(ms) |(mbps) |second | cpu %| cpu% |
> + |-----------|--------|---------|----------|----------|------|------|
> + |qatzip | 16630| 28| 10467| 2940235| 160| 360|
> + |-----------|--------|---------|----------|----------|------|------|
> + |zstd | 20165| 24| 8579| 2391465| 810| 340|
> + |-----------|--------|---------|----------|----------|------|------|
> + |none | 46063| 40| 10848| 330240| 45| 85|
> + |-----------|--------|---------|----------|----------|------|------|
> +
> +If the user has enabled compression in live migration, using QAT can save
> +host CPU resources.
> +
> +When compression is enabled, the bottleneck of migration is usually the
> +compression throughput on the sender side, because CPU decompression throughput
> +is higher than compression throughput (for reference data, see
> +https://github.com/inikep/lzbench). More CPU resources therefore need to be
> +allocated to the sender side.
> +
> +Summary:
> +
> +1. In the 1GBps case, QAT only uses 88% CPU utilization to reach 1GBps, but
> +   ZSTD needs 800%.
> +
> +2. In the 10Gbps case, QAT uses 180% CPU utilization to reach 10GBps, but ZSTD
> +   still cannot reach 10Gbps even if it uses 810%.
> +
> +3. The QAT decompression CPU utilization is higher than that of QAT compression
> +   and of ZSTD decompression, because:
> +
> +   a. When using QAT compression, the data needs to be copied to the QAT memory
> +      (for DMA operations), and the same applies to decompression. However,
> +      do_user_addr_fault will be triggered during decompression because the QAT
> +      decompressed data is copied to the VM address space for the first time. In
> +      addition, both compression and decompression are processed by QAT and do
> +      not consume CPU resources, so the CPU utilization of the receiver is
> +      slightly higher than the sender.
> +
> +   b. Since zstd decompresses data directly into the VM address space, there is
> +      one less memory copy than with QAT, so the CPU utilization on the receiver
> +      is better than QAT. For the 1GBps case, the receiver CPU utilization is
> +      125%, and the memory copy occupies ~80% of CPU utilization.
> +
> +How To Choose Between QATzip and QPL
> +====================================
> +Starting from 4th Gen Intel Xeon Scalable processors, codenamed Sapphire
> +Rapids (``SPR``), multiple built-in accelerators are supported, including
> +``QAT`` and ``IAA``. The former can accelerate ``QATzip``, and the latter is
> +used to accelerate ``QPL``.
> +
> +Here are some suggestions:
> +
> +1 - If your live migration scenario has limited network bandwidth and ``QAT``
> +hardware resources exceed ``IAA``, then use the ``QATzip`` method, which
> +can save a lot of host CPU resources for compression.
> +
> +2 - If your system cannot support shared virtual memory (SVM) technology, please
> +use the ``QATzip`` method because ``QPL`` performance is not good without SVM
> +support.
> +
> +3 - For other scenarios, please use the ``QPL`` method first.
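
The three suggestions above amount to a short decision procedure, which could
be sketched as follows. The function name ``choose_method`` and its yes/no
arguments are hypothetical and only illustrate the order in which the
suggestions are checked; how to determine each answer on a real system is left
to the user.

```shell
#!/bin/sh
# Pick a multifd compression method following the suggestions above.
# Usage: choose_method LIMITED_BW QAT_EXCEEDS_IAA SVM_SUPPORTED
# Each argument is "yes" or "no" (hypothetical helper, for illustration only).
choose_method() {
    limited_bw=$1
    qat_exceeds_iaa=$2
    svm_supported=$3

    if [ "$limited_bw" = yes ] && [ "$qat_exceeds_iaa" = yes ]; then
        # Suggestion 1: limited bandwidth and more QAT than IAA resources.
        echo qatzip
    elif [ "$svm_supported" = no ]; then
        # Suggestion 2: QPL performance is not good without SVM support.
        echo qatzip
    else
        # Suggestion 3: every other scenario.
        echo qpl
    fi
}

# Example: plenty of bandwidth, SVM available.
choose_method no no yes
```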
> --
> Yichen Wang