qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH RDMA support v1: 10/13] introduce new comman


From: Michael R. Hines
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v1: 10/13] introduce new command migrate_check_for_zero
Date: Thu, 11 Apr 2013 11:35:02 -0400
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130106 Thunderbird/17.0.2

On 04/11/2013 11:08 AM, Paolo Bonzini wrote:
Il 11/04/2013 16:57, Michael R. Hines ha scritto:
We have hardware already with front side bus speeds of 13 GB/s.

We also already have 5 GB/s RDMA hardware, and we will likely
have even faster RDMA hardware in the future.

This analysis is not factoring into account the cycles it takes to
map the pages before they are checked for duplicate bytes,
Do you mean the TLB misses?

Keeping in mind that this primarily happens during the bulk-phase round,
then yes, both TLB missing + the time it takes to trap into the
kernel, map the page, and let the TLB re-walk the page table.

But, as you pointed out, I do conceded that since most of the pages
will already have been mapped after the bulk phase round, this
should not be a problem anymore *after* that round has finished.

Using the /proc/<pid>/pagemap will probably go much further towards
solving the problem than disabling zero page scanning.
If its already possible to know if a page is not mapped, then there
won't be any need to scan it in the first place.

Once the page is mapped already, yes, I do see clearly that is_dup_page()
performance would probably be minimal.

Nevertheless, the initial "burst" of the bulk phase round is still important
to optimize, and I would like to know if the maintainer would accept this
API for disabling the scan or not. We think it's important because the total
migration time can be much smaller with high-throughput RDMA links
by optimizing the bulk-phase round and that lower total migration time is very valuable to many of our workloads, in addition to the low-downtime benefits you get with RDMA.

These are the real world scenarios that I was talking about.  Do you
have profiles of these, with the latest QEMU code, that show
is_dup_page() to be expensive?

I have expensive numbers only for the bulk phase round. Other than that,
I would be breaking confidentiality outside of the paper we have already published.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]