[Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long


From: Anthony Liguori
Subject: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Date: Tue, 30 Nov 2010 09:00:09 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 11/30/2010 08:12 AM, Paolo Bonzini wrote:
On 11/30/2010 02:47 PM, Anthony Liguori wrote:
On 11/30/2010 01:15 AM, Paolo Bonzini wrote:
On 11/30/2010 03:11 AM, Anthony Liguori wrote:

BufferedFile should hit the qemu_file_rate_limit check when the socket
buffer gets filled up.

The problem is that the file rate limit is not hit because work is
done elsewhere. The rate limit can cap the bandwidth used and make QEMU
aware that socket operations may block (because that's what the
buffered file freeze/unfreeze logic does); but it cannot be used to
limit the _time_ spent in the migration code.

Yes, it can, if you set the rate limit sufficiently low.

You mean, just like you can drive a car without brakes by keeping the speed sufficiently low.

[..] accounting zero pages as full sized
pages should "fix" the problem.

I know you used quotes, but it's a very very generous definition of fix. Both these proposed "fixes" are nothing more than workarounds, and even particularly ugly ones. The worst thing about them is that there is no guarantee of migration finishing in a reasonable time, or at all.

If you account zero pages as full, you don't effectively use the bandwidth that was allotted to you; you use only 0.2% of it (8/4096). It then takes an exaggerated amount of time to start iterating on the pages that matter. If you set the bandwidth low, instead, you do not have the bandwidth you need in order to converge.
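To put rough numbers on it (assuming, purely for illustration, a 32 MB/s limit; ~8 bytes is roughly what a zero page costs on the wire against the 4096 bytes it would be charged):

#include <stdio.h>

/* Back-of-the-envelope numbers behind the 0.2% figure above.  The
 * 32 MB/s limit is only an assumption for illustration; 4096 is what a
 * zero page would be charged, ~8 bytes is what it actually sends. */
int main(void)
{
    const double limit_bps = 32.0 * 1024 * 1024;   /* allotted bandwidth   */
    const double charged   = 4096;                 /* accounted per page   */
    const double on_wire   = 8;                    /* actually transmitted */

    double pages_per_sec = limit_bps / charged;            /* ~8192        */
    double wire_bps      = pages_per_sec * on_wire;        /* ~64 KB/s     */

    printf("utilization: %.2f%%\n", 100.0 * on_wire / charged);  /* ~0.20%  */
    printf("wire rate:   %.0f bytes/s\n", wire_bps);
    return 0;
}

At that pace the scanner gets through about 8192 pages per second, so it takes on the order of two minutes just to walk past the zero pages of a 4 GB guest before touching the pages that matter.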

Even from an aesthetic point of view, if there is such a thing, I don't understand why you advocate conflating network bandwidth and CPU usage into a single measurement. Nobody disagrees that all you propose is nice to have, and that what Juan sent is a stopgap measure (though a very effective one). However, this doesn't negate that Juan's accounting patches make a lot of sense in the current design.

Juan's patch, IIUC, does the following: If you've been iterating in a tight loop, return to the main loop for *one* iteration every 50ms.
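Roughly, I read it as something like the following (a sketch only; the helper declarations and the clock call are stand-ins, not the actual patch):

#include <stdint.h>
#include <sys/time.h>

/* Illustrative stand-ins for the real QEMU pieces. */
typedef struct QEMUFile QEMUFile;
int qemu_file_rate_limit(QEMUFile *f);   /* nonzero when the byte budget is used up */
int ram_save_block(QEMUFile *f);         /* sends one dirty page, 0 if none left    */

static int64_t get_time_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (int64_t)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Sketch of "exit the loop if we have been there too long": leave the
 * RAM-save loop after ~50ms so the main loop gets one iteration, then
 * resume on the next call. */
static void ram_save_iterate_sketch(QEMUFile *f)
{
    int64_t start_ms = get_time_ms();

    while (!qemu_file_rate_limit(f)) {
        if (ram_save_block(f) == 0) {           /* no dirty page found   */
            break;
        }
        if (get_time_ms() - start_ms > 50) {    /* been here ~50ms: yield */
            break;
        }
    }
}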

But this means that during this 50ms period of time, a VCPU may be blocked from running. If the guest isn't doing a lot of device I/O *and* you're on a relatively low link speed, then this will mean that you don't hold qemu_mutex for more than 50ms at a time.

But in the degenerate case where you have a high speed link and you have a guest doing a lot of device I/O, you'll see the guest VCPU being blocked for 50ms, then getting to run for a very brief period of time, followed by another block for 50ms. The guest's execution will be extremely sporadic.

This isn't fixable with this approach. The only way to really fix this is to say that over a given period of time, migration may only consume XX amount of CPU time, which guarantees that the VCPUs get the qemu_mutex for the rest of the time.

This is exactly what rate limiting does. Yes, it results in a longer migration time but that's the trade-off we have to make if we want deterministic VCPU execution until we can implement threading properly.
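For concreteness, such a CPU-time budget could look something like this (the constants are placeholders standing in for the "XX" above, and migration_may_run()/migration_account() are made-up helpers, not existing QEMU code):

#include <stdint.h>

/* Illustrative CPU-time budget: migration may do at most BUDGET_MS of
 * work in every WINDOW_MS window; the VCPUs get the mutex for the rest.
 * Both constants are placeholders, not a proposal for actual values. */
#define WINDOW_MS  100
#define BUDGET_MS   10

static int64_t window_start_ms;
static int64_t used_ms;

static int migration_may_run(int64_t now_ms)
{
    if (now_ms - window_start_ms >= WINDOW_MS) {   /* start a new window */
        window_start_ms = now_ms;
        used_ms = 0;
    }
    return used_ms < BUDGET_MS;
}

static void migration_account(int64_t elapsed_ms)
{
    used_ms += elapsed_ms;          /* charge the work we just did */
}

The caller would check migration_may_run() before each chunk of work and charge the elapsed time afterwards; everything else in the window is left to the VCPUs.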

If you want a simple example, do I/O with the rtl8139 adapter while doing your migration test, and run a tight loop in the guest calling gettimeofday(). Graph the results to see how much execution time the guest is actually getting.
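Something along these lines, run inside the guest while the migration is in progress (the 5ms threshold is arbitrary):

#include <stdio.h>
#include <sys/time.h>

/* Tight gettimeofday() loop: any large gap between two consecutive
 * calls is time the VCPU did not get to run.  Log the stalls and
 * graph them afterwards. */
int main(void)
{
    struct timeval prev, now;
    gettimeofday(&prev, NULL);

    for (;;) {
        gettimeofday(&now, NULL);
        long gap_us = (now.tv_sec - prev.tv_sec) * 1000000L
                    + (now.tv_usec - prev.tv_usec);
        if (gap_us > 5000) {        /* stalled for more than 5ms */
            printf("%ld.%06ld stalled %ld us\n",
                   (long)now.tv_sec, (long)now.tv_usec, gap_us);
        }
        prev = now;
    }
    return 0;
}

Every printed line is a stretch of time during which the VCPU was not scheduled; plotting the gaps over the course of the migration makes the stalls obvious.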


In the long term, we need a new dirty bit interface from kvm.ko that
uses a multi-level table. That should dramatically improve scan
performance. We also need to implement live migration in a separate
thread that doesn't carry qemu_mutex while it runs.
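(To sketch why a multi-level table helps the scan; made-up sizes, not the kvm.ko interface: one top-level bit covers a whole chunk of pages, so mostly-clean memory is skipped in big strides.)

#include <limits.h>

/* Two-level dirty tracking, purely illustrative: a level-1 bit covers a
 * whole chunk of pages, so the scanner skips clean chunks without
 * looking at the per-page bits at all. */
#define NB_PAGES         (1UL << 20)     /* e.g. 4GB guest, 4K pages  */
#define PAGES_PER_CHUNK  4096UL          /* pages per level-1 bit     */
#define NB_CHUNKS        (NB_PAGES / PAGES_PER_CHUNK)
#define BITS_PER_LONG    (sizeof(long) * CHAR_BIT)

static unsigned long chunk_dirty[NB_CHUNKS / BITS_PER_LONG + 1]; /* level 1 */
static unsigned long page_dirty[NB_PAGES / BITS_PER_LONG + 1];   /* level 2 */

static int test_bit_ul(const unsigned long *map, unsigned long nr)
{
    return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1;
}

static unsigned long count_dirty_pages(void)
{
    unsigned long chunk, page, dirty = 0;

    for (chunk = 0; chunk < NB_CHUNKS; chunk++) {
        if (!test_bit_ul(chunk_dirty, chunk)) {
            continue;                       /* whole chunk clean: skip it */
        }
        for (page = chunk * PAGES_PER_CHUNK;
             page < (chunk + 1) * PAGES_PER_CHUNK; page++) {
            dirty += test_bit_ul(page_dirty, page);
        }
    }
    return dirty;
}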

This may be a good way to fix it, but it's also basically a rewrite.

The only correct short term solution I can see is rate limiting, unfortunately.

Regards,

Anthony Liguori




