[Qemu-devel] Re: [PATCH 10/10] Maintaing number of dirty pages
From: Juan Quintela
Subject: [Qemu-devel] Re: [PATCH 10/10] Maintaing number of dirty pages
Date: Wed, 01 Dec 2010 19:51:16 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
Anthony Liguori <address@hidden> wrote:
> On 12/01/2010 11:01 AM, Avi Kivity wrote:
>> On 12/01/2010 06:56 PM, Anthony Liguori wrote:
>>> On 12/01/2010 10:52 AM, Avi Kivity wrote:
>>>> On 12/01/2010 06:49 PM, Anthony Liguori wrote:
>>>>>> We need actual measurements instead of speculations.
>>>>>
>>>>>
>>>>> Yes, I agree 100%. I think the place to start is what I
>>>>> suggested in a previous note in this thread, we need to measure
>>>>> actual stall time in the guest.
>>>>
>>>> I'd actually start at the host. How much time does
>>>> ioctl(KVM_GET_DIRTY_LOG) take? What's the percentage of time
>>>> qemu_mutex is held?
>>>
>>> The question is, what really are the symptoms of the problem. It's
>>> not necessarily a bad thing if KVM_GET_DIRTY_LOG takes a long time while
>>> qemu_mutex is held.
>>
>> Whether or not qemu_mutex is held, long KVM_GET_DIRTY_LOG runtimes
>> are bad, since they are a lower bound on your downtime. And
>> KVM_GET_DIRTY_LOG does a lot of work, and invokes
>> synchronize_srcu_expedited(), which can be very slow.
>
> That's fine, and you're right, it's a useful thing to do, but this
> series originated because of a problem and we ought to make sure we
> capture what the actual problem is. That's not to say we shouldn't
> improve things that could stand to be improved.
>
>>>
>>> Is the problem that the monitor responds slowly? Is the problem
>>> that the guest isn't consistently getting execution time? Is the
>>> problem simply that the guest isn't getting enough total execution
>>> time?
>>
>> All three can happen if qemu_mutex is held too long.
>
> Right, but I'm starting to think that the root of the problem is not
> that it's being held too long but that it's being held too often.
OK, yesterday I tested dropping qemu_mutex in ram_save_block (a crude
change: just qemu_mutex_unlock_iothread(); loop;
qemu_mutex_lock_iothread();).
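
The unlock/loop/lock shape described above can be sketched outside QEMU as follows. This is not the actual patch: a plain pthread mutex (big_lock) stands in for qemu_mutex, and send_pages_unlocked() and pages_sent are hypothetical names chosen only for illustration.

```c
#include <pthread.h>

/* Stand-in for the global qemu_mutex (assumption: a plain pthread mutex). */
pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical counter, touched only by the sender in this sketch. */
long pages_sent;

/*
 * Called with big_lock held; returns with big_lock held.
 * The lock is released for the duration of the loop so that other
 * threads (for example vcpu threads) can take it in the meantime.
 */
long send_pages_unlocked(long npages)
{
    long i;

    pthread_mutex_unlock(&big_lock);
    for (i = 0; i < npages; i++) {
        pages_sent++;   /* placeholder for the real per-page work */
    }
    pthread_mutex_lock(&big_lock);

    return pages_sent;
}
```

Dropping the lock like this is only safe if nothing inside the loop depends on the lock for protection, which is presumably why the change is called crude.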
As requested by Anthony, I measured in the guest how big the stalls
are. The code is:
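    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    int main(void)
    {
        struct timeval t0, t1;

        while (1) {
            if (gettimeofday(&t0, NULL) != 0)
                perror("gettimeofday 1");
            if (usleep(100) != 0)
                perror("usleep");
            if (gettimeofday(&t1, NULL) != 0)
                perror("gettimeofday 2");
            /* t1 = t1 - t0, borrowing one second if needed */
            t1.tv_usec -= t0.tv_usec;
            if (t1.tv_usec < 0) {
                t1.tv_usec += 1000000;
                t1.tv_sec--;
            }
            t1.tv_sec -= t0.tv_sec;
            /* report any sleep that took longer than 5ms */
            if (t1.tv_sec || t1.tv_usec > 5000)
                printf("delay of %ld\n",
                       (long)t1.tv_sec * 1000000 + (long)t1.tv_usec);
        }
    }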
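
A variant of the same probe, sketched here on the assumption that clock_gettime(CLOCK_MONOTONIC) is available in the guest. Unlike gettimeofday(), the monotonic clock is not affected by discontinuous jumps in the wall clock, so a time adjustment cannot show up as a false stall. The helper names elapsed_us() and probe_once() are made up for this example and are not part of the test above.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Microseconds from *a to *b; assumes *b is not earlier than *a. */
long long elapsed_us(const struct timespec *a, const struct timespec *b)
{
    long long ns = (long long)(b->tv_sec - a->tv_sec) * 1000000000LL
                 + (b->tv_nsec - a->tv_nsec);
    return ns / 1000;
}

/*
 * Sleep for sleep_us microseconds and return how long the sleep really
 * took, in microseconds. Returns -1 if the clock could not be read.
 */
long long probe_once(long sleep_us)
{
    struct timespec t0, t1, req;

    req.tv_sec = sleep_us / 1000000L;
    req.tv_nsec = (sleep_us % 1000000L) * 1000L;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1;
    nanosleep(&req, NULL);  /* may return early on a signal; fine for a probe */
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1;

    return elapsed_us(&t0, &t1);
}
```

A caller would loop on probe_once(100) and print the result whenever it exceeds 5000, matching the 5ms threshold used above.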
I ran it in a completely idle guest with 8 vcpus. When idle, only
a few stalls in the 5-8ms range happen (as expected).
(This is after my series.)
As soon as I start migration, we get several stalls in the 15-200ms
range. Note that the stalls are not bigger because I limit the time that
qemu_mutex is held by the iothread to 50ms each time.
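
The 50ms cap can be pictured as a work loop that checks a deadline and stops early so that the caller can release the lock. The following is a sketch of that general idea, not the code from the series; now_ms(), run_with_budget(), the work_fn callback type and countdown_step() are all assumptions introduced for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Monotonic time in milliseconds, or -1 on error. */
long long now_ms(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Hypothetical unit of work: does one step, returns nonzero if more remains. */
typedef int (*work_fn)(void *opaque);

/*
 * Run step() until it reports no more work or budget_ms has elapsed.
 * Returns the number of steps executed. If the caller holds a lock
 * around this call, the hold time is bounded by roughly budget_ms
 * plus the duration of one step.
 */
long run_with_budget(work_fn step, void *opaque, long budget_ms)
{
    long long start = now_ms();
    long steps = 0;

    for (;;) {
        int more = step(opaque);

        steps++;
        if (!more)
            break;                      /* all work done */
        if (now_ms() - start >= budget_ms)
            break;                      /* budget used up, yield */
    }
    return steps;
}

/* Example step: decrement a counter until it reaches zero. */
int countdown_step(void *opaque)
{
    int *left = opaque;

    (*left)--;
    return *left > 0;
}
```

With a budget of 50, a caller would invoke run_with_budget() while holding the lock, release the lock, and come back later for the remaining work.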
Doing the crude qemu_mutex drop in ram_save_live means that these
ministalls get much smaller, in the 10-15ms range (with a rare few at 20ms).
And then we have a stall of around 120ms during the non-live part of
the migration. I can't find where this stall comes from (i.e. saving
all the remaining pages and the normal sections takes much less time). On
the other hand, I have no instrumentation yet to measure how long it
takes to move to the other host and restart there.
So we are still not there, but now we have only a single 120ms stall in
the guest, versus the 1-4 second ones that we used to have.
I don't have access to these machines until next week, so I am spending
this week implementing the ideas given in this thread.
Later, Juan.