From: Max Reitz
Subject: Re: [Qemu-block] [PATCH 1/1] Improve block job rate limiting for small bandwidth values
Date: Tue, 5 Jul 2016 18:56:49 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1

On 04.07.2016 16:30, Sascha Silbe wrote:
> Dear Max,
> 
> Max Reitz <address@hidden> writes:
> 
>> On 28.06.2016 17:28, Sascha Silbe wrote:
> [block/mirror.c]
>>> @@ -416,7 +416,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>>>          assert(io_sectors);
>>>          sector_num += io_sectors;
>>>          nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk);
>>> -        delay_ns += ratelimit_calculate_delay(&s->limit, io_sectors);
>>> +        if (s->common.speed) {
>>> +            delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors);
>>> +        }
>>
>> Hmm... Was it a bug that ratelimit_calculate_delay() was called
>> unconditionally before?
> 
> One could argue either way. It happened to work because
> ratelimit_calculate_delay() only delayed the _second_ time (within one
> time slice) the quota was exceeded. With zero duration time slices,
> there never was a second time.
> 
> With the new implementation we would divide by zero when slice_quota is
> 0, so we need to guard against that. Most callers already did, only
> mirror_iteration() needed to be adjusted.
> 
> 
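Right. Summing up for the archive (a rough sketch only, using the field
names from your patch, and assuming that with speed == 0
ratelimit_set_speed() is never called, so slice_ns and slice_quota both
stay 0):

    /* block/mirror.c as patched, simplified: skip the rate limiter
     * entirely when no speed limit is set.  The old code merely
     * happened to work here because slice_ns == 0 made every call
     * start a fresh slice with dispatched == 0; the new code asserts
     * slice_quota != 0 and divides by it, so the caller must guard. */
    if (s->common.speed) {
        delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors);
    }
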
> [...]
> [include/qemu/ratelimit.h]
>>>  static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
>>>  {
>>>      int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>> +    uint64_t delay_slices;
>>>  
>>> -    if (limit->next_slice_time < now) {
>>> -        limit->next_slice_time = now + limit->slice_ns;
>>> +    assert(limit->slice_quota && limit->slice_ns);
>>> +
>>> +    if (limit->slice_end_time < now) {
>>> +        /* Previous, possibly extended, time slice finished; reset the
>>> +         * accounting. */
>>> +        limit->slice_start_time = now;
>>> +        limit->slice_end_time = now + limit->slice_ns;
>>>          limit->dispatched = 0;
>>>      }
>>> -    if (limit->dispatched == 0 || limit->dispatched + n <= limit->slice_quota) {
>>> -        limit->dispatched += n;
>>> +
>>> +    limit->dispatched += n;
>>> +    if (limit->dispatched < limit->slice_quota) {
>>
>> Nitpick: This should probably stay <=.
> 
> This is a subtle edge case. Previously, when limit->dispatched ==
> limit->slice_quota, we returned 0 so that the _current_ request (which
> is still within quota) wouldn't be delayed. Now, we return a delay so
> that the _next_ request (which would be over quota) will be delayed.

Hm, but that depends on the size of the next request. Of course, if we
get limit->dispatched == limit->slice_quota, we know for sure that we
need to delay the next request. But if we get limit->dispatched ==
limit->slice_quota - 1, then we probably also have to delay it, but we
don't know for sure.

So I think it would be better to have a small but consistent systematic
error here, i.e. that we will not delay the last request even though we
should. Or you could insert a delay after the last request in all block
jobs, too.

Or did I fail to understand the issue? I'm not sure.
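
To make sure we mean the same thing, here is the example I have in mind
(a standalone sketch, nothing QEMU-specific; a quota of 10 per slice,
requests of 5, and it ignores that the caller would actually sleep once
a delay is returned):

    #include <stdio.h>

    int main(void)
    {
        unsigned quota = 10, dispatched = 0, n = 5;

        for (int req = 1; req <= 3; req++) {
            dispatched += n;
            /* "<" is the patch as posted, "<=" is the nitpick above */
            int delay_lt = !(dispatched < quota);
            int delay_le = !(dispatched <= quota);
            /* same ceiling division as in the patch */
            unsigned slices = (dispatched + quota - 1) / quota;
            printf("req %d: dispatched=%u  '<' delays=%d  '<=' delays=%d"
                   "  delay_slices=%u\n",
                   req, dispatched, delay_lt, delay_le, slices);
        }
        return 0;
    }

So with "<" the job already sleeps after request 2, which was still
within quota; with "<=" the delay only starts at request 3, and the
excess is paid for by the slice being extended to two slice_ns periods.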

> [...]
>>> +    /* Quota exceeded. Calculate the next time slice we may start
>>> +     * sending data again. */
>>> +    delay_slices = (limit->dispatched + limit->slice_quota - 1) /
>>> +        limit->slice_quota;
>>> +    limit->slice_end_time = limit->slice_start_time +
>>> +        delay_slices * limit->slice_ns;
>>
>> I think it would make sense to make this a floating point calculation.
> 
> Then we'd have fully variable length time slices, instead of just
> "occupying" multiple fixed-length time slices with a single
> request. Maybe that would be even better, or maybe we'd cause other
> interesting things to happen (think interactions with the scheduler).

:-)

> As this code will hopefully disappear during the 2.8 timeline anyway,
> I'd prefer to go with the lowest-risk option that is enough to fix the
> race conditions encountered by the test suite.

OK with me.
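
For the record, the floating-point variant I had in mind was roughly the
following (untested sketch, reusing the field names from your patch):

    /* Extend the slice by the exact fraction of the quota consumed
     * instead of by a whole number of slices. */
    limit->slice_end_time = limit->slice_start_time +
        (int64_t)((double)limit->dispatched / limit->slice_quota *
                  limit->slice_ns);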

Max

>> If you don't agree, though:
>>
>> Reviewed-by: Max Reitz <address@hidden>
> 
> Thanks for the review!
> 
> Sascha
> 

