From: Max Reitz
Subject: Re: [Qemu-block] [PATCH 1/1] Improve block job rate limiting for small bandwidth values
Date: Sat, 2 Jul 2016 15:22:11 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1
On 28.06.2016 17:28, Sascha Silbe wrote:
> ratelimit_calculate_delay() previously reset the accounting every time
> slice, no matter how much data had been processed before. This had (at
> least) two consequences:
>
> 1. The minimum speed is rather large, e.g. 5 MiB/s for commit and stream.
>
> Not sure if there are real-world use cases where this would be a
> problem. Mirroring and backup over a slow link (e.g. DSL) would
> come to mind, though.
>
> 2. Tests for block job operations (e.g. cancel) were rather racy
>
> All block jobs currently use a time slice of 100ms. That's a
> reasonable value to get smooth output during regular
> operation. However, this also meant that the state of block jobs
> changed every 100ms, no matter how low the configured limit was. On
> busy hosts, qemu often transferred additional chunks before the test
> case had a chance to cancel the job.
>
> Fix the block job rate limit code to delay for more than one time
> slice to address the above issues. To make it easier to handle
> oversized chunks we switch the semantics from returning a delay
> _before_ the current request to a delay _after_ the current
> request. If necessary, this delay consists of multiple time slice
> units.
>
> Since the mirror job sends multiple chunks in one go even if the rate
> limit was exceeded in between, we need to keep track of the start of
> the current time slice so we can correctly re-compute the delay for
> the updated amount of data.
>
> The minimum bandwidth now is 1 data unit per time slice. The block
> jobs are currently passing the amount of data transferred in sectors
> and using 100ms time slices, so this translates to 5120
> bytes/second. With chunk sizes usually being O(512KiB), tests have
> plenty of time (O(100s)) to operate on block jobs. The chance of a
> race condition now is fairly remote, except possibly on insanely
> loaded systems.
>
> Signed-off-by: Sascha Silbe <address@hidden>
> ---
> block/commit.c | 13 +++++--------
> block/mirror.c | 4 +++-
> block/stream.c | 12 ++++--------
> include/qemu/ratelimit.h | 43 ++++++++++++++++++++++++++++++++++---------
> 4 files changed, 46 insertions(+), 26 deletions(-)
>
[...]
> diff --git a/block/mirror.c b/block/mirror.c
> index a04ed9c..d9d70f0 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -416,7 +416,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
> assert(io_sectors);
> sector_num += io_sectors;
> nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk);
> - delay_ns += ratelimit_calculate_delay(&s->limit, io_sectors);
> + if (s->common.speed) {
> + delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors);
> + }
Hmm... Was it a bug that ratelimit_calculate_delay() was called
unconditionally before?
> }
> return delay_ns;
> }
[...]
> diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
> index d413a4a..12db769 100644
> --- a/include/qemu/ratelimit.h
> +++ b/include/qemu/ratelimit.h
> @@ -15,34 +15,59 @@
> #define QEMU_RATELIMIT_H 1
>
> typedef struct {
> - int64_t next_slice_time;
> + int64_t slice_start_time;
> + int64_t slice_end_time;
> uint64_t slice_quota;
> uint64_t slice_ns;
> uint64_t dispatched;
> } RateLimit;
>
> +/** Calculate and return delay for next request in ns
> + *
> + * Record that we sent @p n data units. If we may send more data units
> + * in the current time slice, return 0 (i.e. no delay). Otherwise
> + * return the amount of time (in ns) until the start of the next time
> + * slice that will permit sending the next chunk of data.
> + *
> + * Recording sent data units even after exceeding the quota is
> + * permitted; the time slice will be extended accordingly.
> + */
> static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
> {
> int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> + uint64_t delay_slices;
>
> - if (limit->next_slice_time < now) {
> - limit->next_slice_time = now + limit->slice_ns;
> + assert(limit->slice_quota && limit->slice_ns);
> +
> + if (limit->slice_end_time < now) {
> + /* Previous, possibly extended, time slice finished; reset the
> + * accounting. */
> + limit->slice_start_time = now;
> + limit->slice_end_time = now + limit->slice_ns;
> limit->dispatched = 0;
> }
> - if (limit->dispatched == 0 || limit->dispatched + n <= limit->slice_quota) {
> - limit->dispatched += n;
> +
> + limit->dispatched += n;
> + if (limit->dispatched < limit->slice_quota) {
Nitpick: This should probably stay <=.
> + /* We may send further data within the current time slice, no
> + * need to delay the next request. */
> return 0;
> - } else {
> - limit->dispatched = n;
> - return limit->next_slice_time - now;
> }
> +
> + /* Quota exceeded. Calculate the next time slice we may start
> + * sending data again. */
> + delay_slices = (limit->dispatched + limit->slice_quota - 1) /
> + limit->slice_quota;
> + limit->slice_end_time = limit->slice_start_time +
> + delay_slices * limit->slice_ns;
I think it would make sense to make this a floating point calculation.
If you don't agree, though:
Reviewed-by: Max Reitz <address@hidden>
> + return limit->slice_end_time - now;
> }
>
> static inline void ratelimit_set_speed(RateLimit *limit, uint64_t speed,
> uint64_t slice_ns)
> {
> limit->slice_ns = slice_ns;
> - limit->slice_quota = ((double)speed * slice_ns)/1000000000ULL;
> + limit->slice_quota = MAX(((double)speed * slice_ns) / 1000000000ULL, 1);
> }
>
> #endif
>