Hi all!
This series turns backup into a series of block_copy_async calls covering
the whole disk, so we get block-status-based parallel async requests
out of the box, which gives a performance gain:
All results are in seconds

-----------------  -----------  -------------  --------------  ---------------------  --------------------------------  ------------------------------------
                   A            B              C               D                      E                                 F
                   mirror(old)  backup(old)    backup(old)     backup(new)            backup(new)                       backup(new)
                                copy-range=on  copy-range=off                         copy-range=on                     copy-range=on
                                                                                                                        max-workers=1
hdd-ext4:hdd-ext4  19           20             21 ± 14%        19                     51 ± 12%                          22 ± 24%
                                A+5%           A+12% B+6%      A+3% B-2% C-8%         A+174% B+161% C+145% D+165%       A+18% B+12% C+5% D+14% E-57%
hdd-ext4:ssd-ext4  8.7          9.4 ± 3%       9.6 ± 2%        8.8                    24 ± 2%                           8.9
                                A+8%           A+10% B+2%      A+1% B-7% C-9%         A+174% B+155% C+149% D+173%       A+2% B-5% C-8% D+1% E-63%
ssd-ext4:hdd-ext4  9            12 ± 9%        11 ± 7%         9.7 ± 7%               11 ± 2%                           10 ± 3%
                                A+36%          A+28% B-6%      A+7% B-21% C-16%       A+21% B-11% C-5% D+13%            A+16% B-14% C-9% D+8% E-4%
ssd-ext4:ssd-ext4  4.4          11 ± 4%        10 ± 3%         4.7                    5.7                               10 ± 5%
                                A+143%         A+134% B-4%     A+6% B-56% C-55%       A+30% B-46% C-45% D+22%           A+133% B-4% C-1% D+119% E+79%
hdd-xfs:hdd-xfs    19           20 ± 3%        20              20                     45 ± 4%                           19
                                A+3%           A+4% B+1%       A+3% B+0% C-1%         A+131% B+125% C+122% D+125%       A-1% B-4% C-4% D-3% E-57%
hdd-xfs:ssd-xfs    9.1          9.9 ± 4%       9.5             9.1 ± 3%               23 ± 2%                           9.2
                                A+8%           A+4% B-4%       A+0% B-8% C-4%         A+151% B+132% C+142% D+151%       A+1% B-7% C-3% D+1% E-60%
ssd-xfs:hdd-xfs    9.1          11 ± 9%        11              9.5 ± 4%               12 ± 22%                          11 ± 3%
                                A+16%          A+22% B+6%      A+4% B-10% C-15%       A+32% B+14% C+8% D+26%            A+18% B+2% C-4% D+13% E-10%
ssd-xfs:ssd-xfs    4.1          8.7 ± 7%       9.2 ± 5%        4.5 ± 2%               5.7 ± 3%                          9.7 ± 5%
                                A+113%         A+126% B+6%     A+11% B-48% C-51%      A+40% B-34% C-38% D+27%           A+138% B+12% C+5% D+115% E+70%
ssd-ext4:nbd       9.1 ± 2%     37             37 ± 2%         11                     11 ± 3%                           19 ± 2%
                                A+302%         A+304% B+1%     A+18% B-71% C-71%      A+18% B-71% C-71% D+0%            A+106% B-49% C-49% D+74% E+75%
nbd:ssd-ext4       9            30 ± 3%        31              9                      9                                 17
                                A+237%         A+245% B+2%     A+0% B-70% C-71%       A+0% B-70% C-71% D+0%             A+93% B-43% C-44% D+93% E+93%
-----------------  -----------  -------------  --------------  ---------------------  --------------------------------  ------------------------------------
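
The second line of each table row compares that cell with the columns to
its left: "B-56%" under column D means D took 56% less time than B. A
minimal sketch of that arithmetic, assuming the usual relative-difference
formula (rel_diff and label are hypothetical names, not part of
simplebench). The published percentages were presumably computed from
unrounded timings, so recomputing them from the rounded cells can be off
by a point or so:

```python
def rel_diff(value, reference):
    """Return how much `value` differs from `reference`, in percent."""
    return (value - reference) / reference * 100


def label(column, value, reference):
    """Format a comparison cell, e.g. 'B-57%' (column letter, sign, percent)."""
    return f"{column}{rel_diff(value, reference):+.0f}%"


# ssd-ext4:ssd-ext4, column D (4.7 s) against column B (11 s), rounded inputs
print(label("B", 4.7, 11))
# nbd:ssd-ext4, column D (9 s) against column A (9 s)
print(label("A", 9, 9))
```

With the rounded inputs the first call gives B-57%, one point away from
the B-56% in the table, which is consistent with the table having been
computed before rounding.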
In the table, column B is the current backup and column D is the new
backup with default parameters.
Mirror (column A) is still faster, but the new backup is very close to it:
column D is within 0-18% of column A in every test.
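
To illustrate the idea of block-status-based parallel requests, here is a
rough Python sketch. It is not the QEMU implementation (that is C
coroutine code in block/block-copy.c); copy_allocated, split_chunks and
the in-memory buffers are hypothetical stand-ins, and only the max_workers
and max_chunk names are borrowed from the parameters this series adds. The
list of extents plays the role of block-status information: only
allocated ranges are split into chunks and copied, by a bounded number of
concurrent workers.

```python
import asyncio


def split_chunks(extents, max_chunk):
    """Split allocated (offset, length) extents into chunks of at most max_chunk bytes."""
    for offset, length in extents:
        end = offset + length
        while offset < end:
            size = min(max_chunk, end - offset)
            yield offset, size
            offset += size


async def copy_allocated(src, dst, extents, max_workers=4, max_chunk=4):
    """Copy only the allocated extents from src to dst with max_workers workers."""
    queue = asyncio.Queue()
    for chunk in split_chunks(extents, max_chunk):
        queue.put_nowait(chunk)

    async def worker():
        # No await between empty() and get_nowait(), so this is race-free
        # within a single event loop.
        while not queue.empty():
            offset, size = queue.get_nowait()
            await asyncio.sleep(0)  # stand-in for an asynchronous read + write
            dst[offset:offset + size] = src[offset:offset + size]

    await asyncio.gather(*(worker() for _ in range(max_workers)))


# Two allocated 8-byte extents separated by an unallocated (zero) hole.
src = b"abcdefgh" + bytes(8) + b"ijklmnop"
dst = bytearray(len(src))
asyncio.run(copy_allocated(src, dst, [(0, 8), (16, 8)]))
```

The hole between the two extents is never read or written; the target
still matches the source because it starts out zeroed.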
v3:
01: add Max's r-b
02: change to perf.use-copy-range
03: add Max's r-b
04: - more explicit finish status of async block_copy
    - block_copy_async always returns non-NULL
    - personal opaque for new cb
05: - new arguments added in this patch
    - no default value for arguments in block_copy_async()
06: new
07: - caller does _kick() by hand
    - grammar in commit msg
    - add new parameter in _this_ patch
    - switch to opposite ignore_ratelimit
08: cancel is now async
09,10: add Max's r-b
11: changed a lot
12: add timeout
14: rebase on x-perf, keep r-b
15: rebase on x-perf
16: rebase on x-perf, keep r-b
17,18: new
19: now only backup.c is changed in this patch, changed a lot
20,21: new
22: rebased, keep r-b
23: new, split from 24
24: drop unrelated change (now patch23), keep r-b
25: changed a lot, explicitly specify options for each env (test table column)

To run the benchmark, do the following:

Prepare images: in each directory where you want to place source and
target images, run:

    for img in test-source test-target; do
        ./qemu-img create -f raw $img 1000M;
        ./qemu-img bench -c 1000 -d 1 -f raw -s 1M -w --pattern=0xff $img
    done

Prepare a similar image for the NBD server, and start the server
somewhere with:

    qemu-nbd --persistent --nocache -f raw IMAGE

Then run the benchmark like this:

    ./bench-backup.py \
        --env old:/work/src/qemu/up-backup-block-copy-master/build/qemu-system-x86_64,mirror \
              old,copy-range=on old,copy-range=off \
              new:../../build/qemu-system-x86_64 \
              new,copy-range=on new,copy-range=on,max-workers=1 \
        --dir hdd-ext4:/test-a hdd-xfs:/test-b ssd-ext4:/ssd ssd-xfs:/ssd-xfs \
        --test $(for fs in ext4 xfs; do echo hdd-$fs:hdd-$fs hdd-$fs:ssd-$fs ssd-$fs:hdd-$fs ssd-$fs:ssd-$fs; done) \
        --nbd 192.168.100.5 --test ssd-ext4:nbd nbd:ssd-ext4

(You can simply reduce the number of directories and test cases; use
--help for help.)

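
The shell loop passed to --test in the command above expands to eight
src:dst pairs. For readers who prefer to see the list spelled out, this
small Python sketch builds the same names in the same order (test_cases
is a hypothetical helper, not part of simplebench):

```python
def test_cases(filesystems=("ext4", "xfs"), media=("hdd", "ssd")):
    """Build 'src:dst' test names the way the shell loop does:
    for each filesystem, every combination of source and target medium."""
    cases = []
    for fs in filesystems:
        for src in media:
            for dst in media:
                cases.append(f"{src}-{fs}:{dst}-{fs}")
    return cases


print(" ".join(test_cases()))
```

Passing a shorter tuple, for example test_cases(filesystems=("ext4",)),
gives the reduced set of cases if only some directories are available.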
Vladimir Sementsov-Ogievskiy (25):
iotests: 129 don't check backup "busy"
qapi: backup: add perf.use-copy-range parameter
block/block-copy: More explicit call_state
block/block-copy: implement block_copy_async
block/block-copy: add max_chunk and max_workers parameters
block/block-copy: add list of all call-states
block/block-copy: add ratelimit to block-copy
block/block-copy: add block_copy_cancel
blockjob: add set_speed to BlockJobDriver
job: call job_enter from job_user_pause
qapi: backup: add max-chunk and max-workers to x-perf struct
iotests: 56: prepare for backup over block-copy
iotests: 129: prepare for backup over block-copy
iotests: 185: prepare for backup over block-copy
iotests: 219: prepare for backup over block-copy
iotests: 257: prepare for backup over block-copy
block/block-copy: make progress_bytes_callback optional
block/backup: drop extra gotos from backup_run()
backup: move to block-copy
qapi: backup: disable copy_range by default
block/block-copy: drop unused block_copy_set_progress_callback()
block/block-copy: drop unused argument of block_copy()
simplebench/bench_block_job: use correct shebang line with python3
simplebench: bench_block_job: add cmd_options argument
simplebench: add bench-backup.py

 qapi/block-core.json                   |  26 ++-
 block/backup-top.h                     |   1 +
 include/block/block-copy.h             |  58 ++++-
 include/block/block_int.h              |   3 +
 include/block/blockjob_int.h           |   2 +
 block/backup-top.c                     |   6 +-
 block/backup.c                         | 233 ++++++++++++-------
 block/block-copy.c                     | 227 +++++++++++++++---
 block/replication.c                    |   2 +
 blockdev.c                             |  14 ++
 blockjob.c                             |   6 +
 job.c                                  |   1 +
 scripts/simplebench/bench-backup.py    | 165 +++++++++++++
 scripts/simplebench/bench-example.py   |   2 +-
 scripts/simplebench/bench_block_job.py |  13 +-
 tests/qemu-iotests/056                 |   9 +-
 tests/qemu-iotests/129                 |   3 +-
 tests/qemu-iotests/185                 |   3 +-
 tests/qemu-iotests/185.out             |   2 +-
 tests/qemu-iotests/219                 |  13 +-
 tests/qemu-iotests/257                 |   1 +
 tests/qemu-iotests/257.out             | 306 ++++++++++++-------------
 22 files changed, 798 insertions(+), 298 deletions(-)
 create mode 100755 scripts/simplebench/bench-backup.py