[Qemu-devel] [PATCH 00/47] Block job improvements for 1.2
From: Paolo Bonzini
Subject: [Qemu-devel] [PATCH 00/47] Block job improvements for 1.2
Date: Tue, 24 Jul 2012 13:03:38 +0200
Hi all, this is the first non-RFC submission of my block job patches
for 1.2. Everything is there, including multiple in-flight operations
in the mirroring job and new testcases (for all of streaming, mirroring,
hierarchical bitmap). The tests use blkdebug to test error reporting
for both streaming and mirroring.
This still does not include a persistent dirty bitmap, which will be work
for 1.3.
If you want to tinker with this, everything is available at
git://github.com/bonzini/qemu.git in branch blkmirror-job-1.2.
I know it's a lot of code, and I'm sorry for dropping this so close to
the feature freeze. Unfortunately, preparing for the Linux merge window
and other non-QEMU tasks delayed this 1-2 weeks more than I would
have liked.
The patches are organized as follows:
01-12   preparatory work for block job errors, including support for
        pausing and resuming jobs
13-17   introduce block job errors, and add support in block-stream
18-26   preparatory work for block mirroring, including creating
        new functions out of existing code.
27-34   introduce a simple version of mirroring. The initial patch
        adds the mirroring logic, followed by the ability to switch to
        the destination of migration, to query the target file (for
        example, polling the high-water mark), and to handle errors
        during the job. All these changes come with testcases.
35-43   These patches introduce the first optimizations, namely supporting
        an arbitrary granularity for the dirty bitmap. The current default,
        1M, is too coarse to let the job converge quickly and in almost
        real-time. These patches reimplement the block device dirty bitmap
        to allow efficient iteration, and add cluster copy-on-write logic.
        Cluster copy-on-write is needed because management will want to
        start the copy before the backing file is in place in the destination;
        if mirroring takes care of copy-on-write, BDRV_O_NO_BACKING can be
        used even if the granularity is smaller than the cluster size.
44-47   A second round of optimizations, replacing serialized read-write
        operations with multiple asynchronous I/O operations. The various
        in-flight operations can be of arbitrary size. The initial copy
        will end up reading large chunks sequentially (10M by default),
        while subsequent passes can mimic more closely the guest's I/O
        patterns.
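
To make the dirty bitmap part of 35-43 a bit more concrete, here is a
rough Python sketch of the two ideas: a bitmap with a configurable
granularity whose upper level lets iteration skip clean regions, and
rounding a dirty range out to cluster boundaries for copy-on-write.
This is only an illustration, not the HBitmap code from the series.
TwoLevelBitmap, WORD, dirty_offsets and the byte-based round_to_clusters
are names and simplifications made up for this example.

```python
WORD = 64  # leaf bits summarized by one top-level bit (hypothetical choice)


class TwoLevelBitmap:
    """Toy two-level dirty bitmap; one leaf bit covers `granularity` bytes."""

    def __init__(self, size, granularity):
        self.granularity = granularity
        nbits = -(-size // granularity)          # ceiling division
        self.leaf = [0] * (-(-nbits // WORD))    # WORD chunks per entry
        self.summary = 0                         # bit w set iff leaf[w] != 0

    def set_dirty(self, offset, length):
        """Mark every chunk touched by [offset, offset + length) as dirty."""
        first = offset // self.granularity
        last = (offset + length - 1) // self.granularity
        for bit in range(first, last + 1):
            w, b = divmod(bit, WORD)
            self.leaf[w] |= 1 << b
            self.summary |= 1 << w

    def dirty_offsets(self):
        """Yield the byte offset of each dirty chunk in ascending order."""
        summary = self.summary
        while summary:
            w = (summary & -summary).bit_length() - 1   # lowest set bit
            summary &= summary - 1                      # clear that bit
            word = self.leaf[w]
            while word:
                b = (word & -word).bit_length() - 1
                word &= word - 1
                yield (w * WORD + b) * self.granularity


def round_to_clusters(offset, length, cluster_size):
    """Widen a byte range so it starts and ends on cluster boundaries."""
    start = offset - offset % cluster_size
    end = -(-(offset + length) // cluster_size) * cluster_size
    return start, end - start
```

With a 64K granularity and a 1M cluster size, a single dirty chunk
returned by dirty_offsets() would be widened by round_to_clusters() to
the whole 1M cluster that contains it, which is the case where the
mirror job has to do the copy-on-write itself.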
Compared to v1, the last four patches are entirely new, and so are many
of the testcase changes. All comments from Eric's review are addressed.
In some cases the patches were modified (reversing if conditions or things
like that) in order to keep later patches simpler. I also added several
new tracepoints.
Latency is vital to any migration scheme using a dirty bitmap, especially
because completion is entirely asynchronous, so I expect this to be used
either with pretty good storage, or on guests doing relatively little I/O.
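
As a back-of-the-envelope illustration of why this matters, the
following Python sketch models the job as a sequence of passes: each
pass copies whatever is dirty, and the guest keeps dirtying data while
the pass runs. The model and all of its names (passes_to_converge,
copy_rate, dirty_rate, threshold) are assumptions made up for this
example, not something taken from the patches, and it ignores
everything except two constant rates.

```python
def passes_to_converge(initial, copy_rate, dirty_rate, threshold,
                       max_passes=100):
    """Return the number of passes until at most `threshold` units are
    dirty, or None if that does not happen within `max_passes`.

    Crude model: a pass takes remaining / copy_rate seconds, and the
    guest dirties dirty_rate units per second during that time.
    """
    remaining = initial
    for n in range(1, max_passes + 1):
        duration = remaining / copy_rate
        remaining = dirty_rate * duration   # dirtied while we were copying
        if remaining <= threshold:
            return n
    return None
```

In this model the dirty amount shrinks by a factor of
dirty_rate / copy_rate per pass, so the job only converges when the
target can absorb writes faster than the guest produces them.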
I tested this both on my laptop and with moderately high-end SAS disks.
On the SAS disks, the time between checkpoints (trace_mirror_before_flush)
during kernel compilation (-j3 to -j12, 4 or 8 vCPUs) is almost always
within 1 second, and usually much less when targeting a local disk.
Hibernation, a worst-case test (sequential I/O happening with no flushes
in between) that failed completely to converge on my lowly laptop hard
disk, reached a checkpoint every 0.5 to 3 seconds. When targeting
a local qemu-nbd server, performance was similar. Kernel compilation
showed occasional bumps, but they were resolved within 1.5-7 seconds.
Please review!
Paolo Bonzini (47):
qapi: generalize documentation of streaming commands
qerror/block: introduce QERR_BLOCK_JOB_NOT_ACTIVE
block: move job APIs to separate files
block: add block_job_query
block: add support for job pause/resume
qmp: add block-job-pause and block-job-resume
qemu-iotests: add test for pausing a streaming operation
block: rename block_job_complete to block_job_completed
block: rename BlockErrorAction, BlockQMPEventAction
block: move BlockdevOnError declaration to QAPI
block: reorganize io error code
block: sort BlockDeviceIoStatus errors by severity
block: introduce block job error
stream: add on-error argument
blkdebug: process all set_state rules in the old state
qemu-iotests: map underscore to dash in QMP argument names
qemu-iotests: add tests for streaming error handling
block: live snapshot documentation tweaks
block: add bdrv_query_info
block: add bdrv_query_stats
block: add bdrv_ensure_backing_file
block: make device optional in BlockInfo
block: add target info to QMP query-blockjobs command
block: introduce new dirty bitmap functionality
block: add block-job-complete
block: introduce BLOCK_JOB_READY event
block: introduce mirror job
qmp: add drive-mirror command
mirror: support querying target file
mirror: implement completion
qemu-iotests: add mirroring test case
block: forward bdrv_iostatus_reset to block job
mirror: add support for on-source-error/on-target-error
qmp: add pull_event function
qemu-iotests: add testcases for mirroring on-source-error/on-target-error
host-utils: add ffsl and flsl
add hierarchical bitmap data type and test cases
block: implement dirty bitmap using HBitmap
block: make round_to_clusters public
mirror: perform COW if the cluster size is bigger than the granularity
block: return count of dirty sectors, not chunks
block: allow customizing the granularity of the dirty bitmap
mirror: allow customizing the granularity
mirror: switch mirror_iteration to AIO
mirror: add buf-size argument to drive-mirror
mirror: support more than one in-flight AIO operation
mirror: support arbitrarily-sized iterations
Makefile.objs | 5 +-
QMP/qmp-events.txt | 43 +++
QMP/qmp.py | 20 ++
block-migration.c | 8 +-
block.c | 486 ++++++++++++------------------
block.h | 37 ++-
block/Makefile.objs | 3 +-
block/blkdebug.c | 14 +-
block/mirror.c | 562 +++++++++++++++++++++++++++++++++++
block/stream.c | 33 +-
block_int.h | 192 +++---------
blockdev.c | 257 +++++++++++++---
blockjob.c | 290 ++++++++++++++++++
blockjob.h | 285 ++++++++++++++++++
hbitmap.c | 394 ++++++++++++++++++++++++
hbitmap.h | 51 ++++
hmp-commands.hx | 73 ++++-
hmp.c | 65 +++-
hmp.h | 4 +
host-utils.h | 45 +++
hw/fdc.c | 4 +-
hw/ide/core.c | 20 +-
hw/scsi-disk.c | 23 +-
hw/scsi-generic.c | 4 +-
hw/virtio-blk.c | 19 +-
monitor.c | 2 +
monitor.h | 2 +
qapi-schema.json | 238 +++++++++++++--
qemu-tool.c | 6 +
qerror.c | 12 +
qerror.h | 9 +
qmp-commands.hx | 72 ++++-
tests/Makefile | 2 +
tests/qemu-iotests/030 | 178 ++++++++++-
tests/qemu-iotests/039 | 661 +++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/group | 3 +-
tests/qemu-iotests/iotests.py | 19 +-
tests/test-hbitmap.c | 384 ++++++++++++++++++++++++
trace-events | 24 +-
39 files changed, 3946 insertions(+), 603 deletions(-)
create mode 100644 block/mirror.c
create mode 100644 blockjob.c
create mode 100644 blockjob.h
create mode 100644 hbitmap.c
create mode 100644 hbitmap.h
create mode 100755 tests/qemu-iotests/039
create mode 100644 tests/test-hbitmap.c
--
1.7.10.4