From: Wen Congyang
Subject: Re: [Qemu-devel] [PATCH v13 00/10] Block replication for continuous checkpoints
Date: Fri, 29 Jan 2016 18:27:57 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 01/29/2016 06:07 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (address@hidden) wrote:
>> On 01/27/2016 07:03 PM, Dr. David Alan Gilbert wrote:
>>> Hi,
>>>   I've got a block error if I kill the secondary.
>>>
>>> Start both primary & secondary
>>> kill -9 secondary qemu
>>> x_colo_lost_heartbeat on primary
>>>
>>> The guest sees a block error and the ext4 root switches to read-only.
>>>
>>> I gdb'd the primary with a breakpoint on quorum_report_bad; see
>>> backtrace below.
>>> (This is based on colo-v2.4-periodic-mode of the framework
>>> code with the block and network proxy merged in; so it could be my
>>> merging but I don't think so ?)
>>>
>>>
>>> (gdb) where
>>> #0  quorum_report_bad (node_name=0x7f2946a0892c "node0", ret=-5, 
>>> acb=0x7f2946cb3910, acb=0x7f2946cb3910)
>>>     at /root/colo/jan-2016/qemu/block/quorum.c:222
>>> #1  0x00007f2943b23058 in quorum_aio_cb (opaque=<optimized out>, 
>>> ret=<optimized out>)
>>>     at /root/colo/jan-2016/qemu/block/quorum.c:315
>>> #2  0x00007f2943b311be in bdrv_co_complete (acb=0x7f2946cb3f60) at 
>>> /root/colo/jan-2016/qemu/block/io.c:2122
>>> #3  0x00007f2943ae777d in aio_bh_call (bh=<optimized out>) at 
>>> /root/colo/jan-2016/qemu/async.c:64
>>> #4  aio_bh_poll (address@hidden) at /root/colo/jan-2016/qemu/async.c:92
>>> #5  0x00007f2943af5090 in aio_dispatch (ctx=0x7f2945b771d0) at 
>>> /root/colo/jan-2016/qemu/aio-posix.c:305
>>> #6  0x00007f2943ae756e in aio_ctx_dispatch (source=<optimized out>, 
>>> callback=<optimized out>, 
>>>     user_data=<optimized out>) at /root/colo/jan-2016/qemu/async.c:231
>>> #7  0x00007f293b84a79a in g_main_context_dispatch () from 
>>> /lib64/libglib-2.0.so.0
>>> #8  0x00007f2943af3a00 in glib_pollfds_poll () at 
>>> /root/colo/jan-2016/qemu/main-loop.c:211
>>> #9  os_host_main_loop_wait (timeout=<optimized out>) at 
>>> /root/colo/jan-2016/qemu/main-loop.c:256
>>> #10 main_loop_wait (nonblocking=<optimized out>) at 
>>> /root/colo/jan-2016/qemu/main-loop.c:504
>>> #11 0x00007f29438529ee in main_loop () at /root/colo/jan-2016/qemu/vl.c:1945
>>> #12 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) 
>>> at /root/colo/jan-2016/qemu/vl.c:4707
>>>
>>> (gdb) p s->num_children
>>> $1 = 2
>>> (gdb) p acb->success_count
>>> $2 = 0
>>> (gdb) p acb->is_read
>>> $5 = false
>>
>> Sorry for the late reply.
> 
> No problem.
> 
>> What is the value of acb->count?
> 
> (gdb) p acb->count
> $1 = 1

Note that the count is 1, not 2: the write to children.0 is still in flight.
If the write to children.0 succeeds, the guest never sees this error.
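
A minimal sketch in C of the counting described above. This is a simplified
model, not the code in block/quorum.c: WriteRequest, WriteStatus,
write_child_done() and the error_events field are hypothetical names used only
for illustration. It assumes the request is decided only once every child has
completed, and fails only if fewer than vote-threshold children succeeded.

```c
#include <assert.h>

/* Hypothetical, simplified per-request state (not QEMU's actual struct). */
typedef struct WriteRequest {
    int num_children;    /* children the write was sent to */
    int vote_threshold;  /* successes needed for the request to succeed */
    int count;           /* children that have completed so far */
    int success_count;   /* children that completed without error */
    int error_events;    /* per-child error events reported so far */
} WriteRequest;

typedef enum {
    WRITE_PENDING,  /* at least one child is still in flight */
    WRITE_OK,       /* the guest sees a successful write */
    WRITE_FAILED    /* the guest sees an I/O error */
} WriteStatus;

/* Record one child's completion; ret is 0 on success, negative on error. */
WriteStatus write_child_done(WriteRequest *req, int ret)
{
    req->count++;
    if (ret == 0) {
        req->success_count++;
    } else {
        /* The error event is reported as soon as this child fails. */
        req->error_events++;
    }
    assert(req->count <= req->num_children);

    if (req->count < req->num_children) {
        /* Nothing is decided yet; the guest has seen no result. */
        return WRITE_PENDING;
    }
    return req->success_count >= req->vote_threshold ? WRITE_OK
                                                     : WRITE_FAILED;
}
```

With num_children = 2 and vote_threshold = 1, a failure on children.1 alone
leaves the request pending with count == 1 and success_count == 0, which
matches the values printed in gdb; the request only fails if children.0 fails
as well.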

> 
>> If the secondary host is down, you should remove quorum's children.1.
>> Otherwise, you will get an I/O error event.
> 
> Is that safe?  If the secondary fails, do you always have time to issue the 
> command to
> remove the children.1  before the guest sees the error?

We write to both children and expect the write to children.0 to succeed.
If it does, the guest never sees the error; you only get the I/O error event.

> 
> Anyway, I tried removing children.1 but it segfaults now, I guess the 
> replication is unhappy:
> 
> (qemu) x_block_change colo-disk0 -d children.1
> (qemu) x_colo_lost_heartbeat 

Hmm, you should not remove the child before failover. I will check how to
prevent that in the code.

> 
> 12973 Segmentation fault      (core dumped) 
> ./try/x86_64-softmmu/qemu-system-x86_64 -enable-kvm $console_param -S -boot c 
> -m 4080 -smp 4 -machine pc-i440fx-2.5,accel=kvm -name debug-threads=on -trace 
> events=trace-file -device virtio-rng-pci $block_param $net_param
> 
> #0  0x00007f0a398a864c in bdrv_stop_replication (bs=0x7f0a3b0a8430, 
> failover=true, errp=0x7fff6a5c3420)
>     at /root/colo/jan-2016/qemu/block.c:4426
> 
> (gdb) p drv
> $1 = (BlockDriver *) 0x5d2a
> 
>   it looks like the whole of bs is bogus.
> 
> #1  0x00007f0a398d87f6 in quorum_stop_replication (bs=<optimized out>, 
> failover=<optimized out>, 
>     errp=<optimized out>) at /root/colo/jan-2016/qemu/block/quorum.c:1213
> 
> (gdb) p s->replication_index
> $3 = 1
> 
> I guess quorum_del_child needs to stop replication before it removes the 
> child?

Yes, but in the newest version quorum doesn't know about block replication,
so I think we should take a reference to the bs when starting block
replication.
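
A minimal sketch in C of that idea, under stated assumptions. Node, node_new(),
node_ref(), node_unref(), replication_start() and replication_stop() are
hypothetical names for a toy model; none of this is QEMU code, and the real
change would presumably use QEMU's own bdrv_ref()/bdrv_unref() instead. The
point is only that replication holds its own reference, so removing the child
from its parent cannot free the node while replication still points at it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block node; not QEMU's BlockDriverState. */
typedef struct Node {
    int refcnt;                /* number of owners */
    bool replication_running;
    int *freed_flag;           /* set to 1 on destruction, for demonstration */
} Node;

/* Create a node owned by its creator (one reference). */
Node *node_new(int *freed_flag)
{
    Node *n = calloc(1, sizeof(*n));
    if (n == NULL) {
        abort();
    }
    n->refcnt = 1;
    n->freed_flag = freed_flag;
    return n;
}

void node_ref(Node *n)
{
    n->refcnt++;
}

/* Drop one reference; the node is destroyed when the last one goes away. */
void node_unref(Node *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        *n->freed_flag = 1;
        free(n);
    }
}

/* Starting replication takes a reference of its own. */
void replication_start(Node *n)
{
    node_ref(n);
    n->replication_running = true;
}

/* Stopping replication gives that reference back. */
void replication_stop(Node *n)
{
    assert(n->replication_running);
    n->replication_running = false;
    node_unref(n);
}
```

In this model, if the parent drops its reference (the child is deleted) while
replication is running, the node stays valid until replication_stop() is
called, so the stop path never dereferences freed memory.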

Thanks
Wen Congyang

> (although it would have to be careful not to block on the dead nbd).
> 
> #2  0x00007f0a398a8901 in bdrv_stop_replication_all (address@hidden, 
> address@hidden)
>     at /root/colo/jan-2016/qemu/block.c:4504
> #3  0x00007f0a3984b0af in primary_vm_do_failover () at 
> /root/colo/jan-2016/qemu/migration/colo.c:144
> #4  colo_do_failover (s=<optimized out>) at 
> /root/colo/jan-2016/qemu/migration/colo.c:162
> #5  0x00007f0a3989d7fd in aio_bh_call (bh=<optimized out>) at 
> /root/colo/jan-2016/qemu/async.c:64
> #6  aio_bh_poll (address@hidden) at /root/colo/jan-2016/qemu/async.c:92
> #7  0x00007f0a398ab110 in aio_dispatch (ctx=0x7f0a3a6c21d0) at 
> /root/colo/jan-2016/qemu/aio-posix.c:305
> #8  0x00007f0a3989d5ee in aio_ctx_dispatch (source=<optimized out>, 
> callback=<optimized out>, 
>     user_data=<optimized out>) at /root/colo/jan-2016/qemu/async.c:231
> #9  0x00007f0a3160079a in g_main_context_dispatch () from 
> /lib64/libglib-2.0.so.0
> #10 0x00007f0a398a9a80 in glib_pollfds_poll () at 
> /root/colo/jan-2016/qemu/main-loop.c:211
> #11 os_host_main_loop_wait (timeout=<optimized out>) at 
> /root/colo/jan-2016/qemu/main-loop.c:256
> #12 main_loop_wait (nonblocking=<optimized out>) at 
> /root/colo/jan-2016/qemu/main-loop.c:504
> #13 0x00007f0a396089ee in main_loop () at /root/colo/jan-2016/qemu/vl.c:1945
> #14 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) 
> at /root/colo/jan-2016/qemu/vl.c:4707
> 
> Dave
> 
>> Thanks
>> Wen Congyang
>>
>>>
>>> (qemu) info block
>>> colo-disk0 (#block080): json:{"children": [{"driver": "raw", "file": 
>>> {"driver": "file", "filename": "/root/colo/bugzilla.raw"}}, {"driver": 
>>> "replication", "mode": "primary", "file": {"port": "8889", "host": 
>>> "ibpair", "driver": "nbd", "export": "colo-disk0"}}], "driver": "quorum", 
>>> "blkverify": false, "rewrite-corrupted": false, "vote-threshold": 1} 
>>> (quorum)
>>>     Cache mode:       writeback, direct
>>>
>>> Dave
>>>
>>> * Changlong Xie (address@hidden) wrote:
>>>> Block replication is a very important feature which is used for
>>>> continuous checkpoints (for example: COLO).
>>>>
>>>> You can get the detailed information about block replication from here:
>>>> http://wiki.qemu.org/Features/BlockReplication
>>>>
>>>> Usage:
>>>> Please refer to docs/block-replication.txt
>>>>
>>>> This patch series is based on the following patch series:
>>>> 1. http://lists.nongnu.org/archive/html/qemu-devel/2015-12/msg04570.html
>>>>
>>>> You can get the patch here:
>>>> https://github.com/Pating/qemu/tree/changlox/block-replication-v13
>>>>
>>>> You can get the patch with framework here:
>>>> https://github.com/Pating/qemu/tree/changlox/colo_framework_v12
>>>>
>>>> TODO:
>>>> 1. Continuous block replication. It will be started after basic functions
>>>>    are accepted.
>>>>
>>>> Change Log:
>>>> V13:
>>>> 1. Rebase to the newest codes
>>>> 2. Remove redundant macros and semicolons in replication.c
>>>> 3. Fix typos in block-replication.txt
>>>> V12:
>>>> 1. Rebase to the newest codes
>>>> 2. Use backing reference to replace 'allow-write-backing-file'
>>>> V11:
>>>> 1. Reopen the backing file when starting block replication if it is not
>>>>    opened in R/W mode
>>>> 2. Unblock BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET
>>>>    when opening backing file
>>>> 3. Block the top BDS so there is only one block job for the top BDS and
>>>>    its backing chain.
>>>> V10:
>>>> 1. Use blockdev-remove-medium and blockdev-insert-medium to replace backing
>>>>    reference.
>>>> 2. Address the comments from Eric Blake
>>>> V9:
>>>> 1. Update the error messages
>>>> 2. Rebase to the newest qemu
>>>> 3. Split child add/delete support. These patches are sent in another 
>>>> patchset.
>>>> V8:
>>>> 1. Address Alberto Garcia's comments
>>>> V7:
>>>> 1. Implement adding/removing quorum child. Remove the option non-connect.
>>>> 2. Simplify the backing reference option according to Stefan Hajnoczi's
>>>>    suggestion
>>>> V6:
>>>> 1. Rebase to the newest qemu.
>>>> V5:
>>>> 1. Address the comments from Gong Lei
>>>> 2. Speed the failover up. The secondary vm can take over very quickly even
>>>>    if there are a lot of I/O requests.
>>>> V4:
>>>> 1. Introduce a new driver, replication, to avoid touching nbd and qcow2.
>>>> V3:
>>>> 1. Use error_setg() instead of error_set()
>>>> 2. Add a new block job API
>>>> 3. Active disk, hidden disk and nbd target uses the same AioContext
>>>> 4. Add a testcase to test new hbitmap API
>>>> V2:
>>>> 1. Redesign the secondary qemu(use image-fleecing)
>>>> 2. Use Error objects to return error message
>>>> 3. Address the comments from Max Reitz and Eric Blake
>>>>
>>>> Wen Congyang (10):
>>>>   unblock backup operations in backing file
>>>>   Store parent BDS in BdrvChild
>>>>   Backup: clear all bitmap when doing block checkpoint
>>>>   Allow creating backup jobs when opening BDS
>>>>   docs: block replication's description
>>>>   Add new block driver interfaces to control block replication
>>>>   quorum: implement block driver interfaces for block replication
>>>>   Implement new driver for block replication
>>>>   support replication driver in blockdev-add
>>>>   Add a new API to start/stop replication, do checkpoint to all BDSes
>>>>
>>>>  block.c                    | 145 ++++++++++++
>>>>  block/Makefile.objs        |   3 +-
>>>>  block/backup.c             |  14 ++
>>>>  block/quorum.c             |  78 +++++++
>>>>  block/replication.c        | 545 +++++++++++++++++++++++++++++++++++++++++++++
>>>>  blockjob.c                 |  11 +
>>>>  docs/block-replication.txt | 227 +++++++++++++++++++
>>>>  include/block/block.h      |   9 +
>>>>  include/block/block_int.h  |  15 ++
>>>>  include/block/blockjob.h   |  12 +
>>>>  qapi/block-core.json       |  33 ++-
>>>>  11 files changed, 1089 insertions(+), 3 deletions(-)
>>>>  create mode 100644 block/replication.c
>>>>  create mode 100644 docs/block-replication.txt
>>>>
>>>> -- 
>>>> 1.9.3
>>>>
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>>>
>>>
>>> .
>>>
>>
>>
>>
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 
> 
> .
> 





