Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
From: Wen Congyang
Subject: Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
Date: Mon, 25 Jan 2016 09:32:51 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 01/23/2016 03:35 AM, Dr. David Alan Gilbert wrote:
> Hi,
> I've been looking at what's needed to add a new secondary after
> a primary failed; from the block side it doesn't look as hard
> as I'd expected, perhaps you can tell me if I'm missing something!
>
> The normal primary setup is:
>
> quorum
>   Real disk
>   nbd client

quorum
  real disk
  replication
    nbd client

>
> The normal secondary setup is:
> replication
>   active-disk
>     hidden-disk
>       Real-disk

IIRC, we can do it like this:
quorum
  replication
    active-disk
      hidden-disk
        real-disk

>
> With a couple of minor code hacks; I changed the secondary to be:
>
> quorum
>   replication
>     active-disk
>       hidden-disk
>         Real-disk
>   dummy-disk

After failover:
quorum
  replication (old, mode is secondary)
    active-disk
      hidden-disk*
        real-disk*
  replication (new, mode is primary)
    nbd-client

In the newest version, we do an active commit of active-disk into real-disk.
So it will be:
quorum
  replication (old, mode is secondary)
    active-disk (it is the real disk now)
  replication (new, mode is primary)
    nbd-client
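
To make the comparison with the normal primary layout concrete, here is a minimal sketch that models both graphs as plain trees and compares their leaves. It is only an illustration, not QEMU code: the `Node` class, `render` and `leaves` are hypothetical helpers, and the node labels are copied from the diagrams in this mail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One block node; children are listed in diagram order (hypothetical model)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, two spaces per level."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def leaves(node: Node) -> List[str]:
    """Return the names of the nodes that have no children."""
    if not node.children:
        return [node.name]
    found: List[str] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Normal primary, as described above.
normal_primary = Node("quorum", [
    Node("real disk"),
    Node("replication", [Node("nbd client")]),
])

# Old secondary after failover and the active commit.
after_failover = Node("quorum", [
    Node("replication (old, mode is secondary)", [Node("active-disk (real disk now)")]),
    Node("replication (new, mode is primary)", [Node("nbd-client")]),
])

if __name__ == "__main__":
    print("\n".join(render(after_failover)))
    print(leaves(normal_primary), leaves(after_failover))
```

In this model both layouts end in one local disk and one NBD client under the quorum; the only difference is the extra old replication node sitting above the local disk.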
>
> and then after the primary fails, I start a new secondary
> on another host and then on the old secondary do:
>
> nbd_server_stop
> stop
> x_block_change top-quorum -d children.0 # deletes use of real disk, leaves dummy
> drive_del active-disk0
> x_block_change top-quorum -a node-real-disk
> x_block_change top-quorum -d children.1 # Seems to have deleted the dummy?!, the disk is now child 0
> drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
> x_block_change top-quorum -a nbd-client
> c
> migrate_set_capability x-colo on
> migrate -d -b tcp:ibpair:8888
>
> and I think that means what was the secondary, has the same disk
> structure as a normal primary.
> That's not quite happy yet, and I've not figured out why - but the
> order/structure of the block devices looks right?
>
> Notes:
> a) The dummy serves two purposes, 1) it works around the segfault
> I reported in the other mail, 2) when I delete the real disk in the
> first x_block_change it means the quorum still has 1 disk so doesn't
> get upset.
I don't understand purpose 2.
> b) I had to remove the restriction in quorum_start_replication
> on which mode it would run in.
IIRC, this check will be removed.
> c) I'm not really sure everything knows it's in secondary mode yet, and
> I'm not convinced whether the replication is doing the right thing.
> d) The migrate -d -b eventually fails on the destination, not worked out
> why yet.
Can you give me the error message?
> e) Adding/deleting children on quorum is hard having to use the
> children.0/1 notation when you've added children using node names - it's
> worrying which number is which; is there a way to give them a name?
No. I think we can improve 'info block' output.
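
One possible reading of the "Seems to have deleted the dummy?!" observation can be sketched with a toy model. This assumes, without confirmation anywhere in this thread, that each quorum child keeps the children.N name it was given when attached and that names are not renumbered after a deletion. `ToyQuorum` and the node name `replication-secondary` are hypothetical and are not QEMU code.

```python
class ToyQuorum:
    """Toy model: child names are assigned once and never reused (an assumption)."""

    def __init__(self):
        self._next_index = 0
        self.children = {}  # child name -> node name, kept in attach order

    def add(self, node_name):
        """Attach a node and return the children.N name it receives."""
        child_name = f"children.{self._next_index}"
        self._next_index += 1
        self.children[child_name] = node_name
        return child_name

    def delete(self, child_name):
        """Detach the child with this name and return its node name."""
        return self.children.pop(child_name)

    def positions(self):
        """Node names in their current order (position 0 first)."""
        return list(self.children.values())


quorum = ToyQuorum()
quorum.add("replication-secondary")        # children.0 (hypothetical node name)
quorum.add("dummy-disk")                   # children.1

quorum.delete("children.0")                # leaves only the dummy
quorum.add("node-real-disk")               # would be named children.2 here
removed = quorum.delete("children.1")      # removes the dummy, not the real disk
quorum.add("nbd-client")                   # would be named children.3 here

if __name__ == "__main__":
    print(removed, quorum.positions())
```

Under that assumption, deleting children.1 removes the dummy and leaves the real disk first in the list, which matches what the monitor session above reported; if the names were renumbered instead, the same command would have removed the real disk.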
> f) I've not thought about the colo-proxy that much yet - I guess that
> existing connections need to keep their sequence number offset but
> new connections made by what is now the primary don't need to do anything
> special.
Hailiang or Zhijian can answer this question.
Thanks
Wen Congyang
>
> Dave
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK