Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter


From: Jason Wang
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
Date: Mon, 18 Jan 2016 17:29:51 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1


On 01/18/2016 03:05 PM, Zhang Chen wrote:
>
>
> On 01/06/2016 01:16 PM, Jason Wang wrote:
>>
>> On 01/04/2016 07:17 PM, Zhang Chen wrote:
>>>
>>> On 01/04/2016 05:46 PM, Jason Wang wrote:
>>>> On 01/04/2016 04:16 PM, Zhang Chen wrote:
>>>>> On 01/04/2016 01:37 PM, Jason Wang wrote:
>>>>>> On 12/31/2015 04:40 PM, Zhang Chen wrote:
>>>>>>> On 12/31/2015 10:36 AM, Jason Wang wrote:
>>>>>>>> On 12/22/2015 06:42 PM, Zhang Chen wrote:
>>>>>>>>> From: zhangchen <address@hidden>
>>>>>>>>>
>>>>>>>>> Hi,all
>>>>>>>>>
>>>>>>>>> This patch adds a colo-proxy object. COLO-Proxy is a part of COLO,
>>>>>>>>> based on the qemu netfilter, and is a plugin for it. Its function
>>>>>>>>> is to keep the Secondary VM connected normally to the Primary VM
>>>>>>>>> and to compare packets sent by the PVM with those sent by the SVM.
>>>>>>>>> If the packets differ, it notifies COLO to do a checkpoint and
>>>>>>>>> sends all the primary packets that have been queued.
>>>>>>>> Thanks for the work. I don't object to this method, but I'm still
>>>>>>>> not convinced that qemu is the best place for this.
>>>>>>>>
>>>>>>>> As was raised in the past discussion, it's almost impossible to
>>>>>>>> cooperate with vhost backends. If we want this to be used in a
>>>>>>>> production environment, we need to think of a solution for vhost.
>>>>>>>> There's no such worry if we decouple this from qemu.
>>>>>>>>
>>>>>>>>> You can also get the series from:
>>>>>>>>>
>>>>>>>>> https://github.com/zhangckid/qemu/tree/colo-v2.2-periodic-mode-with-colo-proxyV2
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Usage:
>>>>>>>>>
>>>>>>>>> primary:
>>>>>>>>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>>>>>>>>> -object colo-proxy,id=f0,netdev=bn0,queue=all,mode=primary,addr=host:port
>>>>>>>>>
>>>>>>>>> secondary:
>>>>>>>>> -netdev tap,id=bn0 -device e1000,netdev=bn0
>>>>>>>>> -object colo-proxy,id=f0,netdev=bn0,queue=all,mode=secondary,addr=host:port
>>>>>>>>>
>>>>>>>> I had a quick glance at how secondary mode works. What it does is
>>>>>>>> just forward packets between a nic and a socket; the qemu socket
>>>>>>>> backend does exactly the same job. You could even use a socket on
>>>>>>>> the primary node and let the packet compare module talk to both
>>>>>>>> the primary and the secondary node.
>>>>>>> If we use the qemu socket backend, the same netdev will be used by
>>>>>>> both the qemu socket and the qemu netfilter, which goes against the
>>>>>>> qemu net design. Also, when colo does failover, the secondary has no
>>>>>>> backend to use. That's the real problem.
>>>>>> Then maybe it's time to implement changing the netdev of a nic. The
>>>>>> point here is that what secondary mode does is in fact a netdev
>>>>>> backend instead of a filter ...
>>>>> Currently, you are right. In the colo-proxy V2 code, I just compare IP
>>>>> packets to decide whether to do a checkpoint.
>>>>> But in colo-proxy V3 I will compare tcp, icmp and udp packets to decide
>>>>> it, because that can reduce the frequency of checkpoints and improve
>>>>> performance. To keep tcp connections working, the colo secondary needs
>>>>> to record the primary guest's init seq and adjust the secondary guest's
>>>>> ack. If colo does failover, the secondary also needs to do this for the
>>>>> old tcp connections. The qemu socket can't do this job.
>>>> So a question here: is it a must to do things (e.g. the TCP analysis
>>>> stuff) at the secondary? It looks like we could do this at the primary
>>>> node. And I saw you're doing the packet comparing in the primary node;
>>>> are there any advantages to doing this in the primary instead of the
>>>> secondary?
>>> We think this must be done in the secondary, because if colo does
>>> failover, the secondary must continue doing the TCP analysis for the
>>> existing tcp connections (otherwise those connections would break at
>>> that point). By then the primary is already down or disconnected from
>>> the secondary, so we can't have the primary do the TCP analysis; that
>>> would not ensure the FT function.
>>>
>>> Thanks
>>> zhangchen
>> Makes sense.
>>
>> Thanks
>
> Hi Jason,
> No news for a week.
> Could you give me some comments on the code?
> Let's make colo-proxy work well.

Sure.

Two main comments/suggestions:

- TCP analysis is missing from the current version. Could you point me to
a git tree (or another version of the RFC) so I can better understand the
design? (Just a skeleton for TCP should be sufficient for discussion.)
- I'd prefer to make the code as reusable as possible, so it's better to
split/decouple the reusable parts from the rest of the code. A rough idea:

1) Decouple the packet comparing from the netfilter. You've achieved
this 99%, since the work is already done in a thread. Just let the thread
poll sockets directly; then the comparing can be reused by other kinds
of dataplane.
2) Implement a traffic mirror/redirector as a filter.
3) Implement TCP seq rewriting as a filter.

Then, on the primary node, you need just a traffic mirror, which:
- mirrors ingress traffic to the secondary node
- mirrors egress traffic to the packet comparing thread
And on the secondary node, you need two filters:
- A TCP seq rewriter, which adjusts the tcp sequence number.
- A traffic redirector, which redirects packets from a socket as ingress
traffic, and redirects egress traffic to the socket, which could be
polled by the remote packet comparing thread.
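
As a starting point for the TCP discussion, the seq rewriting could be
as small as the following Python sketch. It assumes, based on the
description earlier in this thread (record the primary guest's init seq,
adjust the secondary guest's ack), that the rewriter keeps one offset per
connection; the function names are hypothetical, and a real filter would
also have to fix up the TCP checksum, which is left out here.

```python
MASK32 = 0xFFFFFFFF  # TCP sequence numbers are 32-bit and wrap around


def seq_offset(primary_isn, secondary_isn):
    """Offset between the two guests' initial sequence numbers."""
    return (primary_isn - secondary_isn) & MASK32


def rewrite_egress_seq(seq, offset):
    """Map a seq sent by the secondary guest into the primary's seq space."""
    return (seq + offset) & MASK32


def rewrite_ingress_ack(ack, offset):
    """Map an ack meant for the primary guest into the secondary's seq space."""
    return (ack - offset) & MASK32
```

The offset is computed once per connection when both initial sequence
numbers are known, and the same two helpers keep working after failover
because they do not depend on the primary being alive.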
 
Thoughts?

Thanks

>
> Thanks
> zhangchen 



