qemu-devel

Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter


From: Wen Congyang
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/10] Add colo-proxy based on netfilter
Date: Fri, 22 Jan 2016 13:56:57 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 01/22/2016 01:41 PM, Jason Wang wrote:
> 
> 
> On 01/22/2016 11:28 AM, Wen Congyang wrote:
>> On 01/22/2016 11:15 AM, Jason Wang wrote:
>>>
>>> On 01/20/2016 06:30 PM, Wen Congyang wrote:
>>>> On 01/20/2016 06:19 PM, Jason Wang wrote:
>>>>>>
>>>>>> On 01/20/2016 06:01 PM, Wen Congyang wrote:
>>>>>>>> On 01/20/2016 02:54 PM, Jason Wang wrote:
>>>>>>>>>> On 01/20/2016 11:29 AM, Zhang Chen wrote:
>>>>>>>>>>>>>> Sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Two main comments/suggestions:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - TCP analysis is missing in the current version; maybe you could point me
>>>>>>>>>>>>>> to a git tree (or another version of the RFC) for a better understanding
>>>>>>>>>>>>>> of the design. (Just a skeleton for TCP should be sufficient to discuss.)
>>>>>>>>>>>>>> - I prefer to make the code as reusable as possible, so it's better to
>>>>>>>>>>>>>> split/decouple the reusable parts from the code. A vague idea is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1) Decouple the packet comparing from the netfilter. You've achieved
>>>>>>>>>>>>>> this 99% since the work has been done in a thread. Just let the thread
>>>>>>>>>>>>>> poll sockets directly; then the comparing can be reused by other kinds
>>>>>>>>>>>>>> of dataplane.
>>>>>>>>>>>>>> 2) Implement the traffic mirror/redirector as a filter.
>>>>>>>>>>>>>> 3) Implement TCP seq rewriting as a filter.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Then, in the primary node, you need just a traffic mirror, which:
>>>>>>>>>>>>>> - mirrors ingress traffic to the secondary node
>>>>>>>>>>>>>> - mirrors egress traffic to the packet comparing thread
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And in the secondary node, you need two filters:
>>>>>>>>>>>>>> - A TCP seq rewriter which adjusts the TCP sequence number.
>>>>>>>>>>>>>> - A traffic redirector which redirects packets from a socket as ingress
>>>>>>>>>>>>>> traffic, and redirects egress traffic to the socket which could be
>>>>>>>>>>>>>> polled by the remote packet comparing thread.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> zhangchen
>>>>>>>>>>>> Hi, Jason.
>>>>>>>>>>>> We considered your suggestion to split/decouple
>>>>>>>>>>>> the reusable parts from the code.
>>>>>>>>>>>> Because filter plugins are traversed one by one in order,
>>>>>>>>>>>> we will split colo-proxy into three filters on each side.
>>>>>>>>>>>>
>>>>>>>>>>>> But in this plan, both primary and secondary have a socket
>>>>>>>>>>>> server, so startup is a problem.
>>>>>>>>>> I believe this issue could be solved by reusing socket chardev.
>>>>>>>>>>
>>>>>>>>>>>>  Primary qemu                                                       Secondary qemu
>>>>>>>>>>>> +----------------------------------------------------------+       +-----------------------------------------------------------+
>>>>>>>>>>>> | +-----------------------------------------------------+  |       |  +------------------------------------------------------+ |
>>>>>>>>>>>> | |                                                     |  |       |  |                                                      | |
>>>>>>>>>>>> | |                        guest                        |  |       |  |                        guest                         | |
>>>>>>>>>>>> | |                                                     |  |       |  |                                                      | |
>>>>>>>>>>>> | +-----------^--------------+--------------------------+  |       |  +---------------------+--------+-----------------------+ |
>>>>>>>>>>>> |             |              |                             |       |                        ^        |                         |
>>>>>>>>>>>> |             |              |                             |       |                        |        |                         |
>>>>>>>>>>>> |             +-------------------------------------------------+  |                        |        |                         |
>>>>>>>>>>>> |  netfilter  |              |                             |    |  |   netfilter            |        |                         |
>>>>>>>>>>>> | +-----------------------------------------------------+  |    |  |  +------------------------------------------------------+ |
>>>>>>>>>>>> | |           |              |     filter execute order |  |    |  |  |                     |        |  filter execute order | |
>>>>>>>>>>>> | |           |              |    +-------------------> |  |    |  |  |                     |        | +-------------------> | |
>>>>>>>>>>>> | |           |              |                          |  |    |  |  |                     |        |   TCP                 | |
>>>>>>>>>>>> | | +---------+-+     +------v-----+    +----+ +-----+  |  |    |  |  | +-----------+   +---+----+---v+rewriter+  +--------+ | |
>>>>>>>>>>>> | | |           |     |            |    |            |  |  |    |  |  | |           |   |        |             |  |        | | |
>>>>>>>>>>>> | | |  mirror   |     |  redirect  +---->  compare   |  |  |    +--------> mirror   +---> adjust |   adjust    +-->redirect| | |
>>>>>>>>>>>> | | |  client   |     |  server    |    |            |  |  |       |  | |  server   |   | ack    |   seq       |  |client  | | |
>>>>>>>>>>>> | | |           |     |            |    |            |  |  |       |  | |           |   |        |             |  |        | | |
>>>>>>>>>>>> | | +----^------+     +----^-------+    +-----+------+  |  |       |  | +-----------+   +--------+-------------+  +----+---+ | |
>>>>>>>>>>>> | |      |     tx          |      rx          |     rx  |  |       |  |            tx                        all       |  rx | |
>>>>>>>>>>>> | +-----------------------------------------------------+  |       |  +------------------------------------------------------+ |
>>>>>>>>>>>> |        |                 +-------------------------------------------------------------------------------------------+       |
>>>>>>>>>>>> |        |                                    |            |       |                                                           |
>>>>>>>>>>>> +----------------------------------------------------------+       +-----------------------------------------------------------+
>>>>>>>>>>>>          |                                    |
>>>>>>>>>>>>          |guest receive                       |guest send
>>>>>>>>>>>>          |                                    |
>>>>>>>>>>>> +--------+------------------------------------v------------+
>>>>>>>>>>>> |                                                          |
>>>>>>>>>>>> |                                                          |
>>>>>>>>>>>> |                         tap                              |                              NOTE: filter direction is rx/tx/all
>>>>>>>>>>>> |                                                          |                              rx: receive packets sent to the netdev
>>>>>>>>>>>> |                                                          |                              tx: receive packets sent by the netdev
>>>>>>>>>>>> +----------------------------------------------------------+
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>> I would still like to decouple the comparer from netfilter. It has two
>>>>>>>>>> obvious advantages:
>>>>>>>>>>
>>>>>>>>>> - it can be reused by other dataplanes (e.g. vhost)
>>>>>>>>>> - the secondary redirector could redirect rx to the comparer on the
>>>>>>>>>> primary node directly, which simplifies the design.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> guest recv packet route
>>>>>>>>>>>>
>>>>>>>>>>>> primary
>>>>>>>>>>>> tap --> mirror client filter
>>>>>>>>>>>> The mirror client sends the packet to the guest and, at the
>>>>>>>>>>>> same time, copies and forwards it to the secondary
>>>>>>>>>>>> mirror server.
>>>>>>>>>>>>
>>>>>>>>>>>> secondary
>>>>>>>>>>>> mirror server filter --> TCP rewriter
>>>>>>>>>>>> If the received packet is a TCP packet, we adjust the ack
>>>>>>>>>>>> and update the TCP checksum, then send it to the secondary
>>>>>>>>>>>> guest. Otherwise, we send it directly to the guest.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> guest send packet route
>>>>>>>>>>>>
>>>>>>>>>>>> primary
>>>>>>>>>>>> guest --> redirect server filter
>>>>>>>>>>>> The redirect server filter receives the primary guest packet
>>>>>>>>>>>> but does nothing; it just passes it to the next filter.
>>>>>>>>>>>>
>>>>>>>>>>>> redirect server filter --> compare filter
>>>>>>>>>>>> The compare filter receives the primary guest packet, then
>>>>>>>>>>>> waits for the secondary's redirected packet to compare against it.
>>>>>>>>>>>> If the packets are the same, send the primary packet and discard
>>>>>>>>>>>> the secondary packet; otherwise send the primary packet and do a
>>>>>>>>>>>> checkpoint.
>>>>>>>>>>>>
>>>>>>>>>>>> secondary
>>>>>>>>>>>> guest --> TCP rewriter filter
>>>>>>>>>>>> If the packet is a TCP packet, we adjust the seq
>>>>>>>>>>>> and update the TCP checksum, then send it to the
>>>>>>>>>>>> redirect client filter. Otherwise, we send it directly to the
>>>>>>>>>>>> redirect client filter.
>>>>>>>>>>>>
>>>>>>>>>>>> redirect client filter --> redirect server filter
>>>>>>>>>>>> forward packet to primary
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In the failover scenario (primary is down), the TCP rewriter will keep
>>>>>>>>>>>> servicing the TCP connections that were established after the last
>>>>>>>>>>>> checkpoint.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> How about this plan?
>>>>>>>>>> Sounds good.
>>>>>>>>>>
>>>>>>>>>> And there's indeed no need to distinguish client/server if we reuse the
>>>>>>>>>> socket chardev. E.g.:
>>>>>>>>>>
>>>>>>>>>> In primary node:
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>> -chardev socket,id=comparer0,host=ip_primary,port=X,server,nowait
>>>>>>>>>> -chardev socket,id=comparer1,host=ip_primary,port=Y,server,nowait
>>>>>>>>>> -chardev socket,id=mirrorer0,host=ip_primary,port=Z,server,nowait
>>>>>>>>>> -netdev tap,id=hn0
>>>>>>>>>> -traffic-mirrorer netdev=hn0,id=t0,indev=comparer0,outdev=mirrorer0
>>>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1
>>>>>>>> Why does the mirrorer have an indev?
>>>>>>
>>>>>> As I said in the previous mails. I would like to decouple packet
>>>>>> comparing from netfilter. You've already done most of this since the
>>>>>> comparing is done in an independent thread. So the indev here is to
>>>>>> mirror the packet sent by guest to the packet comparing thread.
>>>>>>
>>>>>>>> I think we can use traffic-redirector to do it.
>>>>>>>> The command line is:
>>>>>>>> -netdev tap,id=hn0
>>>>>>>> -object traffic-mirrorer,id=f0,netdev=hn0,queue=tx,outdev=mirrorer0
>>>>>>>> -object traffic-redirector,id=f1,netdev=hn0,queue=rx,outdev=comparer0
>>>>>>>> -colo-comparer primary_traffic=comparer0,secondary_traffic=comparer1,netdev=hn0
>>>>>>>> In the comparer thread, we can use qemu_net_queue_send_iov() to send
>>>>>>>> out the packet.
>>>>>>>>
>>>>>>>> Also, we can merge the socketdev comparer1 and mirrorer0.
>>>>>> It depends on whether or not packet comparing was done in a net filter
>>>>>> (which I prefer not).
>>>> I mean that packet comparing is done in a thread, not a net filter.
>>>> The flow of a packet sent from the guest:
>>>> 1. traffic-redirector: we redirect the packet to comparer0; the next
>>>>    filter will never see it.
>>>> 2. comparing thread: read it from socket chardev comparer0
>>>> 3. call qemu_net_queue_send_iov() to send it back to the netdev.
>>> OK, looks like I missed something.
>>>
>>> My suggestion tries its best to keep the packet comparing from being tied
>>> to a filter or netdev. But your suggestion still needs it to be coupled
>>> with a netdev. Are there any advantages to doing this (or is there a reason
>>> that the packet must be sent to the netdev after comparing)? If not, why not just
>> Yes, the packet must be sent to the netdev after comparing. If
>> the primary packet and the secondary packet are the same (contain the same
>> application-level data), we drop the secondary packet and send the
>> primary packet to the netdev. Otherwise, we sync the state.
> 
> And drop the primary packet here as well?

No, the primary packet must be sent back to the netdev, so the client can
receive the response.

For example:
1. The guest runs an ftp server.
2. We connect to the ftp server via the network.
3. Both the primary guest and the secondary guest receive this request.
4. Both the primary guest and the secondary guest ack it.
5. We compare these two ack packets in the comparing thread.
6. They are the same (the seqno is different, but that is not important; we can
   modify it in colo-rewriter). So we drop the secondary packet and send the
   primary packet back to the netdev.
7. The primary ack packet is sent to the ftp client via the netdev.

The ftp client only cares about the packets it receives. So if the packets from
the primary and secondary guests contain the same data, we can say they are in
the "same" state.
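
To make the decision in the comparing thread concrete, here is a minimal
Python sketch. It is only an illustration, not QEMU code: TcpPacket, Verdict,
same_for_client() and compare() are hypothetical names, and treating two
packets as "the same" when ports, flags and payload match while seq is ignored
is an assumption about what "the same data" means.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    RELEASE = "drop secondary packet, send primary packet to netdev"
    CHECKPOINT = "send primary packet to netdev, then sync state"


@dataclass(frozen=True)
class TcpPacket:
    """Hypothetical, simplified packet model (not a QEMU structure)."""
    src_port: int
    dst_port: int
    seq: int
    ack: int
    flags: str
    payload: bytes


def same_for_client(primary: TcpPacket, secondary: TcpPacket) -> bool:
    # Assumption: seq is ignored because it can be adjusted by the
    # rewriter; ports, flags and payload must match.
    return (
        (primary.src_port, primary.dst_port, primary.flags, primary.payload)
        == (secondary.src_port, secondary.dst_port, secondary.flags,
            secondary.payload)
    )


def compare(primary: TcpPacket,
            secondary: TcpPacket) -> Tuple[Verdict, TcpPacket]:
    # The primary packet is returned in both cases: it always goes back
    # to the netdev so that the client receives a response.
    if same_for_client(primary, secondary):
        return Verdict.RELEASE, primary
    return Verdict.CHECKPOINT, primary


if __name__ == "__main__":
    p = TcpPacket(21, 40000, seq=1000, ack=501, flags="ACK", payload=b"")
    s = TcpPacket(21, 40000, seq=9000, ack=501, flags="ACK", payload=b"")
    verdict, to_netdev = compare(p, s)
    print(verdict.name, to_netdev.seq)
```

In both branches the packet handed back to the netdev is the primary one; only
the follow-up action (drop the secondary packet, or sync the state) differs.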

Thanks
Wen Congyang

> 
>>
>>> mirror (duplicate the packet and forward it to a chardev, and pass the
>>> original packet to the next filter or netdev)? And doing
>> We cannot send the packet to the netdev before comparing. We need to keep
>> the connection after failover.
>>
>> Thanks
>> Wen Congyang
>>
>>> qemu_net_queue_send_iov() to a netdev in another thread may need some
>>> synchronization with iothread.
>>>
>>>> Thanks
>>>> Wen Congyang
>>>>
>>>
>>>
>>> .
>>>
>>
>>
> 
> 
> 
> .
> 





