Re: [Qemu-devel] [POC]colo-proxy in qemu


From: Jason Wang
Subject: Re: [Qemu-devel] [POC]colo-proxy in qemu
Date: Wed, 11 Nov 2015 11:04:42 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0


On 11/10/2015 05:35 PM, Tkid wrote:
>
>
> On 11/10/2015 03:35 PM, Jason Wang wrote:
>> On 11/10/2015 01:26 PM, Tkid wrote:
>>> Hi all,
>>>
>>> We are planning to reimplement the COLO proxy in userspace (here, in
>>> QEMU) to cache and compare network packets. This module is one of the
>>> important components of the COLO project and is still at an early
>>> stage, so any comments and feedback are warmly welcomed. Thanks in
>>> advance.
>>>
>>> ## Background
>>> The COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for
>>> Non-stop Service) project is a high availability solution. Both the
>>> Primary VM (PVM) and the Secondary VM (SVM) run in parallel. They
>>> receive the same requests from the client and generate responses in
>>> parallel too. If the response packets from the PVM and SVM are
>>> identical, they are released immediately. Otherwise, a VM checkpoint
>>> (on demand) is conducted.
>>> Paper:
>>> http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
>>> COLO on Xen:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>> COLO on Qemu/KVM:
>>> http://wiki.qemu.org/Features/COLO
>>>
>>> Because we need to capture the response packets from the PVM and SVM
>>> and determine whether they are identical, we introduce a new module
>>> to QEMU networking called colo-proxy.
>>>
>>> This document describes the design of the colo-proxy module.
>>>
>>> ## Glossary
>>>    PVM - Primary VM, which provides services to clients.
>>>    SVM - Secondary VM, a hot standby and replica of the PVM.
>>>    PN - Primary Node, the host on which the PVM runs.
>>>    SN - Secondary Node, the host on which the SVM runs.
>>>
>>> ## Our Idea ##
>>>
>>> COLO-Proxy
>>> COLO-Proxy is a part of COLO. It is based on the QEMU net filter and
>>> is implemented as a plugin for it. Its job is to keep the SVM's
>>> connections working normally alongside the PVM's and to compare the
>>> PVM's packets with the SVM's packets. If they differ, it notifies
>>> COLO to do a checkpoint.
>>>
>>> == Workflow ==
>>>
>>> +--+                                      +--+
>>> |PN|                                      |SN|
>>> +-----------------------+                 +-----------------------+
>>> | +-------------------+ |                 | +-------------------+ |
>>> | |                   | |                 | |                   | |
>>> | |        PVM        | |                 | |        SVM        | |
>>> | |                   | |                 | |                   | |
>>> | +--+-^--------------+ |                 | +-------------^----++ |
>>> |    | |                |                 |               |    |  |
>>> |    | | +------------+ |                 | +-----------+ |    |  |
>>> |    | | |    COLO    | |    (socket)     | |    COLO   | |    |  |
>>> |    | | | CheckPoint +---------------------> CheckPoint| |    |  |
>>> |    | | |            | |      (6)        | |           | |    |  |
>>> |    | | +-----^------+ |                 | +-----------+ |    |  |
>>> |    | |   (5) |        |                 |               |    |  |
>>> |    | |       |        |                 |               |    |  |
>>> | +--v-+--------------+ | Forward(socket) | +-------------+----v+ |
>>> | |COLO Proxy  |      +-------+(1)+--------->seq&ack adjust(2)| | |
>>> | |      +-----+------+ |                 | +-----------------+ | |
>>> | |      | Compare(4) <-------+(3)+---------+     COLO Proxy    | |
>>> | +-------------------+ | Forward(socket) | +-------------------+ |
>>> ++Qemu+-----------------+                 ++Qemu+-----------------+
>>>             | ^
>>>             | |
>>>             | |
>>>    +--------v-+--------+
>>>    |                   |
>>>    |      Client       |
>>>    |                   |
>>>    +-------------------+
>>>
>>>
>>> (1) When the PN receives client packets, the PN COLO-Proxy copies
>>> them and forwards the copies to the SN COLO-Proxy.
>>> (2) The SN COLO-Proxy records the initial seq of the PVM's packets,
>>> adjusts the client's ack, and sends the adjusted packets to the SVM.
>>> (3) The SN QEMU COLO-Proxy receives the SVM's packets and forwards
>>> them to the PN QEMU COLO-Proxy.
>>> (4) The PN QEMU COLO-Proxy enqueues the SVM's packets and the PVM's
>>> packets, then compares the PVM's packet data with the SVM's packet
>>> data. If the packets differ, the compare module notifies the COLO
>>> CheckPoint module to do a checkpoint, then sends the PVM's packets to
>>> the client and drops the SVM's packets. Otherwise, it just sends the
>>> PVM's packets to the client and drops the SVM's packets.
>>> (5) Notify the COLO-Checkpoint module that a checkpoint is needed.
>>> (6) Do the COLO checkpoint.
>>>
>>> ### QEMU space TCP/IP stack(Based on SLIRP) ###
>>> We need a QEMU-space TCP/IP stack to help us analyze packets. After
>>> looking into QEMU, we found that SLIRP
>>>
>>> http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
>>>
>>> is a good choice for us. SLIRP provides a full TCP/IP stack within
>>> QEMU; it can help us handle the packets written to/read from the
>>> backend (tap) device, which look just like link layer (L2) packets.
>>>
>>> ### Packet enqueue and compare ###
>>> Together with the QEMU-space TCP/IP stack, we enqueue all packets
>>> sent by the PVM and SVM on the primary QEMU, and then compare the
>>> packet payload for each connection.
>>>
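
To make the enqueue-and-compare step concrete, here is a rough sketch of
how I read it, in Python rather than QEMU C. Everything in it (the
PacketComparer class, the "primary"/"secondary" side labels, the
connection key) is made up for illustration and is not from the
proposal. It also compares packet by packet, which is a simplification,
since the two VMs may segment the same TCP stream differently.

```python
from collections import defaultdict, deque


class PacketComparer:
    """Hypothetical per-connection compare queues; not QEMU code."""

    def __init__(self):
        # One FIFO of payloads per connection, for each side.
        self.pvm = defaultdict(deque)
        self.svm = defaultdict(deque)

    def enqueue(self, side, conn, payload):
        """Queue a payload and compare whatever pairs are now available.

        Returns (packets_to_release, checkpoint_needed).
        """
        queue = self.pvm if side == "primary" else self.svm
        queue[conn].append(payload)
        return self._compare(conn)

    def _compare(self, conn):
        released = []
        checkpoint_needed = False
        pvm_queue, svm_queue = self.pvm[conn], self.svm[conn]
        # Compare only while both sides have a packet queued.
        while pvm_queue and svm_queue:
            pvm_payload = pvm_queue.popleft()
            svm_payload = svm_queue.popleft()
            if pvm_payload != svm_payload:
                checkpoint_needed = True  # caller would notify COLO checkpoint
            # In both cases the PVM packet is sent on and the SVM copy dropped.
            released.append(pvm_payload)
        return released, checkpoint_needed
```

A PVM packet is held until the matching SVM packet arrives, so an
unresponsive SVM would stall the client unless some timeout is added;
the sketch does not model that.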
> Thanks for review ~
>> Hi:
>>
>> I have the following questions in mind (some were raised in previous
>> rounds of discussion without reaching a conclusion):
>>
>> - What's the plan for the management layer? The setup seems complicated,
>> so we cannot simply depend on the user to do each step. (And for
>> security reasons, qemu is usually run as an unprivileged user.)
> - We don't need to run as a privileged user; colo-proxy just runs like
> filter-buffer. Usage:
> primary:
>   -netdev tap,id=bn0 -device e1000,netdev=bn0
>   -object colo-proxy,id=f0,netdev=bn0,queue=all,side=primary,host=3.3.3.8,port=xxx
> secondary:
>   -netdev tap,id=bn0 -device e1000,netdev=bn0
>   -object colo-proxy,id=f0,netdev=bn0,queue=all,side=secondary,server=tcp:xxxx:port

Ok.

>> - What's the plan for vhost? Userspace networking in qemu is rather
>> slow, so most users will choose vhost.
> colo-proxy in QEMU space doesn't support vhost. People who want to use
> COLO must disable vhost, but virtio-net is another choice, which is
> enough in most cases.

OK for functionality, but not for performance :) There are lots of users
who care about performance.

>> - What if an application generates packets based on the hwrng device?
>> This will always produce different packets.
> Just like Hailiang said.
>> - Not sure SLIRP is a perfect match for this task. As has been raised,
>> another method is to decouple the packet comparing from qemu. That
>> way, lots of open source userspace stacks could be used.
> - We just need some capabilities of SLIRP (such as IP frag/defrag).
> We have investigated some open source userspace stacks, but did not
> find one better than SLIRP. If you know of one, please tell me.

I think it's OK to use SLIRP, but it has some drawbacks:

- It lacks IPv6 support, which means you need to implement this (there
are RFCs posted on the list) even if you only want the (de)fragmentation.
- It is not used in any production environment AFAIK, so it may be buggy
and you would need to fix the bugs.

So, if possible, choose a certified IP stack, which may save a lot of
effort.

As for userspace IP implementations, I'm not very familiar with them;
Google gives me things like uIP, lwIP and libuinet.

>> - I haven't read the packet comparing code, but if it needs to keep
>> track of the state of each connection, it could easily be DoSed from
>> the guest.
>
> - We think preventing DoS from the guest is out of our scope; it
> should be the firewall's concern.

Maybe I was not clear in the question. I mean, e.g., if you need to track
the state of each connection, is there a limit on the maximum number of
connections that are accepted?

If yes, what happens if the guest has more connections than this? Switch
to periodic mode?
If no, the guest could exhaust host memory by faking connections.
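
To illustrate the "if yes" case, something along these lines is what I
have in mind. It is only a sketch in Python, not QEMU code: the
ConnectionTracker class, the max_connections parameter and the
periodic_mode flag are all hypothetical names, and how the fallback to
periodic checkpointing would actually be triggered is left open.

```python
class ConnectionTracker:
    """Hypothetical bounded connection table; not QEMU code."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.connections = set()
        # Set once the table is full and a new connection is refused.
        self.periodic_mode = False

    def track(self, conn):
        """Return True if conn is tracked, False if the limit was hit."""
        if conn in self.connections:
            return True
        if len(self.connections) >= self.max_connections:
            # Refuse to grow: host memory stays bounded no matter how
            # many connections the guest fakes.
            self.periodic_mode = True
            return False
        self.connections.add(conn)
        return True
```

With a cap like this the per-connection state cannot grow without bound,
at the cost of losing per-packet comparison for the connections that do
not fit.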

>
>> Thanks
>> .
>>
>



