Re: [Qemu-devel] [POC]colo-proxy in qemu


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [POC]colo-proxy in qemu
Date: Tue, 10 Nov 2015 09:41:30 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

* Jason Wang (address@hidden) wrote:
> 
> 
> On 11/10/2015 01:26 PM, Tkid wrote:
> > Hi all,
> >
> > We are planning to reimplement the COLO proxy in userspace (here, in
> > qemu) to cache and compare network packets. This module is one of the
> > important components of the COLO project and it is still at an early
> > stage, so any comments and feedback are warmly welcome. Thanks in advance.
> >
> > ## Background
> > COLO FT/HA (COarse-grain LOck-stepping Virtual Machines for Non-stop
> > Service) project is a high availability solution. Both Primary VM (PVM)
> > and Secondary VM (SVM) run in parallel. They receive the same request
> > from client, and generate responses in parallel too. If the response
> > packets from PVM and SVM are identical, they are released immediately.
> > Otherwise, a VM checkpoint (on demand) is conducted.
> > Paper:
> > http://www.socc2013.org/home/program/a3-dong.pdf?attredirects=0
> > COLO on Xen:
> > http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> > COLO on Qemu/KVM:
> > http://wiki.qemu.org/Features/COLO
> >
> > Because we need to capture the response packets from the PVM and the SVM
> > and find out whether they are identical, we introduce a new module to
> > qemu networking called colo-proxy.
> >
> > This document describes the design of the colo-proxy module.
> >
> > ## Glossary
> >   PVM - Primary VM, which provides services to clients.
> >   SVM - Secondary VM, a hot standby and replication of PVM.
> >   PN - Primary Node, the host which PVM runs on
> >   SN - Secondary Node, the host which SVM runs on
> >
> > ## Our Idea ##
> >
> > COLO-Proxy
> > COLO-Proxy is a part of COLO. It is based on the qemu net filter and is
> > a plugin for it. Its function is to keep the SVM's connections working
> > normally in step with the PVM and to compare the PVM's packets with the
> > SVM's packets. If they differ, it notifies COLO to do a checkpoint.
> >
> > == Workflow ==
> >
> >
> > +--+                                      +--+
> > |PN|                                      |SN|
> > +-----------------------+                 +-----------------------+
> > | +-------------------+ |                 | +-------------------+ |
> > | |                   | |                 | |                   | |
> > | |        PVM        | |                 | |        SVM        | |
> > | |                   | |                 | |                   | |
> > | +--+-^--------------+ |                 | +-------------^----++ |
> > |    | |                |                 |               |    |  |
> > |    | | +------------+ |                 | +-----------+ |    |  |
> > |    | | |    COLO    | |    (socket)     | |    COLO   | |    |  |
> > |    | | | CheckPoint +---------------------> CheckPoint| |    |  |
> > |    | | |            | |      (6)        | |           | |    |  |
> > |    | | +-----^------+ |                 | +-----------+ |    |  |
> > |    | |   (5) |        |                 |               |    |  |
> > |    | |       |        |                 |               |    |  |
> > | +--v-+--------------+ | Forward(socket) | +-------------+----v+ |
> > | |COLO Proxy  |      +-------+(1)+--------->seq&ack adjust(2)| | |
> > | |      +-----+------+ |                 | +-----------------+ | |
> > | |      | Compare(4) <-------+(3)+---------+     COLO Proxy    | |
> > | +-------------------+ | Forward(socket) | +-------------------+ |
> > ++Qemu+-----------------+                 ++Qemu+-----------------+
> >            | ^
> >            | |
> >            | |
> >   +--------v-+--------+
> >   |                   |
> >   |      Client       |
> >   |                   |
> >   +-------------------+
> >
> >
> >
> >
> > (1) When the PN receives client packets, the PN COLO-Proxy copies them
> > and forwards the copies to the SN COLO-Proxy.
> > (2) The SN COLO-Proxy records the initial seq of the PVM's packets,
> > adjusts the client's ack, and sends the adjusted packets to the SVM.
> > (3) The SN Qemu COLO-Proxy receives the SVM's packets and forwards them
> > to the PN Qemu COLO-Proxy.
> > (4) The PN Qemu COLO-Proxy enqueues the SVM's packets and the PVM's
> > packets, then compares the PVM's packet data with the SVM's packet data.
> > If the packets differ, the compare module notifies the COLO CheckPoint
> > module to do a checkpoint, then sends the PVM's packets to the client
> > and drops the SVM's packets. Otherwise, it just sends the PVM's packets
> > to the client and drops the SVM's packets.
> > (5) Notify the COLO-Checkpoint module that a checkpoint is needed.
> > (6) Do the COLO checkpoint.
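
Step (2) can be illustrated with a minimal sketch. It assumes that the PVM and
the SVM pick different initial sequence numbers for the same connection, so an
ack from the client (which acknowledges PVM bytes) has to be shifted by the
difference before it is handed to the SVM. The class and method names are
hypothetical and are not taken from the proposal or from any QEMU code.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


class SeqAckAdjuster:
    """Per-connection offset between PVM and SVM sequence space (hypothetical)."""

    def __init__(self, pvm_isn: int, svm_isn: int) -> None:
        # Offset to add to a PVM-space number to get the SVM-space number.
        self.offset = (svm_isn - pvm_isn) % MOD

    def client_ack_to_svm(self, ack: int) -> int:
        """Rewrite a client ack, which refers to PVM bytes, for the SVM."""
        return (ack + self.offset) % MOD

    def svm_seq_to_pvm(self, seq: int) -> int:
        """Map an SVM sequence number back into PVM sequence space."""
        return (seq - self.offset) % MOD


adj = SeqAckAdjuster(pvm_isn=0xFFFFFFF0, svm_isn=0x10)
svm_ack = adj.client_ack_to_svm(0xFFFFFFF1)  # client acks the PVM's SYN
```

All arithmetic is done modulo 2^32 so that the adjustment still holds when
either sequence space wraps.
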
> >
> > ### QEMU space TCP/IP stack (based on SLIRP) ###
> > We need a QEMU space TCP/IP stack to help us analyze packets. After
> > looking into QEMU, we found that SLIRP
> >
> > http://wiki.qemu.org/Documentation/Networking#User_Networking_.28SLIRP.29
> >
> > is a good choice for us. SLIRP provides a full TCP/IP stack within QEMU.
> > It can help us handle the packets written to/read from the backend (tap)
> > device, which are just like link layer (L2) packets.
> >
> > ### Packet enqueue and compare ###
> > Together with QEMU space TCP/IP stack, we enqueue all packets sent by
> > PVM and SVM on Primary QEMU, and then compare the packet payload for
> > each connection.
> >
> 
> Hi:
> 
> Just have the following questions in my mind (some have been raised in
> the previous rounds of discussion without a conclusion):
> 
> - What's the plan for the management layer? The setup seems complicated,
> so we cannot simply depend on the user to do each step. (And for security
> reasons, qemu is usually run as an unprivileged user.)

It's certainly easier than the current COLO code that relies on a very
complex set of bridges, extra network interfaces and kernel modules.
UMU (cc'd) have been working on a libvirt set that starts COLO up, although
one bit that's very messy is the current kernel-based network comparison
code.

> - What's the plan for vhost? Userspace networking in qemu is rather slow,
> so most users will choose vhost.
> - What if an application generates packets based on a hwrng device? This
> will always produce different packets.

Yes, there are cases where this happens. COLO's worst case is similar to
simple checkpointing (because it has a limit to the smallest checkpoint), but
its best case is much better: on a compute-heavy load, it ends up taking
a checkpoint very rarely.
Actually the big problem is where randomness occurs in unexpected places,
e.g. where things like Perl's hash randomisation mean that the two
hosts produce the same data in different orders.
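
To make the ordering problem concrete, here is a minimal sketch of a
per-connection payload comparison in the spirit of step (4) of the proposal.
It assumes packets are compared pairwise in arrival order. The class, the
connection key and the payloads are hypothetical and are not taken from the
COLO code.

```python
from collections import defaultdict, deque


class PacketComparer:
    """Queues PVM and SVM payloads per connection and compares them in order."""

    def __init__(self):
        self.pvm = defaultdict(deque)
        self.svm = defaultdict(deque)

    def enqueue(self, side, conn, payload):
        queues = self.pvm if side == "pvm" else self.svm
        queues[conn].append(payload)

    def compare(self, conn):
        """Return (payloads to release to the client, checkpoint_needed)."""
        pvm_q, svm_q = self.pvm[conn], self.svm[conn]
        released, checkpoint = [], False
        while pvm_q and svm_q:
            p, s = pvm_q.popleft(), svm_q.popleft()
            if p != s:
                checkpoint = True  # any mismatch asks for a checkpoint
            released.append(p)     # the PVM's packet is sent, the SVM's dropped
        return released, checkpoint


conn = ("10.0.0.1", 40000, "10.0.0.2", 80)  # hypothetical 4-tuple
cmp_ = PacketComparer()
# Same data on both sides, but emitted in a different order.
for payload in (b"a=1\n", b"b=2\n"):
    cmp_.enqueue("pvm", conn, payload)
for payload in (b"b=2\n", b"a=1\n"):
    cmp_.enqueue("svm", conn, payload)
released, checkpoint_needed = cmp_.compare(conn)
```

Here `checkpoint_needed` comes out true even though both sides sent the same
set of records, which is the kind of avoidable checkpoint that reordering
causes.
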

> - Not sure SLIRP is a perfect match for this task. As has been raised,
> another method is to decouple the packet comparing from qemu. In this
> way, lots of open source userspace stacks could be used.
> - Haven't read the code of packet comparing, but if it needs to keep
> track of the state of each connection, it could easily be DoSed from the
> guest.

The guest can only break its own networking; so shooting itself in the foot
is no big deal.
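
As an illustration only: one way to keep per-connection tracking from growing
without bound is to cap the table and evict the least recently used entry.
This is an assumption for the sake of the sketch, not something the proposal
describes, and the names are hypothetical.

```python
from collections import OrderedDict


class ConnectionTable:
    """Fixed-size connection table with least-recently-used eviction."""

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def touch(self, conn):
        """Return the state for conn, creating it and evicting if the table is full."""
        if conn in self._entries:
            self._entries.move_to_end(conn)  # mark as most recently used
            return self._entries[conn]
        if len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # drop the oldest connection
        state = {"pvm": [], "svm": []}
        self._entries[conn] = state
        return state

    def __contains__(self, conn):
        return conn in self._entries

    def __len__(self):
        return len(self._entries)


table = ConnectionTable(max_entries=2)
for conn in ("a", "b", "a", "c"):
    table.touch(conn)
```

With a cap like this, a guest that opens a flood of connections only evicts
its own tracked connections; the memory used on the host stays bounded.
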

Dave

> 
> Thanks
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


