
[Qemu-devel] Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM


From: MORITA Kazutaka
Subject: [Qemu-devel] Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM
Date: Fri, 23 Oct 2009 19:41:03 +0900

On Fri, Oct 23, 2009 at 12:30 AM, Avi Kivity <address@hidden> wrote:
> On 10/21/2009 07:13 AM, MORITA Kazutaka wrote:
>>
>> Hi everyone,
>>
>> Sheepdog is a distributed storage system for KVM/QEMU. It provides
>> highly available block level storage volumes to VMs like Amazon EBS.
>> Sheepdog supports advanced volume management features such as snapshot,
>> cloning, and thin provisioning. Sheepdog runs on several tens or hundreds
>> of nodes, and the architecture is fully symmetric; there is no central
>> node such as a meta-data server.
>
> Very interesting!  From a very brief look at the code, it looks like the
> sheepdog block format driver is a network client that is able to access
> highly available images, yes?

Yes. Sheepdog is a simple key-value storage system that
consists of multiple nodes (a bit similar to Amazon Dynamo, I guess).

The qemu Sheepdog driver (client) divides a VM image into fixed-size
objects and stores them on the key-value storage system.
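
As a rough illustration of the division into fixed-size objects (this is
not the driver code; the 4 MiB object size, the key layout, and the
function names are assumptions made up for this sketch), a byte range of
a VM image could be mapped to object keys like this:

```python
# Sketch only: OBJECT_SIZE and the key format are assumed, not Sheepdog's.
OBJECT_SIZE = 4 * 1024 * 1024  # assumed fixed object size (4 MiB)


def object_key(image_id, index):
    """Hypothetical key: image id and object index, both in hex."""
    return f"{image_id:08x}:{index:08x}"


def split_request(image_id, offset, length):
    """Split a byte range of an image into per-object pieces.

    Returns a list of (key, offset_in_object, chunk_length) tuples.
    """
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // OBJECT_SIZE          # which object holds this byte
        in_obj = offset % OBJECT_SIZE          # position inside that object
        chunk = min(OBJECT_SIZE - in_obj, end - offset)
        pieces.append((object_key(image_id, index), in_obj, chunk))
        offset += chunk
    return pieces
```

A request that crosses an object boundary becomes two key-value
operations, one per object.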

> If so, is it reasonable to compare this to a cluster file system setup (like
> GFS) with images as files on this filesystem?  The difference would be that
> clustering is implemented in userspace in sheepdog, but in the kernel for a
> clustering filesystem.

I think the major difference between Sheepdog and cluster file
systems such as Google File System, pNFS, etc. is the interface between
clients and the storage system.

> How is load balancing implemented?  Can you move an image transparently
> while a guest is running?  Will an image be moved closer to its guest?

Sheepdog uses consistent hashing to decide where objects are stored, so
the I/O load is balanced across the nodes. When a new node is added or
an existing node is removed, the hash table changes and the data is
moved between nodes automatically and transparently.
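
To show the general idea of consistent hashing (a generic sketch, not
Sheepdog's implementation; the HashRing class, the SHA-1 hash, and the
virtual-node count are assumptions chosen for illustration):

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit position on the ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several points so that load spreads evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        """Return the node owning the first ring point at or after the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        i = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]
```

With this scheme, adding a node only moves the objects that now fall on
the new node's points; objects never move between the existing nodes.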

We plan to implement a mechanism to distribute the data not randomly
but intelligently; we could use machine load, the locations of VMs, etc.

> Can you stripe an image across nodes?

Yes, a VM image is divided into multiple objects, and they are
stored across the nodes.

> Do you support multiple guests accessing the same image?

A VM image can be attached to any VM, but only to one VM at a time;
multiple running VMs cannot access the same VM image.

> What about fault tolerance - storing an image redundantly on multiple nodes?

Yes, all objects are replicated to multiple nodes.
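
One common way to pick replica nodes on a hash ring is to walk clockwise
from the object's position and take the first few distinct nodes. The
sketch below shows that generic technique only; the copy count of 3, the
helper names, and the hash function are assumptions and do not describe
how Sheepdog itself chooses replicas:

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit ring position (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes=64):
    """Return a sorted list of (position, node) points for the given nodes."""
    return sorted(
        (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
    )


def replica_nodes(ring, key, copies=3):
    """Walk clockwise from the key and collect up to `copies` distinct nodes."""
    if not ring:
        raise LookupError("ring is empty")
    start = bisect.bisect_left(ring, (_hash(key), ""))
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in chosen:
            chosen.append(node)
            if len(chosen) == copies:
                break
    return chosen
```

If the first node in the list fails, the remaining nodes keep their
relative order, so the surviving copies are still found by the same walk.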


-- 
MORITA, Kazutaka

NTT Cyber Space Labs
OSS Computing Project
Kernel Group
E-mail: address@hidden



