Re: [Qemu-devel] [RFC V6 00/33] QCOW2 deduplication core functionality


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC V6 00/33] QCOW2 deduplication core functionality
Date: Mon, 11 Feb 2013 09:10:57 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Feb 11, 2013 at 03:50:10AM +0100, Benoît Canet wrote:
> As you can read, dedup keeps me awake at night.
> 
> I still think that there is a need for a deduplication implementation
> that would perform nearly as fast as regular qcow2.
> 
> I thought about this: http://en.wikipedia.org/wiki/Normal_distribution.
> 
> Not all blocks are equal for deduplication.
> Some will deduplicate well and some won't.
> 
> My idea would be to periodically run a filter on the in-RAM tree in order to
> drop the blocks that perform worst and show the least promise.
> 
> The low-performing blocks that were involved in a deduplication operation
> since the last run of the filter would be kept because they are promising:
> they survive and get a chance to climb among the top performers.
> 
> The low-performing blocks not involved in any deduplication operation since
> the last run of the filter would be definitively dropped from the HashNode
> tree, since they are losers.
> 
> The blocks at the center of the bell curve would be kept, since they are
> champions.
> 
> This way, the RAM-based implementation could offer speed while keeping its
> memory usage limited.
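The filtering policy quoted above could be sketched roughly as follows. This is a hypothetical illustration, not QEMU code: the names `HashEntry`, `filter_tree`, and the `keep_fraction` parameter are all assumptions for the sketch.

```python
# Hypothetical sketch of the periodic filter on the in-RAM hash tree.
# Entries that deduplicated since the last run survive even with a low
# score ("promising"); inactive low scorers are dropped ("losers").
from dataclasses import dataclass

@dataclass
class HashEntry:
    block_hash: bytes        # content hash of the cluster
    dedup_count: int         # total deduplication hits so far
    hit_since_filter: bool   # deduplicated since the last filter run?

def filter_tree(entries, keep_fraction=0.75):
    """Return the entries that survive one filter run."""
    # Rank by total dedup count: the top performers (the "champions")
    # are kept unconditionally.
    ranked = sorted(entries, key=lambda e: e.dedup_count, reverse=True)
    cutoff = int(len(ranked) * keep_fraction)
    survivors = []
    for i, entry in enumerate(ranked):
        if i < cutoff or entry.hit_since_filter:
            survivors.append(entry)
        # Reset the activity flag for the next filter interval.
        entry.hit_since_filter = False
    return survivors
```

Run periodically, this bounds memory use at roughly `keep_fraction` of the tree per interval, plus whatever recently active entries are spared.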

This means inline dedup is opportunistic and not guaranteed to catch
every dedup.

There needs to be a trade-off between a hash's dedup score and its age.
Young hashes are allowed to stay for a while, even with low dedup scores,
so they have a chance to accumulate dedups.
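One way to express that trade-off is a retention score with a decaying youth bonus. The formula, the `grace_runs` parameter, and the decay constant below are assumptions for illustration, not a measured policy.

```python
# Illustrative score trading off a hash's dedup count against its age
# (measured in filter runs). Young entries get a grace bonus so a low
# dedup count alone does not evict them immediately.
import math

def retention_score(dedup_count, age_in_runs, grace_runs=4, decay=0.5):
    # Exponentially decaying youth bonus: close to grace_runs while the
    # entry is new, negligible once it has aged several intervals.
    youth_bonus = grace_runs * math.exp(-decay * age_in_runs)
    return dedup_count + youth_bonus
```

With these constants, a brand-new entry with zero hits (score 4.0) outranks a ten-run-old entry with a single hit (score about 1.03), so the young entry survives the filter longer.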

I still think a lookup data structure that spills to disk is better, but
perhaps you have data that shows it's reasonable to expect decent dedup
rates with the opportunistic approach?

Stefan


