Re: [Qemu-devel] [RFC V6 00/33] QCOW2 deduplication core functionality


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC V6 00/33] QCOW2 deduplication core functionality
Date: Mon, 11 Feb 2013 09:10:57 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Feb 11, 2013 at 03:50:10AM +0100, Benoît Canet wrote:
> As you can read, dedup keeps me awake at night.
> 
> I still think that there is a need for a deduplication implementation
> that would perform nearly as fast as regular qcow2.
> 
> I thought about this: http://en.wikipedia.org/wiki/Normal_distribution.
> 
> Not all blocks are equal for deduplication.
> Some will deduplicate well and some won't.
> 
> My idea would be to periodically run a filter on the in-RAM tree in order to
> drop the blocks that perform worst and show the least promise.
> 
> The low-performing blocks that were involved in a deduplication operation
> since the last run of the filter would be kept because they are promising:
> they survive and get a chance to climb among the top performers.
> 
> The low-performing blocks not involved in any deduplication operation since
> the last run of the filter would be definitively dropped from the HashNode
> tree, since they are losers.
> 
> The blocks at the center of the bell curve would be kept, since they are
> champions.
> 
> This way, the RAM-based implementation could offer speed while keeping its
> memory usage limited.
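The filtering policy quoted above could be sketched roughly as follows. This is a hypothetical illustration, not QEMU code: the names `HashEntry`, `filter_tree`, and the `keep_fraction` parameter are all assumptions for the sketch.

```python
# Hypothetical sketch of the periodic filter on the in-RAM hash tree.
# Entries that deduplicated since the last run survive even with a low
# score ("promising"); inactive low scorers are dropped ("losers").
from dataclasses import dataclass

@dataclass
class HashEntry:
    block_hash: bytes        # content hash of the cluster
    dedup_count: int         # total deduplication hits so far
    hit_since_filter: bool   # deduplicated since the last filter run?

def filter_tree(entries, keep_fraction=0.75):
    """Return the entries that survive one filter run."""
    # Rank by total dedup count: the top performers (the "champions")
    # are kept unconditionally.
    ranked = sorted(entries, key=lambda e: e.dedup_count, reverse=True)
    cutoff = int(len(ranked) * keep_fraction)
    survivors = []
    for i, entry in enumerate(ranked):
        if i < cutoff or entry.hit_since_filter:
            survivors.append(entry)
        # Reset the activity flag for the next filter interval.
        entry.hit_since_filter = False
    return survivors
```

Run periodically, this bounds memory use at roughly `keep_fraction` of the tree per interval, plus whatever recently active entries are spared.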

This means inline dedup is opportunistic and not guaranteed to catch
every dedup.

There needs to be a trade-off between a hash's dedup score and its age.
Young hashes are allowed to stay for a while, even with low dedup scores,
so they have a chance to accumulate dedups.
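One way to express that trade-off is a retention score with a decaying youth bonus. The formula, the `grace_runs` parameter, and the decay constant below are assumptions for illustration, not a measured policy.

```python
# Illustrative score trading off a hash's dedup count against its age
# (measured in filter runs). Young entries get a grace bonus so a low
# dedup count alone does not evict them immediately.
import math

def retention_score(dedup_count, age_in_runs, grace_runs=4, decay=0.5):
    # Exponentially decaying youth bonus: close to grace_runs while the
    # entry is new, negligible once it has aged several intervals.
    youth_bonus = grace_runs * math.exp(-decay * age_in_runs)
    return dedup_count + youth_bonus
```

With these constants, a brand-new entry with zero hits (score 4.0) outranks a ten-run-old entry with a single hit (score about 1.03), so the young entry survives the filter longer.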

I still think a lookup data structure that spills to disk is better, but
perhaps you have data that shows it's reasonable to expect decent dedup
rates with the opportunistic approach?

Stefan


