Re: [Gluster-devel] [RFC ] dictionary optimizations

From:	Xavier Hernandez
Subject:	Re: [Gluster-devel] [RFC ] dictionary optimizations
Date:	Wed, 04 Sep 2013 15:37:36 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130804 Thunderbird/17.0.8

On 04/09/13 14:05, Jeff Darcy wrote:

On 09/04/2013 04:27 AM, Xavier Hernandez wrote:
I would also like to note that each node can store multiple elements.
The current implementation creates a node for each byte in the key. In my
implementation I only create a node when two or more keys share a common
prefix. This reduces the number of nodes and the number of
indirections.
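
To make the node-per-shared-prefix idea concrete, here is a minimal sketch in C of a compressed trie. It is not the code from the proposal; the names rnode_t, rtree_insert, rtree_lookup and rtree_count are hypothetical, and removal and cleanup are left out. It only shows that a node is split when a new key diverges from an existing fragment, instead of allocating one node per byte.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compressed-trie node: it stores a whole key fragment. */
typedef struct rnode {
    char *frag;           /* key fragment held by this node */
    void *value;          /* non-NULL only if a key ends at this node */
    struct rnode *child;  /* first child */
    struct rnode *next;   /* next sibling */
} rnode_t;

static rnode_t *rnode_new(const char *frag, void *value)
{
    size_t len = strlen(frag) + 1;
    rnode_t *node = calloc(1, sizeof(*node));

    if (node == NULL || (node->frag = malloc(len)) == NULL) {
        fprintf(stderr, "out of memory\n");
        abort();          /* keeps the sketch short; real code would return an error */
    }
    memcpy(node->frag, frag, len);
    node->value = value;
    return node;
}

/* Insert key under 'parent' (the root is an empty, zeroed node). */
void rtree_insert(rnode_t *parent, const char *key, void *value)
{
    for (;;) {
        rnode_t *c;
        size_t n = 0;

        if (*key == '\0') {           /* the key ends exactly at this node */
            parent->value = value;
            return;
        }
        for (c = parent->child; c != NULL; c = c->next)
            if (c->frag[0] == key[0])
                break;
        if (c == NULL) {              /* nothing shared: one node for the whole rest */
            c = rnode_new(key, value);
            c->next = parent->child;
            parent->child = c;
            return;
        }
        while (c->frag[n] != '\0' && c->frag[n] == key[n])
            n++;                      /* n = length of the common prefix */
        if (c->frag[n] != '\0') {     /* diverges inside the fragment: split the node */
            rnode_t *tail = rnode_new(c->frag + n, c->value);
            tail->child = c->child;
            c->child = tail;
            c->value = NULL;
            c->frag[n] = '\0';
        }
        key += n;
        parent = c;
    }
}

void *rtree_lookup(const rnode_t *node, const char *key)
{
    while (*key != '\0') {
        const rnode_t *c;
        size_t len;

        for (c = node->child; c != NULL; c = c->next)
            if (c->frag[0] == key[0])
                break;
        if (c == NULL)
            return NULL;
        len = strlen(c->frag);
        if (strncmp(c->frag, key, len) != 0)
            return NULL;
        key += len;
        node = c;
    }
    return node->value;
}

/* Number of nodes below 'node' (the root itself is not counted). */
size_t rtree_count(const rnode_t *node)
{
    size_t total = 0;
    const rnode_t *c;

    for (c = node->child; c != NULL; c = c->next)
        total += 1 + rtree_count(c);
    return total;
}
```

Starting from a zeroed root (rnode_t root = { 0 };), inserting the example keys "trusted.gfid" and "trusted.glusterfs.volume-id" leaves three nodes ("trusted.g", "fid" and "lusterfs.volume-id"), where a one-node-per-byte trie would need 30.
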
Whatever we do, we should try to make sure that the changes are profiled
against real usage. When I was making my own dict optimizations back in
March of last year, I started by looking at how they're actually used. At
that time, a significant majority of dictionaries contained just one item.
That's why I only implemented a simple mechanism to pre-allocate the first
data_pair instead of doing something more ambitious.  Even then, the
difference in actual performance or CPU usage was barely measurable.  Dict
usage has certainly changed since then, but I think you'd still be hard
pressed to find a case where a single dict contains more than a handful of
entries, and approaches that are optimized for dozens to hundreds might
well perform worse than simple ones (e.g. because of cache aliasing or
branch misprediction).
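
As an illustration of the "pre-allocate the first pair" approach described above, here is a minimal sketch in C. It is an assumption about the general shape of such a mechanism, not the actual GlusterFS dict code; sdict_t, pair_t and the sdict_* functions are hypothetical names. The first pair lives inside the dictionary structure itself, so a one-item dictionary needs no extra allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value pair; keys are borrowed, not copied. */
typedef struct pair {
    const char *key;
    void *value;
    struct pair *next;
} pair_t;

/* Hypothetical dictionary with storage for its first pair embedded. */
typedef struct sdict {
    pair_t *head;         /* linked list of pairs */
    pair_t first;         /* embedded storage used by the first pair */
    int first_in_use;     /* non-zero while 'first' is part of the list */
    int count;
} sdict_t;

int sdict_add(sdict_t *d, const char *key, void *value)
{
    pair_t *p;

    if (!d->first_in_use) {
        p = &d->first;                /* common case: no malloc at all */
        d->first_in_use = 1;
    } else {
        p = malloc(sizeof(*p));       /* only dicts with 2+ items pay for this */
        if (p == NULL)
            return -1;
    }
    p->key = key;
    p->value = value;
    p->next = d->head;
    d->head = p;
    d->count++;
    return 0;
}

void *sdict_get(const sdict_t *d, const char *key)
{
    const pair_t *p;

    for (p = d->head; p != NULL; p = p->next)
        if (strcmp(p->key, key) == 0)
            return p->value;
    return NULL;
}

void sdict_clear(sdict_t *d)
{
    pair_t *p = d->head;

    while (p != NULL) {
        pair_t *next = p->next;
        if (p != &d->first)           /* the embedded pair must never be freed */
            free(p);
        p = next;
    }
    d->head = NULL;
    d->first_in_use = 0;
    d->count = 0;
}
```

The dictionary is used from a zeroed structure (sdict_t d = { 0 };). The trade-off is a slightly larger structure in exchange for one fewer allocation in the single-item case.
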
If you're looking for other optimization opportunities that might provide
even bigger "bang for the buck" then I suggest that stack-frame or
frame->local allocations are a good place to start.  Or string copying in
places like loc_copy.  Or the entire fd_ctx/inode_ctx subsystem.  Let me
know and I'll come up with a few more.  To put a bit of a positive spin on
things, the GlusterFS code offers many opportunities for improvement in
terms of CPU and memory efficiency (though it's surprisingly still way
better than Ceph in that regard).

Yes. Optimizing the dictionary structures does not bring a big improvement in the overall performance of GlusterFS. I tried it in a real situation and the benefit was only marginal. However, I didn't test new features like an atomic "lookup and remove if found" (because I would have had to review all the code). I think this kind of functionality could improve the results I obtained a bit more.

However, this is not the only reason to make these changes. While writing code I've found that some things are tedious to do just because dict_t has no functions for them. Some actions require multiple calls and multiple error checks, which adds complexity and hurts the readability of the code. Many of these situations could be solved using functions similar to the ones I proposed.

On the other hand, if dict_t must truly be considered a concurrent structure, there are a lot of race conditions that might appear when doing some operations. It would require a great effort to take care of all these possibilities everywhere. It would be better to pack most of these situations into functions inside dict_t itself, where it is easier to combine operations.
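
As a rough sketch of what a combined operation could look like, the C code below does "lookup and remove if found" under a single lock, so no other thread can change the entry between the lookup and the removal. It assumes a simple mutex-protected list; ldict_t, ldict_init, ldict_set and ldict_take are hypothetical names, not existing dict_t functions.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct entry {
    char *key;
    void *value;
    struct entry *next;
} entry_t;

/* Hypothetical dictionary protected by one mutex. */
typedef struct ldict {
    pthread_mutex_t lock;
    entry_t *head;
} ldict_t;

void ldict_init(ldict_t *d)
{
    pthread_mutex_init(&d->lock, NULL);
    d->head = NULL;
}

/* Set key to value, creating the entry if needed. Returns 0 or -1. */
int ldict_set(ldict_t *d, const char *key, void *value)
{
    entry_t *e;
    size_t len = strlen(key) + 1;

    pthread_mutex_lock(&d->lock);
    for (e = d->head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            break;
    if (e == NULL) {
        e = malloc(sizeof(*e));
        if (e != NULL && (e->key = malloc(len)) == NULL) {
            free(e);
            e = NULL;
        }
        if (e != NULL) {
            memcpy(e->key, key, len);
            e->next = d->head;
            d->head = e;
        }
    }
    if (e != NULL)
        e->value = value;
    pthread_mutex_unlock(&d->lock);
    return e != NULL ? 0 : -1;
}

/* Look up key and unlink it if present, all in one critical section.
 * Returns the stored value, or NULL if the key was not found. */
void *ldict_take(ldict_t *d, const char *key)
{
    entry_t **pp, *e;
    void *value = NULL;

    pthread_mutex_lock(&d->lock);
    for (pp = &d->head; (e = *pp) != NULL; pp = &e->next) {
        if (strcmp(e->key, key) == 0) {
            *pp = e->next;            /* unlink while still holding the lock */
            value = e->value;
            break;
        }
    }
    pthread_mutex_unlock(&d->lock);

    if (e != NULL) {                  /* e is the unlinked entry, or NULL */
        free(e->key);
        free(e);
    }
    return value;
}
```

With separate get and delete calls, the caller would have to check two results and another thread could modify the entry between them; here the caller makes one call and checks one result.
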

By the way, I've run some tests with multiple bricks, and there seems to be a clear slowdown in directory listings as the number of bricks increases. Since bricks should be independent and can work in parallel, I didn't expect such a big performance degradation. However, the tests were neither exhaustive nor run under the best conditions, so they might be misleading. Anyway, it seems to me that there might be a problem with some mutexes that force too much serialization of requests (though I have no real proof; it's only a feeling). Maybe some more "asynchronicity" in calls between translators could help.


Just some thoughts...

Best regards,

Xavi


