[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
GSoC idea proposal
From: |
Glen Lenker |
Subject: |
GSoC idea proposal |
Date: |
Fri, 27 Mar 2009 16:45:40 -0700 |
User-agent: |
Mutt/1.5.18 (2008-05-17) |
I saw that the GNU project is a mentoring organization for this years
Google Summer of Code. Would it be possible to extend my work on
optimizing sort to this summer?
There are a few ideas on the table:
1) Reducing the amount of time wasted with pthread_join
This idea is outlined in the TODO. This would be an optimization off
of the patch Paul Eggert and I submitted a few days ago. The general
idea is increase the number of threads and use inter-thread
communication so that the merge process is more streamlined than it is
now.
2) Merge Insertion and List Merge sort
This is also outlined in the TODO. I could try my hand at merge
insertion and try to see why list merge sort did not improve performance.
3) Integrating our work with the other groups work
As you may know, we only parallelized the internal sort algorithm, the
other group was in charge of the external sort algorithm. They opted
to implement theirs in OpenMP.
4) "In-between sort"
As Jim suggested here:
http://lists.gnu.org/archive/html/bug-coreutils/2009-02/msg00119.html
and P?draig Brady a moment later, we should also try doing something
more "UNIXy", where the general idea looks like this:
sort -m \
<(sort <(dice -k1,4 INPUT)) \
<(sort <(dice -k2,4 INPUT)) \
<(sort <(dice -k3,4 INPUT)) \
<(sort <(dice -k4,4 INPUT)) \
> out
where dice split INPUT into 4 slices "dice -k1,4 INPUT" grabbing the
first.
I would be willing to work on all 4 of these this summer. These last
ten weeks I think I've accustomed myself to the sort code base and
these additions seem very doable. Does this sound like a good idea?
--
Glen Lenker
- GSoC idea proposal,
Glen Lenker <=