bug-coreutils
Re: [PATCH] sort: Add --threads option, which parallelizes internal sort


From: Jim Meyering
Subject: Re: [PATCH] sort: Add --threads option, which parallelizes internal sort.
Date: Fri, 03 Apr 2009 22:45:21 +0200

Paul Eggert wrote:
> Jim Meyering <address@hidden> writes:
>> Ramping up to 5M lines, the resulting test takes almost 2 minutes and
>> the sort itself took 34s on this particular quad-core system.  ...  A
>> more interesting test would be to ensure that when run on a multi-core
>> system sorting with --threads=2 is at least X% faster than sorting
>> with --threads=1.
>
> The patch I submitted used a small test because I'm developing on an
> old, slow machine--a 2.4 GHz Pentium 4 with 0.5 MiB cache and 1 GiB RAM.
> (I really need to upgrade it, but we have this little budget problem in
> California state institutions right now....)
>
> More important, it's not clear to me what the role of the test suite
> ought to be.  Should the test really fail if it doesn't get enough
> performance improvement with 2 threads?  How do we decide what's
> "enough"?  None of our other tests are performance tests, so we are on
> somewhat new ground here.

Hi Paul,

Yes, performance tests are problematic.
Recently I reclassified a test as "very expensive"

    tests: mark the rm/ext3-perf test as "very" expensive
    http://git.sv.gnu.org/cgit/coreutils.git/commit/?id=9b6538aa8dbac

so that its now-frequent failures (when building with -jN)
don't clutter normally-clean "make check" results.
Now, that long-running test runs only when I invoke "make check"
with RUN_VERY_EXPENSIVE_TESTS=yes set in the environment.
And then I'm careful to use -j1.

Interestingly, when I first added that test, it passed very
consistently.  Then, after a couple of months, it started failing more
and more frequently.
The disk was formatted not long before that test was added.
I suspect ext3 "aging" (the disk is nowhere near full), but haven't had
time to investigate.

Putting performance-related tests on a separate target (and making sure
that they are run sequentially) would avoid that problem.
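
To make the "at least X% faster" idea concrete, here is a rough sketch of
what such a check might look like if run sequentially from a separate
benchmark target.  It is only an illustration: the helper names, the run
count, and the 20% threshold are made up, and --threads is the option
proposed in the patch under discussion, so the usage shown in the comment
assumes a sort binary built with that patch.

```python
import subprocess
import time


def best_time(cmd, runs=3):
    """Return the fastest wall-clock time in seconds over `runs` runs of cmd.

    Taking the minimum (rather than the mean) reduces noise from other
    activity on the machine; runs are strictly sequential.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def speedup_percent(t_one, t_two):
    """Percentage by which t_two improves on t_one (negative if slower)."""
    return 100.0 * (t_one - t_two) / t_one


def fast_enough(t_one, t_two, min_percent):
    """True if the second timing beats the first by at least min_percent."""
    return speedup_percent(t_one, t_two) >= min_percent


# Hypothetical usage, assuming a patched sort and an input file "in":
#   t1 = best_time(["sort", "--threads=1", "in"])
#   t2 = best_time(["sort", "--threads=2", "in"])
#   ok = fast_enough(t1, t2, 20.0)   # the 20% threshold is arbitrary
```

Choosing the threshold is exactly the "how do we decide what's enough"
problem, which is one more reason to keep a check like this out of the
default "make check".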

> Given the costs involved I'm inclined to think that "make check" should
> focus on correctness tests, and a new target ("make benchmark", say?)
> should be used for performance tests.  But perhaps my view is colored by
> my using such a slow machine.
>
>> I'm a little reluctant to apply this patch in its current
>> state, since it's not achieving reasonable efficiency.
>
> We'll look into speeding it up.  I've asked Glen to generate profiling

Great!

> information.  Obviously there's something screwy going on.  If it is a
> CPU bottleneck in pthread_join, though, that suggests that there's
> something wrong with pthread_join.  Pthread_join should not busy-wait.
...
>> Also, it'd be nice to add the following:
>>   - a test that uses --threads=N to exercise the new option-parsing code,
>>       though if you adjust as suggested to compare --threads=N for
>>       N=1&2, that's not needed.
>>   - a NEWS item
>
> OK, we'll look into that too.

Thanks!



