Re: [coreutils] Memory usage of parallel sort


From: Pádraig Brady
Subject: Re: [coreutils] Memory usage of parallel sort
Date: Sat, 18 Dec 2010 05:52:05 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3

On 17/12/10 19:29, Assaf Gordon wrote:
> Hello,
> 
> A question regarding the memory usage requirements of the parallel sort:
> It seems that the memory usage (resident size) increases with the number of 
> threads used.
> 
> It also seems to me (but not verified) that the increased memory usage 
> happens not at the sorting phase, but at the output phase (when writing the 
> sorted results to STDOUT).
> 
> I'm wondering if this is the intended behavior, because I'm sorting big files 
> in memory, and with a single-threaded sort, the rule of thumb is to use 
> --buffer-size of 150% of the file size to do the sorting completely in memory 
> without temporary files.
> 
> Here's an example:
> =====
> ## directory without write permission, used as temporary-directory - 
> ## sort will fail if it tries to use temporary files
> $ ls -lod /data/gordon/forbidden/
> dr-xr-xr-x. 2 gordon 4096 Dec 17 12:03 /data/gordon/forbidden/
> 
> ## Big file to sort, created with "gensort -a 2000000"
> $ ls -lhos /data/gordon/ramdisk/gensort-2m 
> 1.9G -rw-r--r--. 1 gordon 1.9G Dec 17 11:46 /data/gordon/ramdisk/gensort-2m
> 
> ## Sort with single thread, in-memory - works OK
> $ src/sort --parallel=1 -T /data/gordon/forbidden/ -S 4G \
>       /data/gordon/ramdisk/gensort-2m > /dev/null
> 
> ## Sort with two threads, in-memory, still works OK
> $ src/sort --parallel=2 -T /data/gordon/forbidden/ -S 4G \
>      /data/gordon/ramdisk/gensort-2m > /dev/null
> 
> ## sort with 16 threads, sort tries to use temporary files,
> ## meaning 4GB is not enough to sort a 2GB file.
> $ src/sort --parallel=16 -T /data/gordon/forbidden/ -S 4G \
>      /data/gordon/ramdisk/gensort-2m > /dev/null 
> src/sort: cannot create temporary file in `/data/gordon/forbidden/': 
> Permission denied
> =====
> 
> The reason I think it happens in the output phase, is because it seems memory 
> usage stays the same while the output file has zero size, 
> and it goes up once the output file starts increasing in size (not very 
> scientific observation, but still...).
> 
> Checking resident size with "top", shows:
> --parallel    RES (GB)
> 1             2.8
> 2             3.1
> 4             3.7
> 6             3.9
> 8             4.2
> 10            4.4
> 12            4.5
> 14            4.7
> 16            4.8
> 18            4.9
> 20            5
> 22            5
> 24            5.1
> 26            5.2
> 28            5.3
> 
> If this happens by design, then no problem (perhaps just document it, to warn 
> about increased memory requirements).

Thanks for looking at this!

I've not looked into the memory details,
but this is another reason to restrict
the default number of threads to 8,
which we talked about previously...

commit 69ef9deef087b0447c022225d9a29825d7a714a1
Author: Pádraig Brady <address@hidden>
Date:   Sat Dec 18 05:27:46 2010 +0000

    sort: use at most 8 threads by default

    * src/sort.c (main): If --parallel isn't specified,
    restrict the number of threads to 8 by default.
    If the --parallel option is specified, then
    allow any number of threads to be set, independent
    of the number of processors on the system.

diff --git a/NEWS b/NEWS
index 484ed5c..7eda1b2 100644
--- a/NEWS
+++ b/NEWS
@@ -27,6 +27,12 @@ GNU coreutils NEWS                                    -*- outline -*-

   sort -m -o f f ... f no longer dumps core when file descriptors are limited.

+** Changes in behavior
+
+  sort will not create more than 8 threads by default due to diminishing
+  performance gains.  Also the --parallel option is no longer restricted
+  to the number of available processors.
+
 ** New features

   split accepts the --number option to generate a specific number of files.
diff --git a/src/sort.c b/src/sort.c
index 54dd815..9d668c0 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -116,6 +116,10 @@ struct rlimit { size_t rlim_cur; };
    this number has any practical effect.  */
 enum { SUBTHREAD_LINES_HEURISTIC = 4 };

+/* The number of threads after which there are
+   diminishing performance gains.  */
+enum { DEFAULT_MAX_THREADS = 8 };
+
 /* Exit statuses.  */
 enum
   {
@@ -4595,14 +4599,15 @@ main (int argc, char **argv)
     }
   else
     {
-      unsigned long int np2 = num_processors (NPROC_CURRENT_OVERRIDABLE);
-      if (!nthreads || nthreads > np2)
-        nthreads = np2;
+      if (!nthreads)
+        {
+          nthreads = MIN (DEFAULT_MAX_THREADS,
+                          num_processors (NPROC_CURRENT_OVERRIDABLE));
+        }

       /* Avoid integer overflow later.  */
       size_t nthreads_max = SIZE_MAX / (2 * sizeof (struct merge_node));
-      if (nthreads_max < nthreads)
-        nthreads = nthreads_max;
+      nthreads = MIN (nthreads, nthreads_max);

       sort (files, nfiles, outfile, nthreads);
     }
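
To make the resulting behaviour easier to check, here is a minimal standalone sketch of the thread-count selection the patch introduces. It is not code from sort.c: choose_nthreads is a hypothetical helper, and its nprocs and node_size parameters are stand-ins for num_processors (NPROC_CURRENT_OVERRIDABLE) and sizeof (struct merge_node), passed in so the logic can be tried without the coreutils tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cap taken from the patch above.  */
enum { DEFAULT_MAX_THREADS = 8 };

/* Hypothetical helper, not part of sort.c: return the thread count
   that the patched main () would pass to sort ().
   requested - value given to --parallel, or 0 if the option is absent
   nprocs    - stand-in for num_processors (NPROC_CURRENT_OVERRIDABLE)
   node_size - stand-in for sizeof (struct merge_node)  */
static size_t
choose_nthreads (size_t requested, size_t nprocs, size_t node_size)
{
  assert (node_size > 0);
  size_t nthreads = requested;

  /* No --parallel given: use the processor count, but at most 8.  */
  if (!nthreads)
    nthreads = (nprocs < (size_t) DEFAULT_MAX_THREADS
                ? nprocs : (size_t) DEFAULT_MAX_THREADS);

  /* Avoid integer overflow later; an explicit --parallel value is
     otherwise left alone, even if it exceeds the processor count.  */
  size_t nthreads_max = SIZE_MAX / (2 * node_size);
  return nthreads < nthreads_max ? nthreads : nthreads_max;
}
```

Under these assumptions, a 16-processor machine with no --parallel option gets 8 threads, a 4-processor machine gets 4, and --parallel=16 on the 4-processor machine yields 16, which the old code would have clamped to 4.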


