From: Paul Eggert
Subject: [coreutils] [PATCH] sort: do not generate thousands of subprocesses for 16-way merge
Date: Thu, 16 Dec 2010 22:42:17 -0800
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7

Well, I thought I was done fixing 'sort' bugs for now, but I found one
more while testing my previous patch further (after pushing it; I
know, a bad habit :-).  I noticed that 'sort' generated over 10,000
simultaneous subprocesses on the misc/sort-compress-hang benchmark,
even though it would have worked just fine with only 17 simultaneous
subprocesses, since it was a 16-way merge.  This behavior is antisocial,
as it can exhaust process slots and thereby make unrelated
applications fail.  The 'sort' code was trying to limit the number of
subprocesses, but its attempt was not working.  I pushed the following
patch to fix the problem.  (And now I really must stop looking at
'sort' for a bit!)


From f7afbe67f1587c79b43779aa95502fcee564c20e Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Thu, 16 Dec 2010 22:31:56 -0800
Subject: [PATCH] sort: do not generate thousands of subprocesses for 16-way merge

Without this change, tests/misc/sort-compress-hang would consume
more than 10,000 process slots on my RHEL 5.5 x86-64 server,
making it likely for other applications to fail due to lack of
process slots.  With this change, the same benchmark causes 'sort'
to consume at most 19 process slots.  The change also improved
wall-clock time by 2% and user+system time by 14% on that benchmark.
* NEWS: Document this.
* src/sort.c (MAX_PROCS_BEFORE_REAP): Remove.
(reap_exited): Renamed from reap_some; this is a more accurate name,
since "some" incorrectly implies that it reaps at least one process.
All uses changed.
(reap_some): New function: it *does* reap at least one process.
(pipe_fork): Do not allow more than NMERGE + 2 subprocesses.
(mergefps, sort): Omit check for exited processes: no longer needed,
and anyway the code consumed too much CPU per line when 2 < nprocs.
---
 NEWS       |    3 ++-
 src/sort.c |   34 +++++++++++++++++++++-------------
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/NEWS b/NEWS
index a69ef54..484ed5c 100644
--- a/NEWS
+++ b/NEWS
@@ -22,7 +22,8 @@ GNU coreutils NEWS                                    -*- outline -*-
   into the stack of an expired thread. [bug introduced in coreutils-8.6]
 
   sort --compress no longer mishandles subprocesses' exit statuses,
-  and no longer hangs indefinitely due to a bug in waiting for subprocesses.
+  no longer hangs indefinitely due to a bug in waiting for subprocesses,
+  and no longer generates many more than NMERGE subprocesses.
 
   sort -m -o f f ... f no longer dumps core when file descriptors are limited.
 
diff --git a/src/sort.c b/src/sort.c
index 6bce49b..54dd815 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -659,9 +659,6 @@ proctab_comparator (void const *e1, void const *e2)
 /* The number of unreaped child processes.  */
 static pid_t nprocs;
 
-/* The number of child processes we'll allow before we try to reap some. */
-enum { MAX_PROCS_BEFORE_REAP = 2 };
-
 static bool delete_proc (pid_t);
 
 /* If PID is positive, wait for the child process with that PID to
@@ -743,12 +740,21 @@ wait_proc (pid_t pid)
    already exited.  */
 
 static void
-reap_some (void)
+reap_exited (void)
 {
   while (0 < nprocs && reap (0))
     continue;
 }
 
+/* Reap at least one exited child, waiting if necessary.  */
+
+static void
+reap_some (void)
+{
+  reap (-1);
+  reap_exited ();
+}
+
 /* Reap all children, waiting if necessary.  */
 
 static void
@@ -969,6 +975,16 @@ pipe_fork (int pipefds[2], size_t tries)
   if (pipe (pipefds) < 0)
     return -1;
 
+  /* At least NMERGE + 1 subprocesses are needed.  More could be created, but
+     uncontrolled subprocess generation can hurt performance significantly.
+     Allow at most NMERGE + 2 subprocesses, on the theory that there
+     may be some useful parallelism by letting compression for the
+     previous merge finish (1 subprocess) in parallel with the current
+     merge (NMERGE + 1 subprocesses).  */
+
+  if (nmerge + 1 < nprocs)
+    reap_some ();
+
   while (tries--)
     {
       /* This is so the child process won't delete our temp files
@@ -991,7 +1007,7 @@ pipe_fork (int pipefds[2], size_t tries)
         {
           xnanosleep (wait_retry);
           wait_retry *= 2;
-          reap_some ();
+          reap_exited ();
         }
     }
 
@@ -2968,10 +2984,6 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles,
           ord[j] = ord[j + 1];
         ord[count_of_smaller_lines] = ord0;
       }
-
-      /* Free up some resources every once in a while.  */
-      if (MAX_PROCS_BEFORE_REAP < nprocs)
-        reap_some ();
     }
 
   if (unique && savedline)
@@ -3829,10 +3841,6 @@ sort (char *const *files, size_t nfiles, char const *output_file,
 
           xfclose (tfp, temp_output);
 
-          /* Free up some resources every once in a while.  */
-          if (MAX_PROCS_BEFORE_REAP < nprocs)
-            reap_some ();
-
           if (output_file_created)
             goto finish;
         }
-- 
1.7.2
