bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#18420: 24.3; interaction with external process hangs emacs


From: Eli Zaretskii
Subject: bug#18420: 24.3; interaction with external process hangs emacs
Date: Tue, 09 Sep 2014 17:17:12 +0300

> From: Stephen Leake <stephen_leake@stephe-leake.org>
> Date: Mon, 08 Sep 2014 22:10:26 -0500
> 
> thread 6 : gdb interface
> thread 5 : child process reader_thread
>     used by select emulation to check for child process output
>     can't see contents of 'cp' object; optimized out
>         the only child process of this emacs instance is the external
>         parser; ada_mode_wisi_parse
> 
>     hung at "Wait until our input is acknowledged before reading again"
>         waiting for 'select()' to ack?
> 
> thread 4 : windows message handler
> thread 3 : itimer thread
> thread 2 : waiting on some startup?
> thread 1 : main, runing lisp code, process_send_string, emacs_full_write, 
> sys_write _write
>     sys_write can only be interrupted by interrupts/signals
>     emacs_full_write checks for that, handles them
>     should allow for read write, unless it there's a race condition
> 
> So there is a deadlock:
> 
> thread 5 is waiting for thread 1 to ack
> thread 1 is waiting for child process to read
> child process is waiting for thread 5 to read

Almost: the thread that reads is not thread 5 (the "reader thread"),
it's thread 1, the main (a.k.a. "Lisp") thread.  The reader thread
reads only a single byte from the pipe, and when it succeeds to do so,
it signals the main thread that input is available.  That signal is
supposed to be noticed by sys_select (the Windows implementation of
pselect), and then the main thread should read that stuff as usual.

But since the main thread is blocked inside a 'write' call, it never
gets to sys_select, never notices that input is available, and doesn't
read it.

> We need to understand how we get here.

It turns out to be very simple: the current machinery of handling the
pipe I/O vis-a-vis subprocesses is not designed well enough to
robustly handle the case of massive exchange of material in both
directions.  Recall that the subprocesses with which Emacs exchanges
data, such as the speller and GDB, only send relatively small amounts
of data each way.  And subprocesses that send large quantities of
data, such as Grep, only use the pipe in one direction.  So we never
see these deadlocks in "normal" operations.

Add to this the fact that the default buffer size of the pipe on
Windows is 4K, and you will understand how this deadlock happens.

I can easily get Emacs to deadlock on Windows with the following
simple Lisp:

(defun pipe-torture (file)
  ""
  (let ((proc (start-process "pipe-reader" "pipe-read" "cat.exe"))
        pt)
    (with-temp-buffer
      (insert-file-contents file)
      (setq pt (goto-char (point-min)))
      (while (< pt (point-max))
        (process-send-region proc pt (min (+ pt 8000) (point-max)))
        (message "wrote %d" (1- (min (+ pt 8000) (point-max))))
        ;(sit-for 1)
        (setq pt (min (+ pt 8000) (point-max)))))))

;; Invoke with (pipe-torture "src/xdisp.c")

Uncommenting sit-for lets this run without locking up, even if I
reduce the amount of time to wait to 0.02 sec, but it does hang if I
set that to 0.01 sec.

Of course, 'cat' is very light on processing its input before it sends
it back, so the 0.02 sec might not be good enough for subprocesses
with heavier processing.

Anyway, it turns out we can make our pipes non-blocking on write,
which then allows us to use the EWOULDBLOCK/EAGAIN handling in
send_process.  Please try the patch below, and if it gives good
results, I will install it shortly.  Using it, I can pipe the whole
xdisp.c to 'cat' even if I increase the chunk size to 400KB, and leave
sit-for commented out.

--- src/w32.c~0 2014-08-26 15:59:22 +0300
+++ src/w32.c   2014-09-09 16:42:47 +0300
@@ -7720,15 +7720,15 @@ fcntl (int s, int cmd, int options)
   if (cmd == F_DUPFD_CLOEXEC)
     return sys_dup (s);
 
-  if (winsock_lib == NULL)
-    {
-      errno = ENETDOWN;
-      return -1;
-    }
-
   check_errno ();
   if (fd_info[s].flags & FILE_SOCKET)
     {
+      if (winsock_lib == NULL)
+       {
+         errno = ENETDOWN;
+         return -1;
+       }
+
       if (cmd == F_SETFL && options == O_NONBLOCK)
        {
          unsigned long nblock = 1;
@@ -7745,6 +7745,29 @@ fcntl (int s, int cmd, int options)
          return SOCKET_ERROR;
        }
     }
+  else if ((fd_info[s].flags & (FILE_PIPE | FILE_WRITE))
+          == (FILE_PIPE | FILE_WRITE))
+    {
+      /* Force our writes to pipes be non-blocking.  */
+      if (cmd == F_SETFL && options == O_NONBLOCK)
+       {
+         HANDLE h = (HANDLE)_get_osfhandle (s);
+         DWORD pipe_mode = PIPE_NOWAIT;
+
+         if (!SetNamedPipeHandleState (h, &pipe_mode, NULL, NULL))
+           {
+             DebPrint (("SetNamedPipeHandleState: %lu\n", GetLastError ()));
+             return SOCKET_ERROR;
+           }
+         fd_info[s].flags |= FILE_NDELAY;
+         return 0;
+       }
+      else
+       {
+         errno = EINVAL;
+         return SOCKET_ERROR;
+       }
+    }
   errno = ENOTSOCK;
   return SOCKET_ERROR;
 }
@@ -8371,6 +8394,21 @@ sys_write (int fd, const void * buffer,
          nchars += n;
          if (n < 0)
            {
+             /* When there's no buffer space in a pipe that is in
+                non-blocking mode, _write returns ENOSPC.  We return
+                EAGAIN instead, which should trigger the logic in
+                send_process that enters waiting loop and calls
+                wait_reading_process_output to allow process input to
+                be accepted during the wait.  Those calls to
+                wait_reading_process_output allow sys_select to
+                notice when process input becomes available, thus
+                avoiding deadlock whereby each side of the pipe is
+                blocked on write.  */
+             if (errno == ENOSPC
+                 && fd < MAXDESC
+                 && ((fd_info[fd].flags & (FILE_PIPE | FILE_NDELAY))
+                     == (FILE_PIPE | FILE_NDELAY)))
+               errno = EAGAIN;
              nchars = n;
              break;
            }





reply via email to

[Prev in Thread] Current Thread [Next in Thread]