bug-gnu-emacs

bug#25172: 26.0.50; Concurrency feature, sit-for doesn't work (crashing and unexpected behaviour)


From: Elias Mårtenson
Subject: bug#25172: 26.0.50; Concurrency feature, sit-for doesn't work (crashing and unexpected behaviour)
Date: Tue, 13 Dec 2016 10:38:09 +0800

I was about to test this, but I have been unable to reproduce the problem as of the current version: 8db7b65d66f01e90a05cc9f11c67667233d84ca0

Has a fix for this been explicitly committed, or did the behaviour change unexpectedly because of some other change?

Regards,
Elias

On 13 December 2016 at 01:37, Eli Zaretskii <eliz@gnu.org> wrote:
> From: Elias Mårtenson <lokedhs@gmail.com>
> Date: Mon, 12 Dec 2016 12:50:24 +0800
> Cc: Eli Zaretskii <eliz@gnu.org>, 25172@debbugs.gnu.org
>
> I tried with the latest version (a92a027d58cb4df5bb6c7e3c546a72183a192f45)
> and I'm still getting the same error.
>
> The stack trace is as follows:
> [...]
> #34 0x0000000000578a22 in emacs_abort () at sysdep.c:2342
> #35 0x0000000000564247 in unblock_input_to (level=-1) at keyboard.c:7167
> #36 0x000000000056425e in unblock_input () at keyboard.c:7183
> #37 0x000000000069c5e4 in xg_select (fds_lim=15, rfds=0x7fffe59e19a0, wfds=0x7fffe59e1920, efds=0x0, timeout=0x7fffe59e1900, sigmask=0x0) at xgselect.c:162

xg_select uses block_input/unblock_input, which the other *select
implementations used by Emacs don't do (those others are system
APIs).  block_input/unblock_input manipulate a global variable that is
not incremented and decremented atomically, so they are fundamentally
thread-unsafe.  Moreover, some places in Emacs reset that global
variable to zero (although I don't believe those places are part of
your scenario).
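
To make the counter's behaviour concrete, here is a minimal
single-threaded model of it.  This is a sketch, not the code in
keyboard.c: every name starting with model_ is made up for
illustration, and the model_aborted flag stands in for the call to
emacs_abort.  It assumes only what is described above (a plain int
that is incremented and decremented, and that some code may reset to
zero) and shows how such a reset between a matched pair of calls
leaves the level at -1, the value visible in frame #35.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of a nesting counter; all names are hypothetical.  */
static int model_blocked;       /* nesting depth: a plain int, not atomic */
static bool model_aborted;      /* stands in for emacs_abort ()  */

static void
model_block_input (void)
{
  assert (model_blocked >= 0);  /* callers expect a sane depth */
  model_blocked++;
}

static void
model_unblock_input_to (int level)
{
  model_blocked = level;
  if (level < 0)
    model_aborted = true;       /* the real code would abort here */
}

static void
model_unblock_input (void)
{
  model_unblock_input_to (model_blocked - 1);
}

/* Models "some places reset that global variable to zero".  */
static void
model_reset (void)
{
  model_blocked = 0;
}

/* Runs block, reset, unblock and returns the final counter value.  */
static int
model_reset_scenario (void)
{
  model_blocked = 0;
  model_aborted = false;
  model_block_input ();         /* 0 -> 1 */
  model_reset ();               /* 1 -> 0, behind the caller's back */
  model_unblock_input ();       /* 0 -> -1, takes the abort path */
  return model_blocked;
}
```

Calling model_reset_scenario () leaves the model counter at -1 with
model_aborted set, while a plain model_block_input () followed by
model_unblock_input () returns it to 0.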

The above is especially important because the calls to the *select
functions are about the only place in Emacs where several threads can
run in parallel, because they are called by thread_select like this:

  release_global_lock ();
  sa->result = (sa->func) (sa->max_fds, sa->rfds, sa->wfds, sa->efds,
                           sa->timeout, sa->sigmask);
  acquire_global_lock (self);

So between the call to release_global_lock, which allows another
thread to grab the lock, and the subsequent call to
acquire_global_lock several threads could run and more or less
simultaneously call the *select function.  If that function is
xg_select, these threads might step on each other's toes by calling
block_input/unblock_input in parallel.  This could easily cause the
global variable to become negative, which then causes the above abort.
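
One interleaving that produces a negative value can be replayed
deterministically.  The sketch below does not use real threads; it
splits each increment and decrement into a separate load and store,
which is roughly what a non-atomic ++ or -- amounts to, and runs the
steps of two imaginary threads A and B in an unlucky order.  All the
sim_ names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical model: one shared counter, plus a per-thread "register"
   holding the value that thread loaded before it stores back.  */
static int sim_level;

struct sim_thread
{
  int reg;
};

static void
sim_load (struct sim_thread *t)
{
  t->reg = sim_level;
}

static void
sim_store_inc (struct sim_thread *t)
{
  sim_level = t->reg + 1;
}

static void
sim_store_dec (struct sim_thread *t)
{
  sim_level = t->reg - 1;
}

/* Replays one unlucky interleaving of two block/unblock pairs and
   returns the final counter value.  */
static int
sim_lost_update (void)
{
  struct sim_thread a, b;
  sim_level = 0;

  /* Both threads begin their increment before either one finishes.  */
  sim_load (&a);                /* A reads 0 */
  sim_load (&b);                /* B reads 0 */
  sim_store_inc (&a);           /* A writes 1 */
  sim_store_inc (&b);           /* B writes 1: A's increment is lost */
  assert (sim_level == 1);      /* two increments, but only one counted */

  /* The two decrements then run one after the other.  */
  sim_load (&a);
  sim_store_dec (&a);           /* 1 -> 0 */
  sim_load (&b);
  sim_store_dec (&b);           /* 0 -> -1 */

  return sim_level;
}

/* The same four operations with no overlap: the counter ends at 0.  */
static int
sim_serialized (void)
{
  struct sim_thread a, b;
  sim_level = 0;

  sim_load (&a);
  sim_store_inc (&a);           /* 0 -> 1 */
  sim_load (&b);
  sim_store_inc (&b);           /* 1 -> 2 */
  sim_load (&a);
  sim_store_dec (&a);           /* 2 -> 1 */
  sim_load (&b);
  sim_store_dec (&b);           /* 1 -> 0 */

  return sim_level;
}
```

sim_lost_update () ends at -1, whereas sim_serialized () ends at 0.
With real threads the bad ordering is a matter of timing, which would
fit a crash that shows up only some of the time.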

Long story short, could you please try removing the calls to
block_input/unblock_input from xgselect.c, and see if that solves
these crashes?  (These calls were introduced to fix a rare and elusive
bug, but I don't think you will see that bug unless you do what that
bug's recipe calls for.  And anyway, the removal is only meant to show
whether this is indeed the reason for the problem; I'm not suggesting
that we remove them for good.)

Thanks.

