emacs-pretest-bug

Re: emacs aborts (almost always?) when pressing C-g in gnus group mode


From: David Hunter
Subject: Re: emacs aborts (almost always?) when pressing C-g in gnus group mode
Date: Mon, 02 May 2005 03:05:36 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)

Timmy Douglas wrote:
> (I'm not on the list so please CC replies)
>
> When I press C-g in gnus, emacs almost always aborts (pretty much
> crashing). I'm not sure exactly why waiting_for_input is 1, or the
> meaning of it being 1 at this instance.

I've been following your crash debugging efforts, and I think I have a working 
hypothesis.  All of your backtraces look valid starting in send_process.  To 
get to that point, longjmp(send_process_frame) must be called by 
send_process_trap.  It seems that send_process is installing its SIGPIPE 
handler but failing to restore the old handler.  Two suspect execution paths 
exist: (1) emacs_write raises SIGPIPE, and (2) sendto fails with errno == 
EMSGSIZE.  In both cases, emacs uses longjmp to throw to an error handler or 
top-level, so the code after the write that would restore the old handler 
never runs.

When send_process_trap is called due to SIGPIPE, the (process? thread?) signal 
mask is set to block SIGPIPE.  Returning from the signal handler would normally 
unblock the signal, but (according to the GNU/Linux man pages) leaving the 
handler via longjmp does not.  
So, this first SIGPIPE invocation works as expected, but two unwanted side 
effects remain:  SIGPIPE is blocked, and the send_process_trap signal handler 
is still installed.
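
To make the first side effect concrete, here is a minimal sketch (not code from emacs). The names trap_frame, trap, and the two functions are made up for illustration, with trap standing in for send_process_trap. It uses sigsetjmp with a savemask argument of 0 and siglongjmp, so the signal mask is deliberately not saved or restored, which is the behavior the man pages describe for plain setjmp/longjmp on GNU/Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf trap_frame;   /* stands in for send_process_frame */

/* Stands in for send_process_trap: leave the handler with a jump
   instead of returning normally. */
static void trap(int sig)
{
    (void)sig;
    siglongjmp(trap_frame, 1);
}

/* Returns 1 if SIGPIPE is currently in the caller's blocked mask. */
int sigpipe_is_blocked(void)
{
    sigset_t cur;
    sigemptyset(&cur);
    sigprocmask(SIG_BLOCK, NULL, &cur);   /* NULL set: query only */
    return sigismember(&cur, SIGPIPE) == 1;
}

/* Install the trap, take one SIGPIPE, and jump out of the handler.
   Returns 1 if SIGPIPE is left blocked afterwards. */
int jump_out_of_handler_leaves_sigpipe_blocked(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = trap;
    sigemptyset(&sa.sa_mask);

    int rc = sigaction(SIGPIPE, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    if (sigsetjmp(trap_frame, 0) == 0)    /* 0: do not save the mask */
        raise(SIGPIPE);                   /* handler runs and jumps back */

    /* We arrive here from the handler; the kernel blocked SIGPIPE for
       the duration of the handler and nothing has unblocked it. */
    return sigpipe_is_blocked();
}
```

Calling jump_out_of_handler_leaves_sigpipe_blocked() from a single-threaded program should report that SIGPIPE is still blocked, and the trap handler also stays installed because nothing restores the previous one.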

The problem is compounded the second time send_process writes to a disconnected 
file or socket.  Since SIGPIPE is still blocked, send_process_trap is not 
called, and the signal remains pending.  But send_process has still called 
setjmp(send_process_frame), saving a stack context that becomes invalid as 
soon as send_process returns.  Now the scene is set for disaster:  a blocked 
signal is pending which, when unblocked, will longjmp to a bad stack.

The fuse is lit by calling sigfree.  Pressing C-g is guaranteed to do this (see 
quit_throw_to_read_char).  This clears the signal mask, unblocking the pending 
SIGPIPE.  Before sigfree returns, send_process_trap is called, which longjmp's 
to send_process with a bad stack.  This part is proven by all of Timmy's 
backtraces:  The frames before send_process (#0-#n) are valid, but the 
parameters to send_process are garbage, and the frame after send_process (#n+1) 
can't possibly have called it.  If unwind_to_catch is called (say, by Fsignal), 
the Lisp backtrace is also garbage.
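
The delivery-on-unblock step can be reproduced in isolation. The sketch below is illustrative only: the function and handler names are invented, and the handler merely counts invocations rather than jumping into a dead frame, which would be undefined behavior. It relies on the POSIX rule that a pending signal which becomes unblocked is delivered before sigprocmask returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handler_runs = 0;

/* Harmless stand-in for send_process_trap: just count calls. */
static void count_sigpipe(int sig)
{
    (void)sig;
    handler_runs++;
}

/* Returns 1 if a SIGPIPE raised while blocked stays pending and is
   then delivered by the call that unblocks it. */
int pending_sigpipe_fires_on_unblock(void)
{
    struct sigaction sa, old_sa;
    sigset_t pipe_set, old_mask, pending;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_sigpipe;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGPIPE, &sa, &old_sa);
    assert(rc == 0);
    (void)rc;

    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);
    sigprocmask(SIG_BLOCK, &pipe_set, &old_mask);

    handler_runs = 0;
    raise(SIGPIPE);                       /* blocked, so it stays pending */

    sigemptyset(&pending);
    sigpending(&pending);
    int was_pending = sigismember(&pending, SIGPIPE) == 1
                      && handler_runs == 0;

    /* Analogous to what sigfree does for the whole mask: the handler
       runs before this call returns. */
    sigprocmask(SIG_UNBLOCK, &pipe_set, NULL);
    int fired = (handler_runs == 1);

    /* Put the caller's mask and handler back. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGPIPE, &old_sa, NULL);

    return was_pending && fired;
}
```

In the emacs case the handler that fires at this point is send_process_trap, and its jump target is a frame that no longer exists.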

It seems to me that there are two fixes required:

1.  Ensure that all execution paths from send_process restore the old SIGPIPE 
handler, preferably immediately after each write function returns, and after 
send_process_trap exits.  Hint: report_file_error doesn't return!

2.  Unblock SIGPIPE after restoring the old signal handler.
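
A rough sketch of how the two fixes could fit together follows. This is not a patch against process.c: guarded_write, write_trap, write_frame, and demo_broken_pipe are hypothetical names, and a plain write(2) stands in for whatever send_process actually calls. The point is only that the normal path and the trapped path converge on the same cleanup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static sigjmp_buf write_frame;

static void write_trap(int sig)
{
    (void)sig;
    siglongjmp(write_frame, 1);
}

/* Hypothetical helper: write with a temporary SIGPIPE trap.  Returns
   the byte count, or -1 if the write failed or SIGPIPE was trapped. */
ssize_t guarded_write(int fd, const void *buf, size_t len)
{
    struct sigaction trap_sa, old_sa;
    sigset_t pipe_set;
    volatile ssize_t result = -1;

    memset(&trap_sa, 0, sizeof trap_sa);
    trap_sa.sa_handler = write_trap;
    sigemptyset(&trap_sa.sa_mask);
    sigemptyset(&pipe_set);
    sigaddset(&pipe_set, SIGPIPE);

    int rc = sigaction(SIGPIPE, &trap_sa, &old_sa);
    assert(rc == 0);
    (void)rc;

    if (sigsetjmp(write_frame, 0) == 0)
        result = write(fd, buf, len);
    /* else: we came from write_trap and SIGPIPE is still blocked */

    /* Both paths end up here, while this frame is still live. */
    sigaction(SIGPIPE, &old_sa, NULL);           /* fix 1 */
    sigprocmask(SIG_UNBLOCK, &pipe_set, NULL);   /* fix 2 */
    return result;
}

/* Writes to a pipe with no reader.  Returns 1 if the write was
   rejected and the handler and mask are back to their prior state. */
int demo_broken_pipe(void)
{
    int fds[2];
    struct sigaction before, after;
    sigset_t cur;

    if (pipe(fds) != 0)
        return 0;
    close(fds[0]);                               /* no reader left */

    sigaction(SIGPIPE, NULL, &before);
    ssize_t n = guarded_write(fds[1], "x", 1);
    close(fds[1]);
    sigaction(SIGPIPE, NULL, &after);

    sigemptyset(&cur);
    sigprocmask(SIG_BLOCK, NULL, &cur);
    return n == -1
           && sigismember(&cur, SIGPIPE) == 0
           && after.sa_handler == before.sa_handler;
}
```

The explicit unblock follows the order suggested above. An alternative would be to pass a nonzero savemask to sigsetjmp so that siglongjmp restores the saved mask itself. Either way, any error reporting that does not return (report_file_error, for instance) would have to happen only after this cleanup.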

-Dave



