Re: Intermittent unexec failures on Linux >= 2.6.25
From: Jan Djärv
Subject: Re: Intermittent unexec failures on Linux >= 2.6.25
Date: Tue, 21 Oct 2008 08:32:49 +0200
User-agent: Thunderbird 2.0.0.17 (X11/20080925)

I got the 1 MiB value for MAX_HEAP_BSS_DIFF from the kernel source at the time.
I see now that there is no lower limit on the heap gap produced by
randomization. I guess we must exec every time to be sure.
I think it is only heap randomization that unexec has problems with. Other
address randomizations are ok. But we will unconditionally turn off all of
them when we exec and dump.
I have checked in a fix.
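
A minimal sketch of the "exec with randomization turned off" idea, not the code
that was checked in. It assumes Linux with glibc's <sys/personality.h>; the
helper names needs_reexec and reexec_without_aslr are hypothetical. The process
queries its current persona and, if ADDR_NO_RANDOMIZE is not yet set, sets it
and execs itself again, so the second run starts without address randomization.

```c
#include <sys/personality.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the given persona still has address
   randomization enabled (ADDR_NO_RANDOMIZE bit clear), 0 otherwise.  */
int needs_reexec(int persona)
{
    return (persona & ADDR_NO_RANDOMIZE) == 0;
}

/* Hypothetical helper: re-exec argv[0] with randomization disabled.
   Returns 0 if randomization is already off (we are the second run),
   -1 on failure.  On success execvp does not return.  */
int reexec_without_aslr(char **argv)
{
    /* Passing 0xffffffff queries the persona without changing it.  */
    int persona = personality(0xffffffff);
    if (persona == -1)
        return -1;

    if (!needs_reexec(persona))
        return 0;               /* already running without randomization */

    if (personality((unsigned long) persona | ADDR_NO_RANDOMIZE) == -1)
        return -1;

    /* The persona is inherited across exec, so the new image sees the
       ADDR_NO_RANDOMIZE bit and takes the early-return path above.  */
    execvp(argv[0], argv);

    /* Only reached if exec failed: restore the old persona.  */
    personality((unsigned long) persona);
    return -1;
}
```

A caller would invoke reexec_without_aslr(argv) early in main, before dumping,
and continue normally when it returns 0. Doing this unconditionally avoids
relying on any threshold for the heap gap.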
Jan D.
Chong Yidong wrote:
> Hi Jan,
>
>>> Building of Emacs 22.2.92 (also 22.2) on Linux 2.6.25 (or later)
>>> sometimes fails with a segmentation fault in dump-emacs / unexec.
>>>
>>> This was reported by Jan Hrabe as Gentoo bug 236579,
>>> <http://bugs.gentoo.org/236579>.
>>>
>>> I've investigated and found that indeed temacs fails in dump-emacs
>>> intermittently. For my test, I have run "make; rm src/emacs" 250 times
>>> in a loop, and in 3 cases a segmentation fault of temacs occurred.
>>>
>>> The problem seems to be that heap_bss_diff is too large for unexec
>>> to succeed (due to kernel heap randomisation, see
>>> <http://lkml.org/lkml/2007/10/23/435>).
>>>
>>> On the other hand, it is (in case of the 3 failures) not large enough
>>> to fulfill the condition (heap_bss_diff > MAX_HEAP_BSS_DIFF) which
>>> would trigger the correct behaviour, namely setting the personality
>>> and calling execve of itself.
>
> Do you remember the rationale for setting
>
> #define MAX_HEAP_BSS_DIFF (1024*1024)
>
> in emacs.c? This variable was introduced by you on 2004-10-20, and I'm
> not too familiar with this part of the code.
>
>>> In the 247 successful cases, heap_bss_diff first had a large value
>>> (up to about 32 MiB), and in the exec'd temacs its value was constant,
>>> namely 1887 bytes.
>>>
>>> The 3 failures had heap_bss_diff = 575327, 911199, and 268127, which
>>> are all smaller than MAX_HEAP_BSS_DIFF (1024*1024), so execvp was
>>> _not_ called.
>>>
>>> Where does that value of MAX_HEAP_BSS_DIFF = 1 MiB come from? Could it
>>> be decreased, or could temacs execve itself unconditionally on Linux?
>>> In my opinion, a failure rate of about 1 % is too high.
>>>
>>> (The problem doesn't exist for Linux 2.6.24, or if heap randomisation
>>> is turned off, i.e. with /proc/sys/kernel/randomize_va_space < 2.)