gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] Fix libX11 early init causing mmap layout changed


From: Jinyang He
Subject: Re: [PATCH 1/2] Fix libX11 early init causing mmap layout changed
Date: Fri, 12 Jul 2024 16:16:04 +0800
User-agent: Mozilla/5.0 (X11; Linux loongarch64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 2024-07-12 00:11, Camm Maguire wrote:

Greetings, and thank you so much for your explanation!  Please correct
me if any of the following are in error, as I am having trouble locating
an arch with this sequence of calls to test.  (amd64 does not reexec,
i386 does not call malloc before main).
Enmmm, it is need libX11 1.8+ with "--enable-thread-safety-constructor".
In Linux LoongArch machine, the "readelf -d raw_pre_gcl | grep NEEDED"
shows, ">>>libX11.so.6<<<, libopcodes-2.42.so, libm.so.6, libgmp.so.10,"
 "libtirpc.so.3, libreadline.so.8, libc.so.6" are needed. While I saw
another Debian12 amd64 machine, "libm.so.6, libgmp.so.10, libtirpc.so.3,"
"libc.so.6" are needed. Why amd64 not need libX11...? (enable_xgcl? My
machine seems cannot use "--enable-xgcl=yes")
(I observed that libX11 init_array calls XInitThreads on it.)
1) The primary problem is the need for a persistent load address of
libboot.so across raw_pre and saved_pre images.  (This requirement
itself is a weakness in the bootstrap process in the master branch and
should be addressed directly.)
yes
2) The problem only exists on arches which must call execve to reset the
personality to avoid sbrk address randomization, and which call malloc
before main. (Presumably lacking the former, the layout is already
selected before we even get to malloc before main, and lacking the
latter, we have not set the rlimit_stack by the time we get to execve.)
yes
If this is right, my remaining mystery is why the raw_ and saved_ images
differ in the layout selection.  My understanding was:

a) Both either call malloc before main or do not.
The malloc path,
  "malloc->malloc_internal->
    gcl_alloc_initialized ? gcl_init_alloc->set rlimit_stack
                          : not set rlimit_stack
The first time raw_ set rlimit_stack and set gcl_alloc_initialized = true.
Then saved_ not set rlimit_stack before reexec although called malloc
because the gcl_alloc_initialized=true and we skip gcl_init_alloc.
b) Both either reexec to set the personality or do not

therefore the layout should be the same by the time we get to
init_boot().  It seems clean that either a) or b) are in error.  Do you
know which?

It does seem that the simplest solution is to simply leave the stack
limits alone when calling gcl_init_alloc before main.  This will work,
right?
I think this works, but may it break something and where set stack
rlimit after main (or not set stack rlimit anymore)? I'm unfamiliar
with gcl, so I worry about something broken.
This is obviously exceedingly fragile and wrong headed for us to try to
keep up with linux kernel mmap layout algorithms.  As I said before the
underlying issue is we have a bootstrap problem.  The idea is to have
some small set of code implementing some core lisp functions in C which
are later provided in lisp.  We want to jettison the memory used at the
right time, so dlopen comes to mind.
I think libX11.so can also use dlopen after main. It also seems to
solve this problem, but not clear for me.
  But the right way to do this is to
hardlink this library in for the pre_ images only, and arrange all
others not to need it at all.  This appears to call for a delicate
reordering of the image initialization process and is involved.  In the
meantime, if you agree, we will just skip rlimit_stack in gcl_init_alloc
before main.
That works well on LoongArch. I think it is necessary to test xgcl
on amd64 with libX11.

Thanks for your patience and detailed reply.
Jinyang

Take care,


Jinyang He <hejinyang@loongson.cn> writes:

On 2024-07-11 09:03, Camm Maguire wrote:
Greetings!  In followup to my earlier query, can you please detail what
happens when resource limits are set before execve as you mention in
your comment?
For clarity, I'll briefly describe the mmap mechanism on Linux. It provides
two memory layouts usually, "arch_get_unmapped_{area, area_topdown}".
When user mode does mmap, it calls, (source based 6.10.0-rc6)
do_mmap->__get_unmapped_area->mm_get_unmapped_area_vmflags->arch_get_unmapped_{area,
area_topdown}
It choose topdown or other by MMF_TOPDOWN bit.
When user mode does execve, it calls,
...->load_elf_binary->setup_new_exec->arch_pick_mmap_layout->...
The arch_pick_mmap_layout clear or set MMF_TOPDOWN bit by "mmap_is_legacy".
The CONFIG_STACK_GROWSUP is always false except parisc. Thus, if the
"rlim_stack->rlim_cur == RLIM_INFINITY" it always returns true and clear
the MMF_TOPDOWN bit. Once the MMF_TOPDOWN bit is set in do_execve, we
have little chance to change it in user mode.

The above details means, once the "rlim_stack->rlim_cur" set to -1ull before
execve. The new exec mmap layout is not topdown.
The quickly check cmd,
  "$ cat /proc/self/maps; ulimit -s unlimited; cat /proc/self/maps"

In another story, I'll give the interaction between gcl and libX11.
The call trace is follows, (libX11 with 1.8+
--enable-thread-safety-constructor)
#0  gcl_init_alloc (cs_start=0x7ffffffef9a8) at alloc.c:1203
#1  0x0000000000153df4 in malloc_internal (size=40) at alloc.c:1665
#2  0x0000000000153ed8 in malloc (size=40) at alloc.c:1682
#3  0x00007ffff7e63474 in XInitThreads () at
  /usr/src/debug/libx11/libX11-1.8.7/src/locking.c:614
#4  0x00007ffff7fbdc84 in call_init (l=<optimized out>, argc=1,
  argv=0x7ffffffefa98, env=0x7ffffffefaa8) at dl-init.c:74
#5  call_init (l=<optimized out>, argc=<optimized out>,
  argv=0x7ffffffefa98, env=0x7ffffffefaa8) at dl-init.c:26
#6  0x00007ffff7fbddac in _dl_init (main_map=0x7ffff7fed360, argc=1,
  argv=0x7ffffffefa98, env=0x7ffffffefaa8) at dl-init.c:121
#7  0x00007ffff7fd1bb8 in _start () from
  /lib64/ld-linux-loongarch-lp64d.so.1
The gcl depends on libX11.so so when load libX11.so, it calls XInitThreads.

In the past, the order of call gcl_init_alloc is
   main->execve->main->gcl_init_alloc(set rlimit)
Now the order is
   XInitThreads(gcl_init_alloc(set rlimit))->main->execve->main...
In short, the glibc::malloc is preempted by gcl::malloc and causing
earlier set stack rlimit before doing execve.

When make raw_pre_gcl to saved_pre_gcl, the raw_pre_gcl calls
gcl_init_alloc due to "gcl_alloc_initialized" is false and then
the mmap memory layout changed after execve. When exec saved_pre_gcl,
the "gcl_alloc_initialized" is true so mmap memory layout is not changed.
The "libboot.so" has been mapped to diffrent virtuall address,
finally the function call is broken.


   Isn't this still the case with this patch?
This patch restore the stack limits to keep the same stack limits after
execve itself. It is some like we should keep the same environments except
setting "GCL_UNRANDOMIZE" and setting "ADDR_NO_RANDOMIZE". However, the
stack limits has been changed before execve, it changes the environments.
   Rather the
patch appears to intend to reassert this setting immediately before
execve.

Take care,

Greetings, and thanks so much again for these patches!
Can you please give me a bit of explanation on this one?  What alters
the resource limits between gcl_init_alloc 'before main' and main
itself?
The raw_pre_gcl calling gcl_init_alloc but saved_pre_gcl not calling it
when do initcall(XInitThreads), it happens before do execve.
   Would this patch work just as well if the saved resource limits
were restored in the second invocation of gcl_init_alloc from main
instead of immediately prior to execve in unrandomize.h?
Badly, the mmap memory layout is determined at execve.

Thanks,
Jinyang
Take care,
Jinyang He <hejinyang@loongson.cn> writes:

After libX.so release 1.8 with `--enable-thread-safety-constructor`,
the libX.so calls `malloc` indirectly by calling XInitThreads() in
init_array. The raw_ preempts `malloc` in EXE and calls gcl_init_alloc()
before main. It set rlimit_stack before execve itself, and in linux
kernel this action changes the mmap way from topdown to downtop
(in many archs). Thus, saved the rlimit_stack if before_main and
restore it if it need re-execve itself.
---
   gcl/h/unrandomize.h | 16 ++++++++++++++++
   gcl/o/alloc.c       | 10 ++++++++++
   gcl/o/main.c        |  2 ++
   3 files changed, 28 insertions(+)

diff --git a/gcl/h/unrandomize.h b/gcl/h/unrandomize.h
index 04b7c507e..9ca66d0a9 100644
--- a/gcl/h/unrandomize.h
+++ b/gcl/h/unrandomize.h
@@ -5,6 +5,9 @@
   #include <string.h>
   #include <alloca.h>
   #include <errno.h>
+#if defined(__linux__) && defined(RLIMIT_STACK)
+#include <sys/resource.h>
+#endif
     {
     errno=0;
@@ -53,6 +56,19 @@
        errno=0;
   #ifdef HAVE_GCL_CLEANUP
        gcl_cleanup(0);
+#endif
+#if defined(__linux__) && defined(RLIMIT_STACK)
+       {
+         extern struct rlimit rl_stack_saved;
+         /* Reset the rlim_cur incase*/
+         if (rl_stack_saved.rlim_cur != 0 && rl_stack_saved.rlim_max != 0)
+           if(setrlimit(RLIMIT_STACK, &rl_stack_saved)) {
+             printf("restore rlimit_stack failure %d\n",errno);
+             exit(-1);
+           }
+         rl_stack_saved = (struct rlimit){0, 0};
+         errno=0;
+       }
   #endif
        execve(*a,a,n);
        printf("execve failure %d\n",errno);
diff --git a/gcl/o/alloc.c b/gcl/o/alloc.c
index 8620df495..4d29ec860 100644
--- a/gcl/o/alloc.c
+++ b/gcl/o/alloc.c
@@ -1189,6 +1189,10 @@ init_tm(enum type t, char *name, int elsize, int nelts, 
int sgc,int distinct) {
      simplicity.  set_maxpage is overloaded, and the positioning of its
      call is too fragile.  20050115 CM*/
   static int gcl_alloc_initialized;
+#if defined(__linux__) && defined(RLIMIT_STACK)
+extern int before_main;
+struct rlimit rl_stack_saved;
+#endif
     object malloc_list=Cnil;
   @@ -1238,6 +1242,12 @@ gcl_init_alloc(void *cs_start) {
       }
         massert(!getrlimit(RLIMIT_STACK, &rl));
+#ifdef __linux__
+    if (before_main)
+      rl_stack_saved = rl;
+    else
+      rl_stack_saved = (struct rlimit){0, 0};
+#endif
       if (rl.rlim_cur!=RLIM_INFINITY && (rl.rlim_max == RLIM_INFINITY || 
rl.rlim_max > rl.rlim_cur)) {
         rl.rlim_cur = rl.rlim_max; /* == RLIM_INFINITY ? rl.rlim_max : 
rl.rlim_max/64; */
         massert(!setrlimit(RLIMIT_STACK,&rl));
diff --git a/gcl/o/main.c b/gcl/o/main.c
index 6621c3a16..be241d1af 100644
--- a/gcl/o/main.c
+++ b/gcl/o/main.c
@@ -574,8 +574,10 @@ 
DEFUN("KCL-SELF",object,fSkcl_self,SI,0,0,NONE,OO,OO,OO,OO,(void),"") {
     }
   +int before_main=1;
   int
   main(int argc, char **argv, char **envp) {
+  before_main=0;
       GET_FULL_PATH_SELF(kcl_self);
     *argv=kcl_self;







reply via email to

[Prev in Thread] Current Thread [Next in Thread]