Re: [RFC PATCH v3 5/6] hurd: Make it possible to call memcpy very early
From: Samuel Thibault
Subject: Re: [RFC PATCH v3 5/6] hurd: Make it possible to call memcpy very early
Date: Mon, 1 May 2023 01:21:32 +0200
User-agent: NeoMutt/20170609 (1.8.3)
Applied, thanks!
Sergey Bugaev, on Sat, 29 Apr 2023 23:18:21 +0300, wrote:
> Normally, in static builds, the first code that runs is _start, in e.g.
> sysdeps/x86_64/start.S, which quickly calls __libc_start_main, passing
> it the argv etc. Among the first things __libc_start_main does is
> initialize the tunables (based on env), then the CPU features, and then
> call _dl_relocate_static_pie (). Specifically, this runs ifunc
> resolvers to pick, based on the CPU features discovered earlier, the
> most suitable implementation of "string" functions such as memcpy.
>
> Before that point, calling memcpy (or other ifunc-resolved functions)
> will not work.
>
> In the Hurd port, things are more complex. In order to get argv/env for
> our process, glibc normally needs to do an RPC to the exec server,
> unless our args/env are already located on the stack (which is what
> happens to bootstrap processes spawned by GNU Mach). Fetching our
> argv/env from the exec server has to be done before the call to
> __libc_start_main, since we need to know what our argv/env are to pass
> them to __libc_start_main.
>
> On the other hand, the implementation of the RPC (and other initial
> setup needed on the Hurd before __libc_start_main can be run) is not
> trivial. In particular, it may (and on x86_64, will) use memcpy.
> But as described above, calling memcpy before __libc_start_main cannot
> work, since the GOT entry for it is not yet initialized at that point.
>
> Work around this by pre-filling the GOT entry with the baseline version
> of memcpy, __memcpy_sse2_unaligned. This makes it possible for early
> calls to memcpy to just work. The initial value of the GOT entry is
> unused on x86_64, and changing it won't interfere with the relocation
> being performed later: once _dl_relocate_static_pie () is called, the
> baseline version will get replaced with the most suitable one, and that
> is what subsequent calls of memcpy are going to call.
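
The pre-fill described above can be modelled in portable C. This is only a sketch under assumptions, not the glibc code: copy_slot stands in for memcpy's GOT entry, copy_baseline for __memcpy_sse2_unaligned, copy_tuned for whatever the ifunc resolver would select, and prefill_slot / relocate_slot are hypothetical names for the two points in startup where the slot is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn) (void *, const void *, size_t);

/* Stand-in for the baseline routine (hypothetical; the patch uses
   __memcpy_sse2_unaligned).  */
static void *
copy_baseline (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n-- > 0)
    *d++ = *s++;
  return dst;
}

/* Stand-in for the implementation a resolver would pick later
   (hypothetical).  */
static void *
copy_tuned (void *dst, const void *src, size_t n)
{
  return memmove (dst, src, n);
}

/* Plays the role of memcpy's GOT entry.  NULL models "not yet
   initialized"; this is an assumption of the sketch.  */
copy_fn copy_slot = NULL;

/* Every caller goes through the slot, as a call via the GOT would.  */
void *
copy (void *dst, const void *src, size_t n)
{
  assert (copy_slot != NULL);
  return copy_slot (dst, src, n);
}

/* Models the two instructions added at the top of _start.  */
void
prefill_slot (void)
{
  copy_slot = copy_baseline;
}

/* Models the later relocation step overwriting the slot.  */
void
relocate_slot (void)
{
  copy_slot = copy_tuned;
}
```

Calling prefill_slot () first makes copy () usable during early setup; after relocate_slot () the same call site reaches the tuned routine without any change to the callers.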
>
> Checked on x86_64-gnu.
>
> Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
> ---
> Changes since v1:
> - drop the stpncpy, since it's apparently not required during early
> startup;
> - as a result of the above, there are no longer any changes to the
> i386 version;
> - drop the PIC/non-PIC split, we can always use %rip-relative addressing
> on x86_64;
> - as mentioned somewhere in the v1 thread, I have, since posting the v1,
> actually gone and checked that the relocations do work and the proper,
> more efficient memcpy version does get installed into the GOT slot and
> invoked whenever anything calls memcpy;
> - convinced myself that this is not a terrible hack but rather an OK
> solution;
> - worked out how this would be done on an architecture that (like i386,
> unlike x86_64) does need the original value in the GOT to perform the
> relocation, but (unlike i386, like x86_64) still uses an ifunc-selected
> memcpy in static builds: namely, we'd simply put the original ifunc
> address back into the GOT slot a few lines below, after the call to
> _hurd_stack_setup.
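
The last changelog item (an architecture where the relocation needs the slot's original contents) can be sketched in portable C as a save, override, restore sequence. Everything here is an illustrative assumption: copy_slot models the GOT entry, which on such an architecture would initially hold the ifunc resolver's address; early_setup stands in for _hurd_stack_setup; relocate stands in for the relocation step that reads the slot to find the resolver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn) (void *, const void *, size_t);
typedef copy_fn (*resolver_fn) (void);
typedef void (*slot_value) (void);	/* Generic function pointer.  */

/* Hypothetical baseline implementation.  */
static void *
copy_baseline (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n-- > 0)
    *d++ = *s++;
  return dst;
}

/* Hypothetical implementation chosen by the resolver.  */
static void *
copy_tuned (void *dst, const void *src, size_t n)
{
  return memmove (dst, src, n);
}

/* Hypothetical resolver; a real one would inspect CPU features.  */
static copy_fn
copy_resolver (void)
{
  return copy_tuned;
}

/* Models a GOT slot whose initial value is the resolver address and
   is needed later to perform the relocation.  */
slot_value copy_slot = (slot_value) copy_resolver;

/* Stand-in for early setup that needs a working copy routine.
   OUT must have room for 6 bytes.  */
static void
early_setup (char *out)
{
  ((copy_fn) copy_slot) (out, "early", sizeof "early");
}

/* Stand-in for the relocation: read the resolver from the slot,
   call it, and store the result back.  */
static void
relocate (void)
{
  resolver_fn resolver = (resolver_fn) copy_slot;
  copy_slot = (slot_value) resolver ();
}

void
start (char *out)
{
  slot_value saved = copy_slot;			/* Remember the original.  */
  copy_slot = (slot_value) copy_baseline;	/* Make early calls work.  */
  early_setup (out);
  copy_slot = saved;				/* Put the original back.  */
  relocate ();
}
```

After start () returns, the slot holds the resolver's choice, exactly as if the temporary override had never happened.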
>
> sysdeps/mach/hurd/x86_64/static-start.S | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/sysdeps/mach/hurd/x86_64/static-start.S b/sysdeps/mach/hurd/x86_64/static-start.S
> index 982d3d52..cc8e2410 100644
> --- a/sysdeps/mach/hurd/x86_64/static-start.S
> +++ b/sysdeps/mach/hurd/x86_64/static-start.S
> @@ -19,6 +19,9 @@
> .text
> .globl _start
> _start:
> +
> + leaq __memcpy_sse2_unaligned(%rip), %rax
> + movq %rax, memcpy@GOTPCREL(%rip)
> call _hurd_stack_setup
> xorq %rdx, %rdx
> jmp _start1
> --
> 2.40.1
>
--
Samuel
---
For an independent, transparent and rigorous evaluation!
I support Inria's Commission d'Évaluation.