bug-hurd
Re: [VERY RFC PATCH 2/2] hurd: Make it possible to call memcpy very early


From: Adhemerval Zanella Netto
Subject: Re: [VERY RFC PATCH 2/2] hurd: Make it possible to call memcpy very early
Date: Thu, 20 Apr 2023 17:38:01 -0300
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.10.0


On 20/04/23 17:25, H.J. Lu via Libc-alpha wrote:
> On Thu, Apr 20, 2023 at 11:43 AM Sergey Bugaev <bugaevc@gmail.com> wrote:
>>
>> Normally, in static builds, the first code that runs is _start, in e.g.
>> sysdeps/x86_64/start.S, which quickly calls __libc_start_main, passing
>> it the argv etc. Among the first things __libc_start_main does is
>> initializing the tunables (based on env), then CPU features, and then
>> calls _dl_relocate_static_pie (). Specifically, this runs ifunc
>> resolvers to pick, based on the CPU features discovered earlier, the
>> most suitable implementation of "string" functions such as memcpy.
>>
>> Before that point, calling memcpy (or other ifunc-resolved functions)
>> will not work.
>>
>> In the Hurd port, things are more complex. In order to get argv/env for
>> our process, glibc normally needs to do an RPC to the exec server,
>> unless our args/env are already located on the stack (which is what
>> happens to bootstrap processes spawned by GNU Mach). Fetching our
>> argv/env from the exec server has to be done before the call to
>> __libc_start_main, since we need to know what our argv/env are to pass
>> them to __libc_start_main.
>>
>> On the other hand, the implementation of the RPC (and other initial
>> setup needed on the Hurd before __libc_start_main can be run) is not
>> trivial. In particular, it may (and on x86_64, will) use memcpy.
>> But as described above, calling memcpy before __libc_start_main cannot
>> work, since the GOT entry for it is not yet initialized at that point.
>>
>> Work around this by pre-filling the GOT entry with the baseline version
>> of memcpy, __memcpy_sse2_unaligned. This makes it possible for early
>> calls to memcpy to just work. Once _dl_relocate_static_pie () is called,
>> the baseline version will get replaced with the most suitable one, and
>> that's what subsequent calls of memcpy are going to call.
>>
>> Also, apply the same treatment to __stpncpy, which can also be used by
>> the RPCs (see mig_strncpy.c), and is an ifunc-resolved function on both
>> x86_64 and i386.
>>
>> Tested on x86_64-gnu (!).
>>
>> Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
>> ---
>>
>> Please tell me:
>>
>> * if the approach is at all sane
>> * if there's a better way to do this without hardcoding
>>   "__memcpy_sse2_unaligned"
>> * are the GOT entries for indirect functions supposed to be statically
>>   initialized to anything (in the binary)? if yes, why? if not, why is
>>   it PROGBITS and not NOBITS?
>> * am I doing all this _GLOBAL_OFFSET_TABLE_, @GOT, @GOTOFF, @GOTPCREL
>>   correctly?
>> * should there be a !PIC version as well? does the GOT exist under
>>   !PIC (to access indirect functions), and if it does then how do I
>>   access it? it would seem gcc just generates a direct $function even
>>   for indirect functions in this case.
>>
>>  sysdeps/mach/hurd/i386/static-start.S   | 7 +++++++
>>  sysdeps/mach/hurd/x86_64/static-start.S | 8 ++++++++
>>  2 files changed, 15 insertions(+)
>>
>> diff --git a/sysdeps/mach/hurd/i386/static-start.S b/sysdeps/mach/hurd/i386/static-start.S
>> index c5d12645..1b1ae559 100644
>> --- a/sysdeps/mach/hurd/i386/static-start.S
>> +++ b/sysdeps/mach/hurd/i386/static-start.S
>> @@ -19,6 +19,13 @@
>>         .text
>>         .globl _start
>>  _start:
>> +#ifdef PIC
>> +       call __x86.get_pc_thunk.bx
>> +       addl $_GLOBAL_OFFSET_TABLE_, %ebx
>> +       leal __stpncpy_ia32@GOTOFF(%ebx), %eax
>> +       movl %eax, __stpncpy@GOT(%ebx)
>> +#endif
>> +
>>         call _hurd_stack_setup
>>         xorl %edx, %edx
>>         jmp _start1
>> diff --git a/sysdeps/mach/hurd/x86_64/static-start.S b/sysdeps/mach/hurd/x86_64/static-start.S
>> index 982d3d52..81b3c0ac 100644
>> --- a/sysdeps/mach/hurd/x86_64/static-start.S
>> +++ b/sysdeps/mach/hurd/x86_64/static-start.S
>> @@ -19,6 +19,14 @@
>>         .text
>>         .globl _start
>>  _start:
>> +
>> +#ifdef PIC
>> +       leaq __memcpy_sse2_unaligned(%rip), %rax
>> +       movq %rax, memcpy@GOTPCREL(%rip)
>> +       leaq __stpncpy_sse2_unaligned(%rip), %rax
>> +       movq %rax, __stpncpy@GOTPCREL(%rip)
>> +#endif
>> +
>>         call _hurd_stack_setup
>>         xorq %rdx, %rdx
>>         jmp _start1
>> --
>> 2.40.0
>>
> 
> Doesn't it disable IFUNC for memcpy and stpncpy?
> 


Can't you use a strategy similar to the one in commit
5355f9ca7b10183ce06e8a18003ba30f43774858?
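
For readers following the IFUNC question above, here is a rough C model of
what the quoted patch description claims happens: the slot is pre-filled with
a baseline routine so early calls work, and the relocation step later
overwrites it with whatever the resolver picks. This is only a sketch under
that assumption, not glibc code. The names copy_slot, copy_baseline,
copy_optimized, resolve_copy, relocate and slot_copy are all hypothetical, and
a plain function pointer stands in for the real GOT entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature shared by every stand-in memcpy implementation.  */
typedef void *(*copy_fn) (void *, const void *, size_t);

/* Hypothetical baseline routine: a plain byte loop that needs no
   CPU feature detection, so it is safe to call at any time.  */
void *
copy_baseline (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Hypothetical "best" routine; here it simply defers to memcpy.  */
void *
copy_optimized (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Models the GOT entry.  Pre-filling it with the baseline routine
   is the part that corresponds to the patch.  */
copy_fn copy_slot = copy_baseline;

/* Models an ifunc resolver: picks an implementation from a
   (hypothetical) CPU feature flag.  */
copy_fn
resolve_copy (int cpu_has_fast_copy)
{
  return cpu_has_fast_copy ? copy_optimized : copy_baseline;
}

/* Models the relocation step: the slot is overwritten with the
   resolver's choice, replacing the pre-filled value.  */
void
relocate (int cpu_has_fast_copy)
{
  copy_slot = resolve_copy (cpu_has_fast_copy);
}

/* Every caller goes through the slot, before and after relocation.  */
void *
slot_copy (void *dst, const void *src, size_t n)
{
  assert (copy_slot != NULL);
  return copy_slot (dst, src, n);
}
```

In this model a call to slot_copy made before relocate () reaches
copy_baseline, and a call made after relocate (1) reaches copy_optimized, so
the pre-filled value only matters during early startup. Whether the real
relocation code rewrites the entry in the same way is exactly what the
question above is asking.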


