[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] Add io space MMAP for memory mapping devices and files
|
From: |
Jose E. Marchesi |
|
Subject: |
Re: [PATCH v2] Add io space MMAP for memory mapping devices and files |
|
Date: |
Mon, 08 Jan 2024 19:55:52 +0100 |
|
User-agent: |
Gnus/5.13 (Gnus v5.13) |
> Hi Jose,
>
> "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 18:06:
>>
>> > Hi Jose,
>> >
>> > "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 15:08:
>> >>
>> >> > Hi Jose,
>> >> >
>> >> > "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 00:03:
>> >> >>
>> >> >> > Hi again,
>> >> >> >
>> >> >> > Andreas Klinger <ak@it-klinger.de> schrieb am Sa, 06. Jan 20:42:
>> >> >> >> Hi Jose,
>> >> >> >>
>> >> >> >> "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Sa, 06. Jan 20:14:
>> >> >> >> > > I think that the approach of patch v2 could be optimized by a
>> >> >> >> > > .set alignment
>> >> >> >> > > variable for those guys who are aware of such hardware
>> >> >> >> > > constraints on their
>> >> >> >> > > architecture and want to optimize this. But I would set the
>> >> >> >> > > default to a value
>> >> >> >> > > which works for all, which is the size of the address bus.
>> >> >> >> >
>> >> >> >> > I think it is perfectly ok to have IOD specific configuration
>> >> >> >> > parameters, sure.
>> >> >> >> >
>> >> >> >> > Let's make it a Poke variable that the user can customize, plus a
>> >> >> >> > setting in the registry so the .set dot-command works on it. I
>> >> >> >> > suppose
>> >> >> >> > it shall have the name mmap in it, like pk_mmap_alignment. WDYT?
>> >> >> >>
>> >> >> >> great. It makes sense to name it mmap as it only applies to mmap io
>> >> >> >> spaces.
>> >> >> > [...]
>> >> >> >
>> >> >> > After thinking about the problem again I came to a different
>> >> >> > solution which is
>> >> >> > much simpler, doesn't need an additional .set variable and avoids
>> >> >> > the signal
>> >> >> > handler as well as it is copying with the address bus size if
>> >> >> > possible, thus
>> >> >> > avoiding performance bottlenecks. The read function is as follows:
>> >> >> >
>> >> >> > static int
>> >> >> > ios_dev_mmap_pread (void *iod, void *buf, size_t count, ios_dev_off
>> >> >> > offset)
>> >> >> > {
>> >> >> > struct ios_dev_mmap *dev_map = iod;
>> >> >> > int align = sizeof(void*);
>> >> >> > char *m = buf;
>> >> >> >
>> >> >> > if (offset > dev_map->size || count > dev_map->size - offset)
>> >> >> > return IOD_EOF;
>> >> >> >
>> >> >> > /* copy unaligned bytes */
>> >> >> > while (count && offset % align)
>> >> >> > {
>> >> >> > memcpy (m, dev_map->addr + offset, 1);
>> >> >> > count--;
>> >> >> > offset++;
>> >> >> > m++;
>> >> >> > }
>> >> >> >
>> >> >> > /* copy with the address bus size */
>> >> >> > while (count >= align)
>> >> >> > {
>> >> >> > memcpy (m, dev_map->addr + offset, align);
>> >> >>
>> >> >> This is still copying a byte at a time, isn't it?
>> >> >
>> >> > There is no general answer to this question. It depends on what the C
>> >> > library
>> >> > and the compiler are making out of it. Thus it's also dependent on the
>> >> > architecture and the value of align.
>> >>
>> >> Actually, memcpy is guaranteed to work regardless of any alignment
>> >> requirement restrictions imposed by the architecture targetted by the
>> >> compiler, so the general answer is that you can always assume that
>> >> memcpy is equivalent to copying bytes by bytes.
>> >>
>> >> The compiler will only optimize a memcpy (foo, bar, 8) into a long word
>> >> move if it knows that foo and bar are properly aligned according to the
>> >> rules of the target.
>> >
>> > You're right.
>> >
>> > But I'm still wondering why I get the SIGBUS (on zynqmp) when doing a
>> > memcpy of
>> > 2 bytes from an odd address which is mmaped, like:
>> >
>> > memcpy(some-variable, odd-address, 2);
>>
>> I have no idea. I guess that's up to the driver providing the mmapped
>> contents? Is the driver documenting the rules on how to access the
>> contents?
>
> I wrote the driver and I'm doing a usual memory mapping.
Ok then it would be very strange to get SIGBUS when accessing a single
byte.
>
> I wrote a function which demonstrates the difference if memcpy is called with
> a
> variable a (for alignment) or with a fixed value:
>
> #include <string.h>
>
> int mycopy(void)
> {
> int a = 4;
> unsigned int x;
> unsigned char y[4];
>
> memcpy(&x, &y[0], 4);
>
> memcpy(&x, &y[0], a);
>
> return 0;
> }
>
> In the objdump we can see that when called with the fixed value of 4 bytes to
> be
> copied there are just 2 instructions and the function memcpy isn't called at
> all. So the compiler resolves the memcpy without the function call.
That is to be expected, since the compiler knows the addresses of both x
and y automatics, and therefore their alignment.
> I think this could really be optimized in the code snipped I provided earlier
> to
> avoid byte by byte copying.
In the code snippet earlier the address of the buffer where the stuff is
copied is passed as an argument to th function, so I don't see how the
compiler could optimize it that way.
Have you tried to disassemble the pread function in the mmap IOD to see
what the calls to memcpy are compiled to?
>
> Here is the objdump:
>
> mycopy.o: file format elf32-littlearm
>
> Disassembly of section .text:
>
> 00000000 <mycopy>:
> #include <string.h>
>
> int mycopy(void)
> {
> 0: e92d4800 push {fp, lr}
> 4: e28db004 add fp, sp, #4
> 8: e24dd010 sub sp, sp, #16
> c: e59f2070 ldr r2, [pc, #112] ; 84 <mycopy+0x84>
> 10: e08f2002 add r2, pc, r2
> 14: e59f306c ldr r3, [pc, #108] ; 88 <mycopy+0x88>
> 18: e7923003 ldr r3, [r2, r3]
> 1c: e5933000 ldr r3, [r3]
> 20: e50b3008 str r3, [fp, #-8]
> 24: e3a03000 mov r3, #0
> int a = 4;
> 28: e3a03004 mov r3, #4
> 2c: e50b3010 str r3, [fp, #-16]
> 30: e51b300c ldr r3, [fp, #-12]
> unsigned int x;
> unsigned char y[4];
>
> memcpy(&x, &y[0], 4);
> 34: e50b3014 str r3, [fp, #-20] ; 0xffffffec
>
> memcpy(&x, &y[0], a);
> 38: e51b2010 ldr r2, [fp, #-16]
> 3c: e24b100c sub r1, fp, #12
> 40: e24b3014 sub r3, fp, #20
> 44: e1a00003 mov r0, r3
> 48: ebfffffe bl 0 <memcpy>
>
> return 0;
> 4c: e3a03000 mov r3, #0
> [...]
>
>> >> > But yes I need to confess that when looking into the glibc for example
>> >> > this code
>> >> > snippet could be optimized to allow larger amounts of data to be passed
>> >> > to the
>> >> > memcpy function so that it's up to the C library and compiler to
>> >> > optimize it the
>> >> > best.
>> >> >
>> >> >
>> >> >>
>> >> >> > count -= align;
>> >> >> > offset += align;
>> >> >> > m += align;
>> >> >> > }
>> >> >> >
>> >> >> > /* copy remaining unaligned bytes */
>> >> >> > while (count)
>> >> >> > {
>> >> >> > memcpy (m, dev_map->addr + offset, 1);
>> >> >> > count--;
>> >> >> > offset++;
>> >> >> > m++;
>> >> >> > }
>> >> >> >
>> >> >> > return IOD_OK;
>> >> >> > }
>> >> >> >
>> >> >> > The write function is quite similar implemented in the same manner.
>> >> >> >
>> >> >> > WDYT?
>> >> >> >
>> >> >> > Andreas
>> >> >
>> >> > Best regards,
>> >> >
>> >> > Andreas
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, (continued)
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/07
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files,
Jose E. Marchesi <=
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Mohammad-Reza Nabipoor, 2024/01/07
- Re: Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/08