[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] Add io space MMAP for memory mapping devices and files
|
From: |
Andreas Klinger |
|
Subject: |
Re: [PATCH v2] Add io space MMAP for memory mapping devices and files |
|
Date: |
Mon, 8 Jan 2024 19:33:14 +0100 |
Hi Jose,
"Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 18:06:
>
> > Hi Jose,
> >
> > "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 15:08:
> >>
> >> > Hi Jose,
> >> >
> >> > "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 00:03:
> >> >>
> >> >> > Hi again,
> >> >> >
> >> >> > Andreas Klinger <ak@it-klinger.de> schrieb am Sa, 06. Jan 20:42:
> >> >> >> Hi Jose,
> >> >> >>
> >> >> >> "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Sa, 06. Jan 20:14:
> >> >> >> > > I think that the approach of patch v2 could be optimized by a
> >> >> >> > > .set alignment
> >> >> >> > > variable for those guys who are aware of such hardware
> >> >> >> > > constraints on their
> >> >> >> > > architecture and want to optimize this. But I would set the
> >> >> >> > > default to a value
> >> >> >> > > which works for all, which is the size of the address bus.
> >> >> >> >
> >> >> >> > I think it is perfectly ok to have IOD specific configuration
> >> >> >> > parameters, sure.
> >> >> >> >
> >> >> >> > Let's make it a Poke variable that the user can customize, plus a
> >> >> >> > setting in the registry so the .set dot-command works on it. I
> >> >> >> > suppose
> >> >> >> > it shall have the name mmap in it, like pk_mmap_alignment. WDYT?
> >> >> >>
> >> >> >> great. It makes sense to name it mmap as it only applies to mmap io
> >> >> >> spaces.
> >> >> > [...]
> >> >> >
> >> >> > After thinking about the problem again I came to a different solution
> >> >> > which is
> >> >> > much simpler, doesn't need an additional .set variable and avoids the
> >> >> > signal
> >> >> > handler as well as it is copying with the address bus size if
> >> >> > possible, thus
> >> >> > avoiding performance bottlenecks. The read function is as follows:
> >> >> >
> >> >> > static int
> >> >> > ios_dev_mmap_pread (void *iod, void *buf, size_t count, ios_dev_off
> >> >> > offset)
> >> >> > {
> >> >> > struct ios_dev_mmap *dev_map = iod;
> >> >> > int align = sizeof(void*);
> >> >> > char *m = buf;
> >> >> >
> >> >> > if (offset > dev_map->size || count > dev_map->size - offset)
> >> >> > return IOD_EOF;
> >> >> >
> >> >> > /* copy unaligned bytes */
> >> >> > while (count && offset % align)
> >> >> > {
> >> >> > memcpy (m, dev_map->addr + offset, 1);
> >> >> > count--;
> >> >> > offset++;
> >> >> > m++;
> >> >> > }
> >> >> >
> >> >> > /* copy with the address bus size */
> >> >> > while (count >= align)
> >> >> > {
> >> >> > memcpy (m, dev_map->addr + offset, align);
> >> >>
> >> >> This is still copying a byte at a time, isn't it?
> >> >
> >> > There is no general answer to this question. It depends on what the C
> >> > library
> >> > and the compiler are making out of it. Thus it's also dependent on the
> >> > architecture and the value of align.
> >>
> >> Actually, memcpy is guaranteed to work regardless of any alignment
> >> requirement restrictions imposed by the architecture targetted by the
> >> compiler, so the general answer is that you can always assume that
> >> memcpy is equivalent to copying bytes by bytes.
> >>
> >> The compiler will only optimize a memcpy (foo, bar, 8) into a long word
> >> move if it knows that foo and bar are properly aligned according to the
> >> rules of the target.
> >
> > You're right.
> >
> > But I'm still wondering why I get the SIGBUS (on zynqmp) when doing a
> > memcpy of
> > 2 bytes from an odd address which is mmaped, like:
> >
> > memcpy(some-variable, odd-address, 2);
>
> I have no idea. I guess that's up to the driver providing the mmapped
> contents? Is the driver documenting the rules on how to access the
> contents?
I wrote the driver and I'm doing a usual memory mapping.
I wrote a function which demonstrates the difference if memcpy is called with a
variable a (for alignment) or with a fixed value:
#include <string.h>
int mycopy(void)
{
int a = 4;
unsigned int x;
unsigned char y[4];
memcpy(&x, &y[0], 4);
memcpy(&x, &y[0], a);
return 0;
}
In the objdump we can see that when called with the fixed value of 4 bytes to be
copied there are just 2 instructions and the function memcpy isn't called at
all. So the compiler resolves the memcpy without the function call.
I think this could really be optimized in the code snipped I provided earlier to
avoid byte by byte copying.
Here is the objdump:
mycopy.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <mycopy>:
#include <string.h>
int mycopy(void)
{
0: e92d4800 push {fp, lr}
4: e28db004 add fp, sp, #4
8: e24dd010 sub sp, sp, #16
c: e59f2070 ldr r2, [pc, #112] ; 84 <mycopy+0x84>
10: e08f2002 add r2, pc, r2
14: e59f306c ldr r3, [pc, #108] ; 88 <mycopy+0x88>
18: e7923003 ldr r3, [r2, r3]
1c: e5933000 ldr r3, [r3]
20: e50b3008 str r3, [fp, #-8]
24: e3a03000 mov r3, #0
int a = 4;
28: e3a03004 mov r3, #4
2c: e50b3010 str r3, [fp, #-16]
30: e51b300c ldr r3, [fp, #-12]
unsigned int x;
unsigned char y[4];
memcpy(&x, &y[0], 4);
34: e50b3014 str r3, [fp, #-20] ; 0xffffffec
memcpy(&x, &y[0], a);
38: e51b2010 ldr r2, [fp, #-16]
3c: e24b100c sub r1, fp, #12
40: e24b3014 sub r3, fp, #20
44: e1a00003 mov r0, r3
48: ebfffffe bl 0 <memcpy>
return 0;
4c: e3a03000 mov r3, #0
[...]
> >> > But yes I need to confess that when looking into the glibc for example
> >> > this code
> >> > snippet could be optimized to allow larger amounts of data to be passed
> >> > to the
> >> > memcpy function so that it's up to the C library and compiler to
> >> > optimize it the
> >> > best.
> >> >
> >> >
> >> >>
> >> >> > count -= align;
> >> >> > offset += align;
> >> >> > m += align;
> >> >> > }
> >> >> >
> >> >> > /* copy remaining unaligned bytes */
> >> >> > while (count)
> >> >> > {
> >> >> > memcpy (m, dev_map->addr + offset, 1);
> >> >> > count--;
> >> >> > offset++;
> >> >> > m++;
> >> >> > }
> >> >> >
> >> >> > return IOD_OK;
> >> >> > }
> >> >> >
> >> >> > The write function is quite similar implemented in the same manner.
> >> >> >
> >> >> > WDYT?
> >> >> >
> >> >> > Andreas
> >> >
> >> > Best regards,
> >> >
> >> > Andreas
--
Andreas Klinger
Grabenreith 27
84508 Burgkirchen
+49 8623 919966
ak@it-klinger.de
www.it-klinger.de
www.grabenreith.de
signature.asc
Description: PGP signature
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, (continued)
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/06
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/07
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files,
Andreas Klinger <=
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/08
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Andreas Klinger, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Jose E. Marchesi, 2024/01/09
- Re: Re: [PATCH v2] Add io space MMAP for memory mapping devices and files, Mohammad-Reza Nabipoor, 2024/01/07