poke-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] Add io space MMAP for memory mapping devices and files


From: Jose E. Marchesi
Subject: Re: [PATCH v2] Add io space MMAP for memory mapping devices and files
Date: Mon, 08 Jan 2024 18:06:25 +0100
User-agent: Gnus/5.13 (Gnus v5.13)

> Hi Jose,
>
> "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 15:08:
>> 
>> > Hi Jose,
>> >
>> > "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Mo, 08. Jan 00:03:
>> >> 
>> >> > Hi again,
>> >> >
>> >> > Andreas Klinger <ak@it-klinger.de> schrieb am Sa, 06. Jan 20:42:
>> >> >> Hi Jose,
>> >> >> 
>> >> >> "Jose E. Marchesi" <jemarch@gnu.org> schrieb am Sa, 06. Jan 20:14:
>> >> >> > > I think that the approach of patch v2 could be optimized by a .set 
>> >> >> > > alignment
>> >> >> > > variable for those guys who are aware of such hardware constraints 
>> >> >> > > on their
>> >> >> > > architecture and want to optimize this. But I would set the 
>> >> >> > > default to a value
>> >> >> > > which works for all, which is the size of the address bus.
>> >> >> > 
>> >> >> > I think it is perfectly ok to have IOD specific configuration
>> >> >> > parameters, sure.
>> >> >> > 
>> >> >> > Let's make it a Poke variable that the user can customize, plus a
>> >> >> > setting in the registry so the .set dot-command works on it.  I 
>> >> >> > suppose
>> >> >> > it shall have the name mmap in it, like pk_mmap_alignment.  WDYT?
>> >> >> 
>> >> >> great. It makes sense to name it mmap as it only applies to mmap io 
>> >> >> spaces.
>> >> > [...]
>> >> >
>> >> > After thinking about the problem again I came to a different solution 
>> >> > which is
>> >> > much simpler, doesn't need an additional .set variable and avoids the 
>> >> > signal
>> >> > handler as well as it is copying with the address bus size if possible, 
>> >> > thus
>> >> > avoiding performance bottlenecks. The read function is as follows:
>> >> >
>> >> > static int
>> >> > ios_dev_mmap_pread (void *iod, void *buf, size_t count, ios_dev_off 
>> >> > offset)
>> >> > {
>> >> >   struct ios_dev_mmap *dev_map = iod;
>> >> >   int align = sizeof(void*);
>> >> >   char *m = buf;
>> >> >
>> >> >   if (offset > dev_map->size || count > dev_map->size - offset)
>> >> >     return IOD_EOF;
>> >> >
>> >> >   /* copy unaligned bytes */
>> >> >   while (count && offset % align)
>> >> >     {
>> >> >       memcpy (m, dev_map->addr + offset, 1);
>> >> >       count--;
>> >> >       offset++;
>> >> >       m++;
>> >> >     }
>> >> >
>> >> >   /* copy with the address bus size */
>> >> >   while (count >= align)
>> >> >     {
>> >> >       memcpy (m, dev_map->addr + offset, align);
>> >> 
>> >> This is still copying a byte at a time, isn't it?
>> >
>> > There is no general answer to this question. It depends on what the C 
>> > library
>> > and the compiler are making out of it. Thus it's also dependent on the
>> > architecture and the value of align.
>> 
>> Actually, memcpy is guaranteed to work regardless of any alignment
>> requirement restrictions imposed by the architecture targetted by the
>> compiler, so the general answer is that you can always assume that
>> memcpy is equivalent to copying bytes by bytes.
>> 
>> The compiler will only optimize a memcpy (foo, bar, 8) into a long word
>> move if it knows that foo and bar are properly aligned according to the
>> rules of the target.
>
> You're right.
>
> But I'm still wondering why I get the SIGBUS (on zynqmp) when doing a memcpy 
> of
> 2 bytes from an odd address which is mmaped, like:
>
> memcpy(some-variable, odd-address, 2);

I have no idea.  I guess that's up to the driver providing the mmapped
contents?  Is the driver documenting the rules on how to access the
contents?

>
>
>> > But yes I need to confess that when looking into the glibc for example 
>> > this code
>> > snippet could be optimized to allow larger amounts of data to be passed to 
>> > the
>> > memcpy function so that it's up to the C library and compiler to optimize 
>> > it the
>> > best.
>> >
>> >
>> >> 
>> >> >       count -= align;
>> >> >       offset += align;
>> >> >       m += align;
>> >> >     }
>> >> >
>> >> >   /* copy remaining unaligned bytes */
>> >> >   while (count)
>> >> >     {
>> >> >       memcpy (m, dev_map->addr + offset, 1);
>> >> >       count--;
>> >> >       offset++;
>> >> >       m++;
>> >> >     }
>> >> >
>> >> >   return IOD_OK;
>> >> > }
>> >> >
>> >> > The write function is quite similar implemented in the same manner.
>> >> >
>> >> > WDYT?
>> >> >
>> >> > Andreas
>> >
>> > Best regards,
>> >
>> > Andreas



reply via email to

[Prev in Thread] Current Thread [Next in Thread]