
Re: [Chicken-users] SRFI 4 vectors size limitation.


From: Peter Bex
Subject: Re: [Chicken-users] SRFI 4 vectors size limitation.
Date: Mon, 5 May 2014 20:06:57 +0200
User-agent: Mutt/1.4.2.3i

On Mon, May 05, 2014 at 07:33:28PM +0200, Jubjub wrote:
> I've noticed that srfi-4 homogeneous numeric vectors are limited to 0xFFFFF
> bytes, regardless of their type.
> 
> This amounts to 16MB of data, which I've found insufficient since they're
> the most widely used and convenient way to manipulate raw bytes, and the
> most common way to move data through the FFI.
> 
> Examples of places where I've run into this limitation include:
> - Storing raw audio data: 16MB fits around a minute and a half of s16
> stereo audio at 44.1 kHz.
> - Storing raw texture data: 16MB fits at most a square 2048px RGBA image.
> 
> Is this limitation by design, an issue with my usage of the srfi-4 unit or
> an oversight?

It's by design.  SRFI-4 vectors are simply bytevectors wrapped in a
record type.  Bytevectors are a non-immediate type, which consists of a
header followed by data.  The header is a full word in size, of which
the top byte is used for tagging the types: in chicken.h that's
C_HEADER_BITS_MASK.  The size is encoded in the remaining bytes:
C_HEADER_SIZE_MASK, which is 0xffffff on 32-bit platforms (that's one
more f than you mentioned in your mail: could you double-check?).
The allocation and type tagging happen in C_allocate_vector in
runtime.c (where bvecf is true).
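
To make the header layout concrete, here is a rough sketch of how a
32-bit header word splits into a tag byte and a size.  This is not the
code from chicken.h: the SKETCH_* macros and sketch_* functions are
hypothetical names that only mirror the roles of C_HEADER_BITS_MASK and
C_HEADER_SIZE_MASK, and treating the whole top byte as one opaque tag
is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; they mirror the roles of
   C_HEADER_BITS_MASK and C_HEADER_SIZE_MASK on a 32-bit platform. */
#define SKETCH_HEADER_BITS_MASK UINT32_C(0xff000000) /* top byte: tag   */
#define SKETCH_HEADER_SIZE_MASK UINT32_C(0x00ffffff) /* low bytes: size */

/* Pack a tag byte and a size into a single header word.  Sizes that do
   not fit in the low three bytes cannot be represented at all. */
static uint32_t sketch_make_header(uint8_t tag, uint32_t size)
{
    assert(size <= SKETCH_HEADER_SIZE_MASK);
    return ((uint32_t)tag << 24) | size;
}

/* Recover the tag byte from a header word. */
static uint8_t sketch_header_tag(uint32_t header)
{
    return (uint8_t)((header & SKETCH_HEADER_BITS_MASK) >> 24);
}

/* Recover the size from a header word. */
static uint32_t sketch_header_size(uint32_t header)
{
    return header & SKETCH_HEADER_SIZE_MASK;
}
```

With this layout, sketch_make_header(tag, 0xffffff) is the largest
header that can be built; one more byte would spill into the tag bits.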

On 64-bit platforms the limit is 0xffffffffffffff, which is 65536 TB
if I didn't mess up my calculation.  Unless I'm mistaken, a 32-bit system
can't address anything beyond 4 GB anyway, and by dropping a byte for
tagging, this is divided by 256, leaving as little as 16 MB, which is
indeed a little short.  I don't see a way around this without overhauling
our type tagging system.  At first I thought it might be possible by
complicating the type tags a bit, but then I realised that strings, blobs
and symbol names have the same limitation, and that a proper fix should
attempt to fix it for all these types.  The simplest fix would be to make
every non-immediate object one word longer, and store the length there.
Unfortunately this cost would then be paid for every single type,
including pairs.  This would increase GC pressure and generally increase
the memory usage of all CHICKEN programs.
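
The size limits quoted at the start of this paragraph can be
double-checked with a few lines of C.  This is only a sketch of the
arithmetic, assuming exactly one byte of the header word is reserved
for tagging; the sketch_* function names are made up for this example
and do not exist in CHICKEN.

```c
#include <assert.h>
#include <stdint.h>

/* Largest size that fits in a header word of the given width when one
   byte of it is reserved for tagging (an assumption of this sketch). */
static uint64_t sketch_max_size(unsigned word_bits)
{
    assert(word_bits == 32 || word_bits == 64);
    return (UINT64_C(1) << (word_bits - 8)) - 1;
}

/* The same limit, rounded up by one byte and expressed in MiB. */
static uint64_t sketch_limit_in_mib(unsigned word_bits)
{
    return (sketch_max_size(word_bits) + 1) >> 20;
}

/* The same limit, rounded up by one byte and expressed in TiB. */
static uint64_t sketch_limit_in_tib(unsigned word_bits)
{
    return (sketch_max_size(word_bits) + 1) >> 40;
}
```

Here sketch_max_size(32) is 0xffffff (16 MiB minus one byte) and
sketch_max_size(64) is 0xffffffffffffff, which sketch_limit_in_tib(64)
reports as 65536.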

Are you able to switch to a 64-bit machine?  That should help :)

> If it's the latter, would a patch removing this limitation be welcome?

Generally speaking, patches are always welcome.

Cheers,
Peter
-- 
http://www.more-magic.net


