
Re: [Chicken-users] using mmap files as strings?


From: Alan Post
Subject: Re: [Chicken-users] using mmap files as strings?
Date: Wed, 27 Oct 2010 09:23:18 -0600

On Wed, Oct 27, 2010 at 02:02:15PM +0200, Jörg F. Wittenberger wrote:
> On Thursday, 21 Oct 2010 at 15:01 -0600, Alan Post wrote:
> > So far so good, that is what I would expect.  I'd like to work with
> > an mmap buffer like a string.  Is it possible to create an object
> > that will treat the mmap area as a string that I can run regular
> > string operations on without copying the mmap buffer?  I'm
> > specifically interested in running regular expressions across the
> > mmap space.
> 
> This is one of the reasons I've been contemplating how one could
> intercept chicken's string handling.
> 
> Another application would be shared substrings.  Or the combination of
> both.  Example: feed a file's contents to a port, formatted as HTTP
> chunked encoding.  A shared substring pointing right into the mmapped
> file could save all copying.  The expense would be one object
> allocation holding #{pointer, start, end}.
> 
> However, this would somehow have to override the basic string handling.
> I have not yet tried that, but at least the utf8 egg hints that it must
> be possible to do so.
> 

In a Lisp system I worked on a few years ago, I had both a string
type and a "static" type.  Both acted like strings, but the static
type held a pointer and length to memory outside the memory pool,
rather than allocating the string data with the object.  I believe
the static object was also read-only, for my own simplicity.
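
A minimal sketch of how such a two-kind string object might look in C.
All names here (lstr, lstr_new, lstr_static, and so on) are hypothetical,
malloc stands in for the memory pool, and none of this reflects Chicken's
actual object layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; malloc stands in for the interpreter's memory pool. */
typedef enum { STR_POOLED, STR_STATIC } str_kind;

typedef struct lstr {
    str_kind    kind;
    size_t      len;
    const char *ext;   /* STR_STATIC: external memory, e.g. an mmap region */
    char        own[]; /* STR_POOLED: bytes allocated with the object */
} lstr;

/* Ordinary string: the data is copied in and lives with the object. */
lstr *lstr_new(const char *src, size_t len) {
    assert(src != NULL);
    lstr *s = malloc(sizeof *s + len);
    if (!s) return NULL;
    s->kind = STR_POOLED;
    s->len  = len;
    s->ext  = NULL;
    memcpy(s->own, src, len);
    return s;
}

/* "Static" string: only pointer and length are stored, nothing is copied. */
lstr *lstr_static(const char *ext, size_t len) {
    assert(ext != NULL);
    lstr *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->kind = STR_STATIC;
    s->len  = len;
    s->ext  = ext;
    return s;
}

/* Readers go through one accessor, so both kinds act like strings. */
const char *lstr_bytes(const lstr *s) {
    return s->kind == STR_STATIC ? s->ext : s->own;
}

/* Writes succeed only on pooled strings; static ones are read-only.
   Returns 0 on success, -1 if the write is refused. */
int lstr_set(lstr *s, size_t i, char c) {
    if (s->kind == STR_STATIC || i >= s->len) return -1;
    s->own[i] = c;
    return 0;
}
```

Every string operation would have to read through an accessor like
lstr_bytes rather than touch the data directly, which is where the cost
of "overriding the basic string handling" shows up.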

I had also wanted to implement substrings, but the problem I ran
into was that I might have a pointer *only* to a substring when the
GC was called, and I had no way to access the full string and make
sure it was properly copied.  I wasn't using your suggestion above of
having a pointer, start, and end, though I'm not sure why.  Probably
because I needed things to be just so.  :-p
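
To illustrate why the pointer, start, and end layout sidesteps that GC
problem, here is a rough C sketch.  The strview type and the relocate
function are hypothetical; relocate only imitates what a copying
collector would do when it moves the full string:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical view type: base is what a collector would trace. */
typedef struct {
    char  *base;   /* start of the full string, never an interior pointer */
    size_t start;  /* offset of the first byte of the substring */
    size_t end;    /* offset one past the last byte of the substring */
} strview;

/* Stand-in for a copying collector: move the whole string, then fix up
   the view.  Only base changes; the offsets remain valid as they are. */
int relocate(strview *v, size_t full_len) {
    char *moved = malloc(full_len);
    if (!moved) return -1;
    memcpy(moved, v->base, full_len);
    free(v->base);
    v->base = moved;
    return 0;
}

/* The substring's address is recomputed from base on every use. */
const char *view_ptr(const strview *v) {
    assert(v->start <= v->end);
    return v->base + v->start;
}

size_t view_len(const strview *v) {
    return v->end - v->start;
}
```

Because the view always holds the address of the full string, the
collector can find and copy the whole thing; a bare interior pointer
gives it nothing to work back from.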

I had been thinking about this feature for the core system, mostly
because I'd like to try my hand at working on the C code in Chicken.
That isn't a promise of anything; I'm still working on my first egg
here!

In that egg, I do need to store substrings, and am doing so with a
pointer to the string and an index.  I rarely need to know the full
length of the string, so I compute it from the string object when I
need it.  This allowed me to avoid a whole bunch of string copying
with little or no overhead.  It would be conceptually cleaner to work
with substrings, but the effect and performance should be about the
same.
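
The string-plus-index approach could be sketched in C roughly as
follows.  The egg itself is Scheme and its code is not shown in this
message, so the cursor type and its helpers are hypothetical, and the
sketch assumes NUL-terminated strings:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor: the whole string plus a position in it. */
typedef struct {
    const char *str;    /* full NUL-terminated string, never copied */
    size_t      index;  /* where the "substring" begins */
} cursor;

/* Moving forward only changes the index; no bytes are copied. */
cursor cursor_advance(cursor c, size_t n) {
    assert(c.index + n <= strlen(c.str));
    c.index += n;
    return c;
}

/* The length is not stored; it is computed on the rare occasions
   it is needed. */
size_t cursor_remaining(cursor c) {
    return strlen(c.str) - c.index;
}

/* Match a prefix at the current position without extracting a substring. */
int cursor_starts_with(cursor c, const char *prefix) {
    return strncmp(c.str + c.index, prefix, strlen(prefix)) == 0;
}
```

A cursor is two words and is passed by value, so advancing through the
input allocates nothing.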

-Alan
-- 
.i ko djuno fi le do sevzi


