Re: [Chicken-users] using mmap files as strings?
From: Alan Post
Subject: Re: [Chicken-users] using mmap files as strings?
Date: Wed, 27 Oct 2010 09:23:18 -0600

On Wed, Oct 27, 2010 at 02:02:15PM +0200, Jörg F. Wittenberger wrote:
> Am Donnerstag, den 21.10.2010, 15:01 -0600 schrieb Alan Post:
> > So far so good, that is what I would expect. I'd like to work with
> > an mmap buffer like a string. Is it possible to create an object
> > that will treat the mmap area as a string that I can run regular
> > string operations on without copying the mmap buffer? I'm
> > specifically interested in running regular expressions across the
> > mmap space.
>
> Among other reasons this is one why I've been contemplating how one
> could intercept chicken's string handling.
>
> Another application would be shared substrings. Or the combination of
> both. Example: feed a file content to a port, formatted as HTTP chunked
> encoding. A shared substring pointing right into the mmaped file could
> save all copying. The expense would be one object allocation holding
> #{pointer, start, end}.
>
> However this would somehow have to overwrite the basic string handling.
> I have not yet tried that, but at least the utf8 egg hints that it must
> be possible to do so.
>
In a Lisp system I worked on a few years ago, I had both a string
type and a "static" type. Both acted like strings, but the static
type held a pointer and length into memory outside the memory pool,
rather than allocating the string data with the object. I believe
the static object was also read-only, for my own simplicity.
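
As a rough illustration of the two types described above, here is a
minimal C sketch. All names (str_obj, str_new, str_static, str_data,
str_equal) are hypothetical, malloc stands in for the memory pool, and
this is not Chicken's actual string representation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
typedef enum { STR_POOLED, STR_STATIC } str_kind;

typedef struct str_obj {
    str_kind    kind;
    size_t      len;
    const char *ext;   /* STR_STATIC only: borrowed, read-only bytes */
    char        inl[]; /* STR_POOLED only: bytes stored with the object */
} str_obj;

/* Ordinary string: the data is copied and lives with the object. */
static str_obj *str_new(const char *src, size_t len)
{
    str_obj *s = malloc(sizeof *s + len);
    if (s == NULL)
        return NULL;
    s->kind = STR_POOLED;
    s->len = len;
    s->ext = NULL;
    memcpy(s->inl, src, len);
    return s;
}

/* "Static" string: only pointer+len are stored, nothing is copied.
 * The caller keeps buf alive (for example, an mmap'ed region). */
static str_obj *str_static(const char *buf, size_t len)
{
    str_obj *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->kind = STR_STATIC;
    s->len = len;
    s->ext = buf;
    return s;
}

/* Read-only view of the bytes, whichever kind the object is. */
static const char *str_data(const str_obj *s)
{
    return s->kind == STR_STATIC ? s->ext : s->inl;
}

/* String operations go through str_data, so both kinds behave alike. */
static int str_equal(const str_obj *a, const str_obj *b)
{
    return a->len == b->len && memcmp(str_data(a), str_data(b), a->len) == 0;
}
```

Because every operation reads through str_data, the static kind can
point straight into a mapped file while the rest of the code treats it
like any other string. Returning a const pointer is what keeps the
static kind read-only in this sketch.
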

I had also wanted to implement substrings, but the problem I ran
into was that I might hold a pointer *only* to a substring when the
GC was called, with no way to reach the full string and make sure
it was properly copied. I wasn't using your suggestion above of
having a pointer, start, and end, though I'm not sure why. Probably
because I needed things to be just so. :-p
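
To make the difference concrete, here is a small C sketch of the
pointer, start, end representation surviving a move of the underlying
string. It is only one reading of the problem: the slice type and the
relocate and slice_forward helpers are hypothetical, and relocate is a
toy stand-in for what a copying collector does.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared substring: base of the full string plus offsets. */
typedef struct {
    const char *base;  /* start of the whole string, which the GC can find */
    size_t      start; /* first byte of the substring */
    size_t      end;   /* one past the last byte of the substring */
} slice;

/* Toy stand-in for a copying collector moving one string. */
static char *relocate(const char *old, size_t len)
{
    char *fresh = malloc(len);
    if (fresh != NULL)
        memcpy(fresh, old, len);
    return fresh;
}

/* After the move only the base changes; the offsets stay valid.
 * A bare interior pointer would carry no base to copy from or fix up. */
static void slice_forward(slice *s, const char *new_base)
{
    s->base = new_base;
}

static const char *slice_ptr(const slice *s)
{
    return s->base + s->start;
}

static size_t slice_len(const slice *s)
{
    return s->end - s->start;
}
```

With only a raw pointer into the middle of the string, the collector
has no way to find where the object begins, which matches the problem
described above. Keeping the base separate from the offsets gives it
something to copy and a single field to update.
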

I had been thinking about this feature for the core system, mostly
because I'd like to try my hand at working on the C code in Chicken.
That isn't a promise of anything; I'm still working on my first
egg here!

In that egg, I do need to store substrings, and I am doing so with a
pointer to the string and an index. I rarely need to know the full
length of the string, so I compute it from the string object when I
need it. This allowed me to avoid a whole bunch of string copying
with little or no overhead. It is conceptually cleaner to work with
substrings, but the effect and performance should be about the same.
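
The string-plus-index approach can be sketched as follows. This is in
C rather than the egg's Scheme, the names (substr, substr_ptr,
substr_len, substr_advance) are hypothetical, and it assumes a
NUL-terminated backing string so the length can be recomputed on
demand.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical substring: a reference to the whole string plus an index.
 * No length is stored; it is derived from the string when needed. */
typedef struct {
    const char *str; /* the full, NUL-terminated string (not copied) */
    size_t      idx; /* where this substring begins */
} substr;

/* First byte of the substring; no copy is made. */
static const char *substr_ptr(substr s)
{
    return s.str + s.idx;
}

/* Remaining length, computed from the string object on demand. */
static size_t substr_len(substr s)
{
    return strlen(s.str) - s.idx;
}

/* "Taking a substring" is just bumping the index. The step is clamped
 * so the index never runs past the end of the string. */
static substr substr_advance(substr s, size_t n)
{
    size_t left = substr_len(s);
    s.idx += (n < left) ? n : left;
    return s;
}
```

Advancing through the input costs one addition per step, and the
length is only paid for on the rare occasions it is needed.
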
-Alan
--
.i ko djuno fi le do sevzi