chicken-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Chicken-users] question on C types


From: Felix Winkelmann
Subject: Re: [Chicken-users] question on C types
Date: Tue, 25 Nov 2003 07:14:44 +0100
User-agent: Opera7.21/Win32 M2 build 3218

Am 24 Nov 2003 19:39:02 +0100 hat Joerg F. Wittenberger <address@hidden> geschrieben:

(in this case a pointer to C_words holding fixnums). Your example will
break, though, since you can not convert the numbers into a format
Scheme recognizes.

Hm, why not?  I actually need an opaque array, which is going to be
extensively used (I've seen 45min till I ^C'ed it) from C until there
is _one_ integer result.  Hence I don't want to bitshift a tag to each
int and remove it in such a long loop.


Oh, one can convert it. I just meant that your example didn't do it,
and the interface for doing so might be ugly. For that reason I suggest
number vectors (a la SRFI-4).

The "Levenshtein"-Egg of Lars Rustemeier is obviously derived from
Lorenzo Seidenari's C-Implementation (though it fails to mention).

It does so, now. Lorenzo appears not be found, and the webmaster
of www.merriampark.com assumes there are no restrictions on using
the code (Thanks to Lars for checking this out).

I
don't have a reference, but I remember that Lorenzo called that code
"not for production" and that's for a reason.  No, not for one, for
several:

a) The memory consumtion is O(n*m), while at most O(min(n,m)) is
actually required. (with n,m the length of the input strings)
See http://www.merriampark.com/ldc.htm look for

  d=malloc((sizeof(int))*(m+1)*(n+1));

b) The continues like only proof of concept code is allowed to: it
does *not* check "d" the result of the allocation!  Memory corruption
ahead.  Especally in presence of the excessive allocation.  Don't use
that code for real, please!

c) For long strings with a small Levenshtein distance (the most useful
case), much can be safed, by first skiping common pre- and suffixes.

d) It blocks the chicken scheduler.

Your points are all valid.


I hope a fixed most of the bugs.  But it still blocks the scheduler
for me, though I don't know why.  Additionally the code below provides
a short circuited (levenshtein< s t limit), which doesn't compute the
whole distance, in case you just want to check whether it exceeds a
certain limit.


Many thanks for the code! I'll check it out and try to find the
scheduler problem.

BTW, it would be *very* nice if someone has a replacement for an egg,
or wants to make some code available as one, to send me some documentation,
perhaps in HTML, perhaps in the format used for the eggs-site (templates
are available).


cheers,
felix




reply via email to

[Prev in Thread] Current Thread [Next in Thread]