bsf-devel

token expiration and database


From: Alvaro Herrera
Subject: token expiration and database
Date: Fri, 25 Jul 2003 11:57:29 -0400
User-agent: Mutt/1.4i

Guys,

One idea I forgot to mention in the last meeting was token expiration.

Basically the hypothesis is that token usage on spam varies with time,
so tokens that are no longer used should be eradicated from the
database.  The idea would be to store the timestamp when the token was
last added to the database.  Periodically (weekly or daily) a process
would scan the entire database and delete those items which are older
than a certain threshold.

This would additionally allow for cleanup of random strings.  We can't
afford to save all random strings in the universe, no matter how we
constrain the length.
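
To make the expiration idea concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions: the TokenStore class,
its method names, and the in-memory dict are all hypothetical stand-ins
for whatever on-disk store ends up being used.  Each token keeps a count
and the timestamp of its last addition, and a periodic job calls expire()
with the chosen threshold.

```python
import time

DAY = 24 * 60 * 60  # seconds in a day


class TokenStore:
    """Hypothetical in-memory stand-in for the token database."""

    def __init__(self):
        # token -> [count, last_added_timestamp]
        self._tokens = {}

    def add(self, token, now=None):
        """Count one occurrence and refresh the token's timestamp."""
        now = time.time() if now is None else now
        entry = self._tokens.setdefault(token, [0, now])
        entry[0] += 1
        entry[1] = now

    def expire(self, max_age, now=None):
        """Delete tokens last added more than max_age seconds ago.

        This scans the whole store, as the periodic cleanup process
        would.  Returns the number of tokens removed.
        """
        now = time.time() if now is None else now
        cutoff = now - max_age
        stale = [t for t, (_, last) in self._tokens.items() if last < cutoff]
        for t in stale:
            del self._tokens[t]
        return len(stale)

    def __contains__(self, token):
        return token in self._tokens
```

A weekly or daily cron job would then run something like
store.expire(30 * DAY); the 30-day figure is an arbitrary example, not a
recommendation.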


On an unrelated note: if GDBM is so slow maybe we should really look at
implementing our own btree.  I could extract the btree machinery from
Postgres (it's BSD licensed, so we can use it however we like) if we
decide to go down this route.  This btree has the advantage that you
only need to read the needed pages.  I dunno if this is the case with
GDBM.

However, this assumes there's some locality in accesses.  If there isn't
then there's no gain in loading only certain pages.  Now that I think of
it, given that the words are mostly "a random sample", there probably
isn't any gain after all.

Another idea would be to store the database in a... uh... database.  I
mean a real database.  I _know_ Postgres is blindingly fast and we could
use it to store frequencies, timestamps, and whatever data we need,
without worrying about implementing btrees, caching algorithms or other
stuff that we don't really care about.

We could also use BerkeleyDB (www.sleepycat.com, has an open source
license) or SQLite (www.sqlite.org, it is in the public domain), but I
don't think there's any gain to be had.
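
As a rough sketch of what "store it in a real database" could look like,
the following uses SQLite through Python's standard sqlite3 module, purely
because it needs no server to try out; the same table and queries should
carry over to Postgres with little change.  The table layout, the column
names (spam_count, ham_count, last_seen) and the helper functions are
assumptions made for illustration, not an agreed design.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the token database; schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tokens (
               token      TEXT PRIMARY KEY,
               spam_count INTEGER NOT NULL DEFAULT 0,
               ham_count  INTEGER NOT NULL DEFAULT 0,
               last_seen  INTEGER NOT NULL
           )"""
    )
    return conn


def record(conn, token, is_spam, now):
    """Bump the token's frequency and refresh its timestamp."""
    spam, ham = (1, 0) if is_spam else (0, 1)
    cur = conn.execute(
        "UPDATE tokens SET spam_count = spam_count + ?, "
        "ham_count = ham_count + ?, last_seen = ? WHERE token = ?",
        (spam, ham, now, token),
    )
    if cur.rowcount == 0:
        # Token not seen before: insert a fresh row.
        conn.execute(
            "INSERT INTO tokens (token, spam_count, ham_count, last_seen) "
            "VALUES (?, ?, ?, ?)",
            (token, spam, ham, now),
        )
    conn.commit()


def expire(conn, cutoff):
    """Delete tokens whose last_seen is older than cutoff."""
    cur = conn.execute("DELETE FROM tokens WHERE last_seen < ?", (cutoff,))
    conn.commit()
    return cur.rowcount
```

With this layout the expiration pass is a single DELETE, and the database
takes care of the indexing and caching.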

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La naturaleza, tan fragil, tan expuesta a la muerte... y tan viva"



