Re: TODO
From: Aldrin Martoq
Subject: Re: TODO
Date: Fri, 4 Jul 2003 03:28:45 -0400
User-agent: Mutt/1.3.28i
On Thu, Jul 03, 2003 at 04:57:29PM -0400, Mañungo wrote:
> I can't access CSV in the Savannah page, so here we go...
First, check:
http://savannah.nongnu.org/cvs/?group=bsf
http://savannah.nongnu.org/cgi-bin/viewcvs/bsf/
Next, download:
cvs -d:pserver:address@hidden:/cvsroot/bsf login
cvs -d:pserver:address@hidden:/cvsroot/bsf co testsuite
> INCOMPLETE TODO/WISHLIST/IDEAS/FOR DICUSSION:
[..]
I'm not sure how much "improved" my version is. As stated somewhere
on this list, it's more of a _testsuite_ than a fully useful, lovely,
nice program.
Instead of this TODO, I would like to throw it all away and start with a
new framework, which will allow us to change specific parts of the
program AND to use the program as a testsuite, as a daemon,
or as a MUA plugin. Say:
- Fetch: get a message. Different ways:
    - server (sendmail, exim, whatever plugin).
    - Unix MUA (pine, mutt, you name it).
    - GNOME/KDE/other MUAs (evolution, ...).
    - Outlook Express or whatever Windows people are using.
  Output is a message, in mbox format.
- Processing: reads an mbox-formatted message, un-{mime,pgpencrypt,code,html}.
  Extracts tokens, keywords or any other useful data.
  Output is a set of "streams":
    - metadata (From header, signature..., ?)
    - message body
    - other metadata like message length, mime type, language, ...
- Core: reads the streams. Calculates the probability. Uses the metadata.
  Output is:
    - message body guilty percent (80%)
    - metadata accepted rules (email whitelist, guilty percent,
      accepted languages, ...)
- Output: do something with the results of the core, return a code or
  something to the caller. Some examples:
    - Drop the message
    - Add X-Spam-* headers
    - Move to spam folder
    - exit (0) or (1) or <0 in case of error :-)
- Database manager: handles all tokens, updates/removes tokens. Keeps
  whitelists. Keeps statistics of filter performance. May suggest some
  tuning during and after training. Examples:
    - "It seems your guilty limit is too low (high), so we are
      getting false positives (negatives). Would you like to set it to 50%?"
    - "Would you like to add these emails to the whitelist?"
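
To make the stages above a bit more concrete, here is a rough Python sketch
of Processing, Core and Output chained together. Everything in it is an
assumption for illustration and not existing bsf code: the names (Streams,
process, core, output), the X-Spam-Flag/X-Spam-Guilty header names, and the
choice of a naive Bayes log-odds combination for "calculate the
probability". Fetch and the Database manager are left out; token
probabilities and the whitelist are simply passed in, and un-MIME/PGP/HTML
decoding is skipped.

```python
import math
import re
from dataclasses import dataclass, field
from email import message_from_string
from email.utils import parseaddr


@dataclass
class Streams:
    """Output of the Processing stage (hypothetical structure)."""
    sender: str
    body_tokens: list
    metadata: dict = field(default_factory=dict)


def process(raw_message):
    """Processing: parse one message, extract tokens and metadata."""
    msg = message_from_string(raw_message)
    payload = msg.get_payload()
    # Multipart, PGP and HTML decoding are not handled in this sketch.
    body = payload if isinstance(payload, str) else ""
    tokens = re.findall(r"[a-z0-9$'-]+", body.lower())
    return Streams(
        sender=parseaddr(msg.get("From", ""))[1].lower(),
        body_tokens=tokens,
        metadata={"length": len(body), "mime_type": msg.get_content_type()},
    )


def core(streams, token_probs, whitelist, limit=0.8):
    """Core: apply metadata rules, then combine per-token probabilities."""
    if streams.sender in whitelist:
        return {"guilty": 0.0, "spam": False, "rule": "whitelist"}
    log_odds = 0.0
    for tok in set(streams.body_tokens):
        p = token_probs.get(tok)
        if p is None:
            continue  # unknown token: no evidence either way
        p = min(max(p, 0.01), 0.99)  # keep the logarithms finite
        log_odds += math.log(p) - math.log(1.0 - p)
    # Numerically safe logistic function: log-odds -> probability.
    if log_odds >= 0:
        guilty = 1.0 / (1.0 + math.exp(-log_odds))
    else:
        e = math.exp(log_odds)
        guilty = e / (1.0 + e)
    return {"guilty": guilty, "spam": guilty >= limit, "rule": "bayes"}


def output(raw_message, verdict):
    """Output: add X-Spam-* headers and return an exit-style code."""
    msg = message_from_string(raw_message)
    msg["X-Spam-Flag"] = "YES" if verdict["spam"] else "NO"
    msg["X-Spam-Guilty"] = "%.0f%%" % (verdict["guilty"] * 100)
    return msg.as_string(), (1 if verdict["spam"] else 0)
```

A caller would run something like
output(raw, core(process(raw), token_probs, whitelist)); since each stage
only sees the previous stage's output, any one of them could be swapped
for a daemon, a MUA plugin or a testsuite driver.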
If you _agree_ with this:
- Volunteers?
- I could start this in about 1 week.
OTOH, I'm quite disappointed with Bayesian filters. While I worked here,
I envisioned a full system which would simulate a secretary on your
desktop... Remember that? :-). Spam-filtering would be a part of that
system.
Some outlines are in the _January_ email:
http://mail.gnu.org/archive/html/bsf-devel/2003-01/msg00000.html
Some screenshots of remembrance-agent:
http://www.dcc.uchile.cl/~amartoq/info-agent/
But I think we should finish this, and then start a new project.
--
Aldrin.