Re: A couple of lisp questions
From: Stefan Monnier
Subject: Re: A couple of lisp questions
Date: Wed, 12 Nov 2003 18:28:27 GMT
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50
Stefan> Take a look at how flyspell does it. Or maybe auto-fill.
> I will. I think auto-fill cheats though, as it's tied directly into
> the command loop. I seem to remember reading that somewhere.
Not the command loop, just `self-insert-command' (which is
implemented in C). You can hijack the auto-fill-function for
your own non-auto-fill use.
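
A minimal sketch of such a hijack, assuming a reasonably recent Emacs
(it uses `setq-local') and assuming the function is called right after a
space or newline has been inserted. All the `my-' names are hypothetical
and not part of Emacs:

```elisp
;; Hypothetical names: nothing starting with `my-' is part of Emacs.
(defvar my-typed-words nil
  "List of words seen so far, most recent first.")

(defun my-word-before-point ()
  "Return the word just before point, or nil if there is none."
  (save-excursion
    (skip-syntax-backward "^w")         ; skip back over non-word characters
    (let ((end (point)))
      (skip-syntax-backward "w")        ; then over the word itself
      (when (< (point) end)
        (buffer-substring-no-properties (point) end)))))

(defun my-note-word ()
  "Record the word before point.  Meant for use as `auto-fill-function'."
  (let ((word (my-word-before-point)))
    (when word
      (push word my-typed-words)))
  nil)                                  ; return nil: no filling was done

(defun my-enable-word-tracking ()
  "Install `my-note-word' as the buffer-local `auto-fill-function'."
  (setq-local auto-fill-function #'my-note-word))
```

Note that this replaces real auto-filling in the buffer, so it only fits
buffers where auto-fill-mode is not wanted.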
> usage-hash: "the" --> ("the" . 4)
> "and" --> ("and" . 6)
Why not just
"the" --> 4
"and" --> 6
> Then a suffix hash
> suffix-hash: "t" --> (("the" . 4) ("then" . 3) ("talk" . 2) etc)
> "th" --> (("the" . 4) etc )
> "the" --> (("the" . 4) etc )
Is `try-completion' too slow to build the suffixes on the fly
(because the usage-hash is too large)?
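
A rough sketch of the on-the-fly approach, assuming an Emacs whose
completion functions accept hash tables and a usage table that maps each
word directly to its count. It uses `all-completions' rather than
`try-completion', since the former returns the list of candidates while
the latter only returns their common prefix. The function name is
hypothetical:

```elisp
(defun my-completions-by-usage (prefix table)
  "Return the words in TABLE that start with PREFIX, most used first.
TABLE is a hash table mapping word strings to usage counts."
  ;; `all-completions' returns a fresh list, so sorting it in place is safe.
  (sort (all-completions prefix table)
        (lambda (a b)
          (> (gethash a table 0) (gethash b table 0)))))
```

With this there is no second hash to keep in sync; whether it is fast
enough for a large dictionary would have to be measured.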
> In this case the cons cells for each word are shared between the
> hashes, so this is not a massive memory waste as the written version
> appears.
Each word of N letters has:
- one string (i.e. N + 16 bytes)
- one cons-cell (8 bytes)
- one hash-table entry (16 bytes)
in usage-hash, plus:
- N cons-cells (N*8 bytes)
- N hash entries shared with other words (at least 16 bytes).
For a total of 9*N + 56 bytes per word. Probably not a big deal.
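
The same arithmetic written out as a small function, using only the
per-object sizes listed above. Those sizes are the ones assumed in this
message; they probably differ on other Emacs builds. The function name
is hypothetical:

```elisp
(defun my-bytes-per-word (n)
  "Estimate the bytes used by a word of N letters, per the figures above."
  (+ (+ n 16)      ; the string itself
     8             ; one cons cell in usage-hash
     16            ; one hash-table entry in usage-hash
     (* n 8)       ; N cons cells in the suffix lists
     16))          ; shared suffix hash entries (lower bound)
```

For a three-letter word such as "the" this gives 9*3 + 56 = 83 bytes.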
> Ideally I would want to build up these word usage statistics as they
> are typed, but as you say it's hard to do this. I think a flyspell-like
> approach combined with text properties should work okay.
How do you avoid counting the same instance of a word several times? Oh,
you mark them with a text-property, I see. More like font-lock than flyspell.
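
A sketch of how the marking could work, assuming a usage table that maps
words to plain counts. The property name `my-counted' and the function
name are hypothetical:

```elisp
(defun my-count-new-words (beg end table)
  "Count the words between BEG and END that are not yet marked as counted.
TABLE is a hash table mapping word strings to usage counts."
  (save-excursion
    (goto-char beg)
    (while (re-search-forward "\\w+" end t)
      (let ((wbeg (match-beginning 0))
            (wend (match-end 0)))
        ;; Only count a word whose first character is still unmarked.
        (unless (get-text-property wbeg 'my-counted)
          (let ((word (buffer-substring-no-properties wbeg wend)))
            (puthash word (1+ (gethash word table 0)) table))
          ;; Mark the word so that a later scan skips it.
          (put-text-property wbeg wend 'my-counted t))))))
```

Scanning the same region twice leaves the counts unchanged. A word that
is edited after being marked is not handled here and would need extra
care.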
> Anyway the idea with the weakness is that I want to garbage collect
> the dictionary periodically, throwing away old, or rarely used words.
I don't think weakness gives you that. It seems difficult to use
weakness here to get even a vague approximation of what you want.
You can use a gc-hook to flush stuff every once in a while, but you
could just as well use an idle-timer for that.
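
A minimal sketch of the idle-timer variant, assuming the usage table maps
words to plain counts. The function names, the 60 second delay and the
threshold of 2 are arbitrary, hypothetical choices:

```elisp
(defun my-prune-rare-words (table min-count)
  "Remove from TABLE every word whose count is below MIN-COUNT."
  (maphash (lambda (word count)
             (when (< count min-count)
               ;; Removing the entry currently being visited is allowed.
               (remhash word table)))
           table))

(defun my-start-pruning (table)
  "Prune TABLE each time Emacs has been idle for 60 seconds.
Return the timer, which can be stopped with `cancel-timer'."
  (run-with-idle-timer 60 t #'my-prune-rare-words table 2))
```

This only drops rarely used words; throwing away old ones as well would
require storing a timestamp next to each count.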
> The serialization would be to enable saving across sessions. Most of
> the packages I know that do this depend on their objects having a read
> syntax, which doesn't work with hashes. I think the solution here is
> to convert the thing into a big alist to save it, and then reconstruct
> the hashes on loading.
Why not reconstruct the suffix hash upon loading? This way you have no sharing
to worry about and you can just dump the hash via maphash & pp.
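
Roughly like this, assuming the usage table maps words to plain counts
and is written out as an alist. All names are hypothetical, and the
on-disk format is just one possible choice:

```elisp
(defun my-save-usage (table file)
  "Write the word counts in TABLE to FILE as a pretty-printed alist."
  (let ((alist nil)
        (print-length nil)              ; make sure nothing is abbreviated
        (print-level nil))
    (maphash (lambda (word count) (push (cons word count) alist)) table)
    (with-temp-file file
      (pp alist (current-buffer)))))

(defun my-load-usage (file)
  "Read FILE written by `my-save-usage' and return a fresh usage table."
  (let ((table (make-hash-table :test 'equal))
        (alist (with-temp-buffer
                 (insert-file-contents file)
                 (read (current-buffer)))))
    (dolist (entry alist)
      (puthash (car entry) (cdr entry) table))
    table))

(defun my-build-suffix-table (usage)
  "Return a new table mapping each leading substring of each word in USAGE
to the list of words that start with it."
  (let ((suffixes (make-hash-table :test 'equal)))
    (maphash (lambda (word _count)
               (dotimes (i (length word))
                 (let ((key (substring word 0 (1+ i))))
                   (puthash key (cons word (gethash key suffixes))
                            suffixes))))
             usage)
    suffixes))
```

Only the usage table is ever written to disk; the suffix table is
derived from it after each load, so there is no shared structure to
preserve.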
> Anyway the idea for all of this was to do a nifty version of
> abbreviation expansion, something like dabbrev-expand, but instead of
> searching local buffers, it would grab word stats as it's going, and
> use these to offer appropriate suggestions. I was thinking of a user
> interface a little bit like the buffer/file switching of ido.el, of
> which I have become a committed user.
Sounds neat.
> By the way, building a decent UI around this will probably take 10 times
> as much code!
And even more time,
Stefan