Re: Character group folding in searches


From: Eli Zaretskii
Subject: Re: Character group folding in searches
Date: Mon, 09 Feb 2015 19:39:24 +0200

> From: Stefan Monnier <address@hidden>
> Cc: address@hidden, address@hidden
> Date: Mon, 09 Feb 2015 11:33:29 -0500
> 
> > I guess I'm still struggling to understand your idea of using DFAs.
> > E.g., you talk about each node of a DFA being a char-table, but AFAIK
> > a DFA node is just a state of the automaton, so how can that be
> 
> A DFA node is a state with labeled arcs going out to other states.
> 
> It's usually implemented as a "table" (array, hash-table, char-table, ...)
> that maps the labels to the next state.
> 
> Does it make more sense, now?

Yes.  (Your "node" seems to be both a node and some part of the
transition function, which is not the usual terminology, AFAIR.)

> >> But how do you use current char-tables to handle multi-char input
> >> entities (i.e. to recognize things like "=>")?
> > I don't understand the question, sorry.  The simple answer is that a
> > char-table entry can be any Lisp object, including a string, but you
> > already know that.
> 
> That doesn't tell me how you'd use it.  Would the ?= char be mapped to
> a list of strings (one of them being "=>") and then you'd check if the
> next (few) chars match one of those strings?
> What I suggest is to map the ?= char to another char-table which then
> maps the ?> char to (say) ?⇒.

I thought of doing it the other way around, starting with ?⇒.
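
For concreteness, the nested-table direction suggested in the quoted text (?= maps to a second table, which maps ?> to ?⇒) could be sketched roughly as follows. This is only an illustration in Python, with plain dicts standing in for char-tables; the names COMPOSE and lookup_sequence are hypothetical and not part of Emacs.

```python
# Hypothetical sketch: plain dicts stand in for Emacs char-tables.
# Each level maps a character either to another table (keep reading)
# or to the single character the whole sequence stands for.
COMPOSE = {
    "=": {">": "⇒"},
    "-": {">": "→"},
}


def lookup_sequence(table, text, start=0):
    """Walk the nested tables starting at text[start].

    Returns (result_char, consumed_length) if a complete sequence is
    recognized, otherwise None.
    """
    node = table
    i = start
    while i < len(text) and text[i] in node:
        node = node[text[i]]
        i += 1
        if isinstance(node, str):
            # Reached a leaf: the sequence text[start:i] is complete.
            return node, i - start
    return None


print(lookup_sequence(COMPOSE, "a=>b", 1))
print(lookup_sequence(COMPOSE, "a=b", 1))
```

The sketch stops at the first leaf it reaches, so it assumes no recognized sequence is a prefix of another; handling that case would need an explicit "accepting" marker on intermediate tables.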

> > If you mean how to compare "=>" with "⇒", then the latter will be
> > "folded" to the former using a char-table,
> 
> [ I always get confused by this terminology since "folding" to me
>   implies making things smaller, so I'd call it "unfolding" in that
>   direction.  ]
> 
> > and then the results will be compared, either as strings or character
> > by character.  Is this what you were asking?
> 
> But how would this handle an equivalence class that includes both "=>"
> and "->"?

Why should we?  "->" could be equivalent to ?→, but I see no reason to
make it equivalent to "=>".
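
The fold-then-compare approach discussed above (a table takes ?⇒ to "=>", and the results are then compared) might look like this minimal sketch. It is in Python for illustration only; FOLD, fold and folded_equal are hypothetical names, and a dict stands in for a char-table whose entries are strings.

```python
# Hypothetical sketch: a dict stands in for a char-table whose
# entries are strings (the "folded" form of each character).
FOLD = {
    "⇒": "=>",
    "→": "->",
}


def fold(text):
    """Replace every character that has a table entry by its expansion."""
    return "".join(FOLD.get(ch, ch) for ch in text)


def folded_equal(a, b):
    """Compare two strings after folding both."""
    return fold(a) == fold(b)


print(folded_equal("a ⇒ b", "a => b"))
print(folded_equal("=>", "->"))
```

Under this scheme "=>" and "->" stay distinct, because each arrow character folds to its own sequence and nothing maps one sequence onto the other.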

> >> > Who and how will create such a DFA?
> >> They'd be mechanically constructed (by hand-written code), for example
> >> driven by the existing Unicode tables.
> > What would be the input language for specifying such a DFA?  I mean,
> > how would we specify which sequence of states are acceptable (yielding
> > a match for the search) and which aren't?
> 
> Depends.  For the Unicode-defined equivalence classes, we'd use the
> Unicode tables directly and build the DFA nodes from it without going
> through some intermediate "specification".
> 
> For other cases, we could specify the DFA with a list of strings.
> Or with regular expressions.

And the DFA will be in C or in Lisp?
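
Whichever language it ends up in, building the nodes from a list of strings could go roughly as sketched below. Python is used purely for illustration; the node layout (an "arcs" table plus an "accept" label) and the names build_dfa and longest_match are assumptions made for this example, not an existing Emacs interface.

```python
# Hypothetical sketch: each node is a table of labeled arcs plus an
# optional "accept" label naming the equivalence class that matched.
def new_node():
    return {"arcs": {}, "accept": None}


def build_dfa(classes):
    """Build the automaton from {class_label: [member strings]}."""
    root = new_node()
    for label, members in classes.items():
        for member in members:
            node = root
            for ch in member:
                # Follow the arc for ch, creating the next state if needed.
                node = node["arcs"].setdefault(ch, new_node())
            node["accept"] = label
    return root


def longest_match(dfa, text, start=0):
    """Return (class_label, length) of the longest match at start, or None."""
    node, best = dfa, None
    for i in range(start, len(text)):
        node = node["arcs"].get(text[i])
        if node is None:
            break
        if node["accept"] is not None:
            best = (node["accept"], i + 1 - start)
    return best


dfa = build_dfa({"⇒": ["⇒", "=>"], "→": ["→", "->"]})
print(longest_match(dfa, "x=>y", 1))
```

Two pieces of text would then be treated as equivalent at a given position when they yield the same class label, which is what lets a single class contain both one-character and multi-character members.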

Anyway, I hope to see something like that land on master,
preferably sooner rather than later.



