help-gnu-emacs

Re: :name or 'name?


From: Pascal J. Bourguignon
Subject: Re: :name or 'name?
Date: Tue, 22 Jan 2013 00:47:01 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Oleksandr Gavenko <gavenkoa@gmail.com> writes:

> Is that usual use :name instead of 'name in elisp code (in places where you
> need symbol)?
>
> For example instead of defining error as:
>
>   (put 'mylib-error "Unknown mylib error")
>   (put 'mylib-error 'error-conditions '(mylib-error error))
>   ...
>   (condition-case 'var BODY (mylib-error ...))
>
> use:
>
>   (put :mylib-error "Unknown mylib error")
>   (put :mylib-error 'error-conditions '(:mylib-error error))
>   ...
>   (condition-case 'var BODY (:mylib-error ...))


In Lisp, there are two symbols that are defined as variables with a
surprising value: the value of the variable is the symbol that names
the variable itself!

As if we had:

    (defconst t   't)
    (defconst nil 'nil)

Therefore when you evaluate the variable t, you get the symbol t, and
when you evaluate the variable nil, you get the symbol nil:

    *** Welcome to IELM ***  Type (describe-mode) for help.
    ELISP> t
    t
    ELISP> nil
    nil
    ELISP> 
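
The same thing can be checked from code.  This is only a small sketch;
the unbound symbol name is made up for the example:

```elisp
;; t and nil evaluate to themselves, so quoting them changes nothing.
(eq t 't)                               ; → t
(eq nil 'nil)                           ; → t
(symbol-value 't)                       ; → t
(symbol-value 'nil)                     ; → nil

;; An ordinary symbol has no value until somebody gives it one.
;; (mylib-example-unbound is a made-up name, assumed to be unused.)
(boundp 'mylib-example-unbound)         ; → nil

;; t and nil are constants: trying to rebind them signals an error.
(condition-case nil
    (set 't 5)
  (setting-constant 'refused))          ; → refused
```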



On the other hand, symbols are often used in lisp not to denote
variables or functions, but by themselves, to denote a concept, or as a
key in a dictionary (hash-table, a-list, p-list, etc).

When we use them as such, we must quote them since it's not their value
that matters, but their identity:

     (assoc 'meat '((vegetable . beans) (meat . beef) (drink . water)))
     --> (meat . beef)
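
To illustrate, here is the same quoted symbol used as a key in the three
kinds of dictionaries mentioned above.  The variable names are invented
for the example:

```elisp
;; The same data as an a-list, a p-list and a hash-table.
(defvar mylib-menu-alist '((vegetable . beans) (meat . beef) (drink . water)))
(defvar mylib-menu-plist '(vegetable beans meat beef drink water))
(defvar mylib-menu-table (make-hash-table :test 'eq))
(puthash 'meat 'beef mylib-menu-table)

;; In each case the key is the symbol itself, hence the quote.
(cdr (assq 'meat mylib-menu-alist))     ; → beef
(plist-get mylib-menu-plist 'meat)      ; → beef
(gethash 'meat mylib-menu-table)        ; → beef
```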

So far so good.  But when we start using lisp amongst different
programmers and write bigger programs and libraries, there may be
collisions in the use of symbols.  (Notably because we use names for
symbols that are words in natural languages and there are a lot of
homonyms in natural languages).  Therefore we introduce namespaces, or
package systems, to distinguish different symbols with the same name.

This is something that emacs lisp is lacking but bear with me.  

So we may have a package named FRENCH where symbols will be interned to
represent French words, and those symbols will be associated with
linguistic information about those words.  For example, the word "car"
which means "because" could be represented by a symbol named "car",
i.e. car.  And we may have a package named VEHICULE where symbols are
interned that represent different kinds of vehicles, like planes,
trains and cars.  So we may have a symbol named "car", i.e. car, which
represents some kind of vehicle, with information about e.g. the number
of wheels, or the kind of engine, etc. that this vehicle has.  This
clearly poses a problem when we write a program that must deal with
French vehicles, or a program that deals with vehicles and has some
natural language interface in French.

So we intern the first symbol in the package FRENCH, and the second
symbol in the package VEHICULE, and we have two different symbols:
french:car and vehicule:car.  We also have a symbol lisp:car which is
the usual lisp car function.  Problem solved.


  (Notice that if you change the obarray in emacs lisp, you can intern
   another symbol named "car", different from the emacs lisp car symbol.
   It would be theoretically possible to implement a package system by
   juggling obarrays.  But to do so, one would have to patch the emacs
   lisp reader which is written in C (so that it can read french:car,
   vehicule:car or elisp:car as three different symbols named "car",
   interned in different obarrays).)
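
A rough sketch of that obarray trick, assuming Emacs 25 or later (where
obarray-make exists); the variable names are invented for the example:

```elisp
;; A private obarray standing in for the FRENCH package.
(defvar mylib-french-obarray (obarray-make))

;; Intern a symbol named "car" there; the global `car' is untouched.
(defvar mylib-french-car (intern "car" mylib-french-obarray))

(symbol-name mylib-french-car)                           ; → "car"
(eq mylib-french-car 'car)                               ; → nil
(eq mylib-french-car (intern "car" mylib-french-obarray)) ; → t

;; Only the global symbol carries the usual function definition.
(fboundp 'car)                                           ; → t
(fboundp mylib-french-car)                               ; → nil
```

The reader still only knows the global obarray, so the private symbol
can be reached through intern or a variable, but not by typing its name.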



Another feature implemented by lispers is optional arguments passed by
name rather than by position.  For example, we could write a function
that compares strings, taking start and end positions so that we may
compare substrings without having to extract them first:

    (defun STRING= (s1 s2 start1 end1 start2 end2)
      (loop
         while (and (< start1 end1) (< start2 end2)
                    (char= (aref s1 start1) (aref s2 start2)))
         do (incf start1) (incf start2)
         finally (return (and (= start1 end1) (= start2 end2)))))

    (STRING= "Hello world!" "Good bye cruel world" 6 11 15 20)
    --> t

But to compare whole strings, it would be tedious to always pass
those parameters.  Also, if we want to compare a whole string with a
substring, we would want to pass only the start and end arguments
corresponding to the substring.

Hence:

    (defun STRING= (s1 s2 &rest keys)
      (let ((start1 (getf keys 'start1 0))
            (start2 (getf keys 'start2 0))
            (end1   (getf keys 'end1 (length s1)))
            (end2   (getf keys 'end2 (length s2))))
        (loop
           while (and (< start1 end1) (< start2 end2) 
                      (char= (aref s1 start1) (aref s2 start2)))
           do (incf start1) (incf start2)
           finally (return (and (= start1 end1) (= start2 end2))))))

    (STRING= "Hello world!" "world" 'start1 6 'end1 11) --> t
    (STRING= "Hello world!" "Good bye cruel world"
             'start1 6 'end1 11 'start2 15 'end2 20) --> t
    (STRING= "world" "Good bye cruel world" 'start2 15) --> t
    (STRING= "hello" "hello") --> t

(Notice how getf takes a default argument, which is used here to give
the variable a default value when the symbol is not found in the p-list
keys.)
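
For reference, the same idea can be written in plain emacs lisp without
the cl macros.  This is only a sketch: the name mylib-substring= is made
up, and since plist-get has no default argument, (or ... default) is used
instead, which means an explicit nil is treated like a missing key:

```elisp
(defun mylib-substring= (s1 s2 &rest keys)
  "Return non-nil if the selected parts of strings S1 and S2 are equal.
KEYS is a property list using the quoted symbols start1, end1,
start2 and end2; missing entries default to the whole string."
  (let ((start1 (or (plist-get keys 'start1) 0))
        (start2 (or (plist-get keys 'start2) 0))
        (end1   (or (plist-get keys 'end1) (length s1)))
        (end2   (or (plist-get keys 'end2) (length s2))))
    ;; Advance both indices while the characters match.
    (while (and (< start1 end1) (< start2 end2)
                (= (aref s1 start1) (aref s2 start2)))
      (setq start1 (1+ start1)
            start2 (1+ start2)))
    ;; Equal only if both ranges were consumed completely.
    (and (= start1 end1) (= start2 end2))))

(mylib-substring= "Hello world!" "world" 'start1 6 'end1 11)   ; → t
(mylib-substring= "world" "Good bye cruel world" 'start2 15)   ; → t
(mylib-substring= "hello" "help")                              ; → nil
```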

A lot of functions can benefit from these "keyword" arguments, so we
may define the defun macro to analyse lambda lists containing the
symbol &key followed by a definition of the keywords and their default
values, and to wrap the body of the function in the let form that parses
the rest argument list into the different parameters.

This is something most lisps have, but elisp doesn't, and we must use
defun*, provided by (require 'cl), to get it:

    (require 'cl)
    (defun* STRING= (s1 s2 &key (start1 0) (start2 0) (end1 (length s1))
                                (end2 (length s2)))
      …)


And now we're back to our symbols from different packages.
If we define functions with keyword arguments in different packages,
those keyword symbols will be interned in those different packages.  And
when we call a function of one package from another package, we will
need to qualify the keyword symbols too.  For example, if STRING= is
defined in a package named string, then we would have to write:

   (STRING= s1 s2 'string:start1  2 'string:end1 4 'string:start2 33)

which soon becomes very tedious.

Actually here, we don't care that start1 is the symbol start1 from the
package string.  Just having a symbol named "start1" would be enough to
convey all the meaning. (There's no other information attached to that
symbol, as a keyword parameter).

Another problem is that we constantly need to quote those keyword
symbols.  This can be solved easily: just define their value to be
themselves:

    (defconst string:start1 'string:start1)

so when we call:

   (STRING= s1 s2 string:start1  2 string:end1 4 string:start2 33)

the symbol string:start1 is evaluated to its value, that is, to
string:start1 itself, which is passed as argument to the function.
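
In emacs lisp, which has no packages, the same trick can be sketched
with a library prefix standing in for the package name.  The mylib-
names are invented for the example:

```elisp
;; Each symbol is a constant variable whose value is the symbol itself.
(defconst mylib-start1 'mylib-start1)
(defconst mylib-end1   'mylib-end1)

;; So it can be written unquoted in an argument list:
(eq mylib-start1 'mylib-start1)                             ; → t
(plist-get (list mylib-start1 6 mylib-end1 11) 'mylib-end1) ; → 11
```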


A cleaner solution to both those problems is the introduction of the
KEYWORD package.  The KEYWORD package is special in two ways:

1- we don't need to write the package name to qualify symbols in it:
     
     :start1 is read as keyword:start1

2- when we intern a symbol in the keyword package, it is
   automatically defined as a constant variable whose value is itself:
   
     (defconst :start1 ':start1)

  therefore when we evaluate :start1, we are evaluating keyword:start1,
  and its value is keyword:start1 = :start1.

The name of a keyword such as :start1 is "start1", the colon is only
used to separate the package name from the symbol name when we qualify a
symbol with its package (the package name is optional for the keyword
package).

So we may write:

   (STRING= s1 s2 :start1  2 :end1 4 :start2 33)

and defun* searches the list of rest arguments for those keywords,
instead of for symbols interned in the current package.
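
As a runnable sketch of this in emacs lisp: on Emacs 24.3 and later the
same macro is available as cl-defun from cl-lib (the old cl library is
marked obsolete in recent Emacs versions).  The function name
mylib-string= is made up, to avoid clobbering anything real:

```elisp
(require 'cl-lib)

(cl-defun mylib-string= (s1 s2 &key (start1 0) (end1 (length s1))
                                    (start2 0) (end2 (length s2)))
  "Return non-nil if S1 from START1 to END1 equals S2 from START2 to END2."
  ;; Advance both indices while the characters match.
  (while (and (< start1 end1) (< start2 end2)
              (= (aref s1 start1) (aref s2 start2)))
    (cl-incf start1)
    (cl-incf start2))
  ;; Equal only if both ranges were consumed completely.
  (and (= start1 end1) (= start2 end2)))

(mylib-string= "Hello world!" "world" :start1 6 :end1 11)     ; → t
(mylib-string= "world" "Good bye cruel world" :start2 15)     ; → t
(mylib-string= "hello" "hello")                               ; → t
(mylib-string= "hello" "help")                                ; → nil
```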



So now we may answer your question about whether you should use a
keyword or a symbol.

The point is that keywords are a common resource: all keyword
symbols are shared by all the libraries.  If some lisp code attaches
some information to a keyword, then it may collide with
information attached to the same keyword by other lisp code.  (For one
thing, the symbol value is fixed, since keywords are constant variables,
but one could bind a macro or a function or put properties on a
keyword).  Therefore you should avoid doing that to keywords.
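
A small sketch of such a collision, with two hypothetical libraries
(liba and libb, names invented here) that both happen to pick the
property handler on the keyword :json:

```elisp
;; Library A registers its handler on the shared keyword ...
(put :json 'handler 'liba-parse)
;; ... and library B, loaded later, silently replaces it.
(put :json 'handler 'libb-encode)
(get :json 'handler)                    ; → libb-encode

;; With library-specific symbols the two entries stay apart.
(put 'liba-json 'handler 'liba-parse)
(put 'libb-json 'handler 'libb-encode)
(get 'liba-json 'handler)               ; → liba-parse
(get 'libb-json 'handler)               ; → libb-encode
```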

On the other hand, each library should define its own package, and
therefore have its own private symbols to play with, so it may bind
its symbols and attach information to them (through their property list
or otherwise) as it wishes.





Of course, emacs lisp lacks packages and must use a kludge to provide
keywords: in emacs lisp, keywords are actually symbols whose name starts
with a colon; in emacs lisp, (symbol-name ':foo) --> ":foo"
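
This can be checked directly; the sketch below only uses standard
functions (symbol-name, keywordp, intern):

```elisp
;; The colon is part of the symbol name in emacs lisp.
(symbol-name :foo)                      ; → ":foo"
(eq (intern ":foo") :foo)               ; → t

;; Keywords evaluate to themselves, so the quote is optional.
(eq :foo ':foo)                         ; → t
(symbol-value :foo)                     ; → :foo

;; keywordp tells the two kinds of symbols apart.
(keywordp :foo)                         ; → t
(keywordp 'foo)                         ; → nil
```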

And since there are no packages, all the symbols are common, keyword or
not, and therefore when you bind a symbol or set its property list, you
take the risk of collision with some other code loaded in emacs.  So
theoretically, it doesn't make a difference whether you write:

   (put 'mylib-error 'error-conditions '(mylib-error error))
or:
   (put :mylib-error 'error-conditions '(:mylib-error error))

both are equally bad in emacs lisp.

But if we write emacs lisp as if we had a real lisp, we could say that
it's better to attach library specific information to library specific
symbols instead of keywords, that are a shared resource.  So:

   (put 'mylib-error 'error-conditions '(mylib-error error))

is better.
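
For completeness, a sketch of the whole round trip with a library
specific symbol.  Note that the message goes under the error-message
property and that the variable in condition-case is not quoted; the
helper mylib-try is invented for the example:

```elisp
;; Declare the error: its condition list and its message.
;; (On Emacs 24.4 and later, `define-error' does both in one call.)
(put 'mylib-error 'error-conditions '(mylib-error error))
(put 'mylib-error 'error-message "Unknown mylib error")

(defun mylib-try (thunk)
  "Call THUNK; if it signals `mylib-error', return a list describing it."
  (condition-case err
      (funcall thunk)
    (mylib-error (list 'caught (car err) (cdr err)))))

(mylib-try (lambda () (signal 'mylib-error '("bad input"))))
;; → (caught mylib-error ("bad input"))
(mylib-try (lambda () 42))
;; → 42
```

Since error is in the condition list, a plain (error ...) handler in
condition-case would catch mylib-error as well.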




-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.

