[-*- mode: indented-text -*-]

Notes and ideas for implementing lexical scoping in Elisp
=========================================================


Changes required
----------------

 + define 16 new byte compiler opcodes, Blocalref + 0-7 and Blocalset +
   0-7, for manipulating local variables (stored in a separate alloca'd
   vector in Fbyte_code, or perhaps at the beginning of the stack vector
   - it must be GCPRO'd).

   - It looks like opcodes 000-007 and 060-067 (48-55) can be used for
     this (they are presently unused in 19.x and 20.x - worth checking
     what XEmacs and Emacs 18.x use though).

     -> Emacs 18.x used fewer opcodes than 19.x (all of which are either
        still in use, or reserved if obsolete).

     -> XEmacs 20.4 uses 0266-0267 (182-183) which are free in Emacs
        20.3, and has reused some of the obsolete opcodes, but otherwise
	is the same.

	=> full list of unused opcodes (in octal):

	     Emacs 20.3	       XEmacs 20.4    Proposed use
	     ----------	       -----------    ------------
	     000-007	       000-007	      Blocalref0-7
	     060-067	       060-067	      Blocalset0-7
	     0200	       0200	      Bextend
	     0251	       0251	      Bfunction
	     0262	       0262	      Bmakeindirect
	     0263	       0263
	     0264	       0264
	     0265	       0265
	     0266-0267	       (used)

	Blocalref and Blocalset are the lexical counterparts to Bvarref
	and Bvarset.  Blocalalloc and Blocalfree are used to allocate
	and free binding slots for lexical variables that can escape
	into heap closures.  Bfunction is used to construct closures,
	which can be heap or stack allocated.

	Bextend is reserved to allow future opcode expansion, by serving
	as a prefix for extended (multibyte) opcodes.

   - How to reference lexical variables in nested bindings though?  Is
     it acceptable to "flatten" all bindings in a function?  If so, what
     happens if multiple closures are generated that refer to an inner
     binding that is destroyed and recreated in between (ie. the inner
     binding is different for each closure, but the outer bindings are
     shared)?

     -> If no closures can be created (or at least none that can escape
        the scope of the innermost binding they reference), it is okay
        to flatten all bindings for a function (as C does).  Nested
        functions still require some sort of "non-local" binding access
        (eg. using a display vector or stack).

     => The only bindings which cannot be migrated to the function level
        are those captured by a closure that escapes that binding's
        lexical scope; ie. bindings which need to be evacuated, and
        should therefore be in separate block(s) to be safe for space
        complexity (ie. to minimize garbage retention).

	Whether or not a given binding is allocated in a separate block
	is independent of the mechanism for accessing non-local
	bindings; ie. a closure should be able to reference both the
	main function-level block of an enclosing function, as well as
	any additional blocks it may allocate.  What is necessary is for
	the compiler to fix the "depth" of each binding, so there is no
	ambiguity about how to reference each binding.  (Obviously the
	depth of a given binding can be different at different points in
	the code, or for different nested functions.)

     Possible solutions:

     - include a display index in (at least some) localref/set
       instructions; each level of binding (within a function, as well
       as in nested functions) in principle allocates a new lexical
       environment block, and pushes a pointer to it on the display
       stack; on exit, the display stack is popped.  Indexing is from
       the top of the stack.  Generating a closure can evacuate some or
       all of the binding blocks into the heap.

       -> would only need to explicitly modify display stack when
          entering and exiting blocks whose bindings can escape.
          Otherwise, nested blocks can be collapsed and referenced from
          a single display entry.  In fact, inner bindings that are not
          captured by closures can even be migrated out through
          enclosing bindings that are captured into the function-level
          environment block.

       -> obviously we want as many of the localref/set instructions as
          possible to not include an explicit display index, since we
          expect to be able to place most bindings in the function-level
          block (which should then be kept as index 0 and not be
          "buried" by inner blocks dynamically created for captured
          bindings).  It is probably sufficient to use the first 7
          opcodes of each type to index the function-level block (giving
          access to the first 256 bindings), and extend the 8th (which
          already takes two extra bytes to form a 16-bit index) to
          include a display index as well, making it up to three extra
          bytes.

          If the local function-level block is always index 0, then it
          is not quite clear where to locate inherited binding blocks
          (ie. when we have nested functions).  Presumably it is easy
          enough to assign each enclosing block (both function-level and
          closure-specific blocks) successive indices followed by inner
          closure-specific blocks.

          -> The problem is that we want to represent the environment in
             a closure using a single env_ptr; therefore we need to
             arrange that the right blocks are chained together in the
             order expected, while (ideally) still keeping the main
             function block at index 0.

	     The solution is probably to insert dynamically created
	     envblocks in the current env_ptr chain after the head
	     block.  In the case that a closure already has several
	     inherited blocks in it's env chain after the main function
	     block, there is a choice between pushing inherited blocks
	     further down the chain (changing their index) or adding the
	     new local blocks after the inherited ones (whose number is
	     at least fixed).  The latter choice would allow for simpler
	     opcodes to add and remove blocks (since they would always
	     insert or remove immediately after the head block), while
	     hopefully not overly complicating the compiler's
	     book-keeping which must track the current depth of each
	     block.

     Proposed solution:

      - All bytecode objects have an environment slot, which must be an
	untagged pointer (struct Lisp_Envblock *) because most envblocks
	will be stack allocated, and hence the slot must also be hidden
	from direct lisp access (eg. aref), and handled specially by GC.

      - When a bytecode object is invoked, env_ptr is set to the value
        of the environment slot before pushing the new envblock for the
        parameters on the front.  The parameter envblock includes space
        for the main function-level bindings, either before or after the
        parameter values, whichever works out better.  [After would seem
        appropriate.]

        -> Fbyte_code should keep its existing interface if possible,
           for backwards compatibility (some existing bytecode contains
           direct invocations of `byte-code', in lieu of constructing a
           proper bytecode object).  So an internal interpret_byte_code
           function is needed, which takes the pre-constructed parameter
           envblock.

	=> Since extant bytecode doesn't have lexical bindings, and new
	   bytecode will be passed parameters as well as the closure
	   environment using env_ptr, there is no need to change
	   Fbyte_code's arguments.

      - If the invoked bytecode needs additional envblocks for bindings
        that may need to be evacuated, then it preallocates the blocks
        on the stack but doesn't link them into the env chain yet.
        Instead, it keeps private references to them.

      - When a binding not allocated to the main block comes into scope,
        the relevant extra block is pushed onto the env chain.  When the
        binding goes out of scope, the env chain is popped.

        -> If using the indirect value method (see next point), then
           instead of pushing new envblocks on the chain, the relevant
           slot pointer is (re)set to a private local slot provided for
           the purpose (the local slot is "allocated" by incrementing an
           index).  The index is decremented when the binding goes out
           of scope.  Either the number of required local slots is
           specified in the bytecode object (and alloca'd) or else
           included in the required environment size (with slots then
           being allocated from the end of the envblock; parameters and
           bindings taking a fixed amount from the beginning).  If the
           latter approach is taken, slots should be zeroed when freed
           to avoid retaining garbage.

      - To create a closure, the Bfunction opcode is invoked, which
        interprets a variable length "script" following the opcode.  The
        script specifies how many env blocks need to be evacuated.
        (This leaves an interesting problem of how to stop tracing the
        closure's env chain, when it reaches a block that is not
        evacuated.  If we have to evacuate the full chain, then we've
        gained nothing from using extra blocks.)

        -> One alternative is to use a local display vector which
           references the additional local blocks.  When a closure is
           created, the relevant blocks are linked together and
           evacuated if necessary.  I'm not entirely sure, but it may be
           possible to allocate bindings to blocks in such a way that it
           is safe to chain blocks to previously evacuated blocks
           referenced by other closures.  In this scheme, the bytecode
           must then setup the display vector on invocation to reference
           the appropriate blocks from the env chain.

        -> A second, more radical, alternative is to distinguish those
           bindings which can be evacuated, and access them indirectly
           through a pointer stored in the relevant envblock.  Thus each
           function has just one local envblock (envblocks are chained
           in the normal way).  Nothing needs to happen when entering or
           exiting the scope of bindings that can escape; instead
           creating a closure evacuates the relevant bindings (if not
           already evacuated), updating the owning envblock pointers.

           -> Since indirect values can either be stack or heap
              allocated, we can't really use a mechanism like setting
              markbit on those slots that are value pointers (and in any
              case that would slow down access for all variables).
              Instead, it seems reasonable to set aside one bit from
              Blocalref/set7 (eg. one of the 8 nesting depth bits) to
              indicate that an indirection is required.

	      At the same time, it would probably be worth while to use
	      a couple of bits from Blocalref/set6 to specify the
	      envblock nesting, with the remaining bits specifying the
	      envblock slot index.  This provides reasonably efficient
	      access to bindings in the first three nested blocks, and
	      still leaves enough bits for the slot index that
	      Blocalref/set7 ops will almost never be needed to access
	      direct bindings.

	      => In fact, it would seem worth it to set aside a bit of
		 the Blocalref/set6 instruction to control indirection.

           If the closure can escape, then a new envblock is (heap)
           allocated for it, containing pointers to all the (presumably
           evacuated) bindings it needs, and the bare bytecode object
           being closed over is cloned.  If the closure cannot escape,
           then the the bare bytecode object embedded in the enclosing
           function is (re)used directly as the closure value, with the
           current env_ptr stored in it's environment slot.  (This is
           okay, because only one distinct closure of the bytecode
           object can exist at a time; thus the bytecode object embedded
           in the parent code can be serially reused.)

	   For garbage collection, it is necessary to distinguish those
	   envblock entries which are actually pointers to the real
	   data; either they can follow any immediate values (with a
	   separate size field to indicate how many slots are
	   immediate), or perhaps the markbit can be set on those slots
	   which are pointers.

           -> Since parameter bindings might need to be indirected,
              which the caller will not know when it constructs the main
              envblock, the compiler will need to allocate "proxy" slots
              for such bindings (emitting code to initialize them in the
              function prolog).  Thus it is not difficult for the
              compiler to locate indirect slots after direct slots, if
              that is useful.

           Evacuated values could be stored in special blocks (similar
           to cons blocks), with GC performed by setting markbits on the
           live values (even immediate ones) as they are scanned.  Free
           cells are linked into a block free list.

 + will need at least one new opcode (Bfunction) for creating closures
   in byte code; a reasonable approach would be for the compiler to emit
   a small "script" following Bfunction which indicates which
   environment slots are needed by the closure.  It is not quite clear
   how to optimize or choose the environment representation to minimize
   the number of bindings that will be unnecessarily captured.  It might
   be reasonable for the compiler to place bindings that can be captured
   into separate environment blocks.

   -> Yet more opcodes will be needed to dynamically push and pop
      envblocks on the local environment chain, to hold those bindings
      which can be captured by closures.

      => The extra opcodes are still needed when using the indirected
	 slot representation, since we need to be able to setup an
	 arbitrary number of indirection pointers, and if necessary
	 reset them to local slots when reestablishing another instance
	 of a captured binding that can escape.

 + Handle the possibility that (interactive ...) form might follow
   declarations at the beginning of a lambda body.

 + Would like to define an additional 8 "fastcall" opcodes, to speed
   up function invocation.  Could perhaps use 184-191.  Will want to
   reserve at least one opcode as an "extended opcode" prefix, to
   provide room for future expansion.

   => Note that rewriting bytecode ops would not be allowed with
      purified strings, ie. when functions are preloaded, because
      purified strings must be treated as read-only.

 + extend definition of byte-compiled function objects to include size
   of local variable vector (ie. the primary function-level binding
   block).  Might as well include an environment entry in all bytecode
   objects, rather than having to use separate closure objects.

   - extend Lisp reader to accept extra value(s) in bytecode vector
     (implicitly inserting a value of 0/nil for old bytecode
     definitions, which obviously will not have any Blocalset/ref
     opcodes).

 + to support lexical scoping of function arguments, remove specbind'ing
   from funcall_lambda, at least when calling new-style bytecode
   functions.  Instead, Fbyte_code should receive its arguments as an
   array (as Ffuncall does) - it should copy them into the local
   variable array, relying on the byte compiler to emit bytecode to bind
   any parameters which are declared special.

   - Or perhaps new-style bytecode should use nil in the argument list for
     parameters which are lexically scoped - that way funcall_lambda can
     still bind special parameters (which will be all parameters for
     old-style bytecode, since none of the parameter names will be nil).
     Fbyte_code still receives the full arg list as a vector.

   - To support lexical scoping for interpreted code, funcall_lambda
     would need to check which parameters are declared special in the
     case of calling lambda expressions.  Special parameters would be
     specbound as now, while normal parameters would be added to a newly
     created environment dictionary (added to the head of the
     environment list).

     -> correction: funcall_lambda would create a new environment
        consisting initially of just the parameter bindings.  I believe
        it is correct to say that Ffuncall always hides the existing
        environment stack.  When calling a closure, Ffuncall will
        obviously reinstate the saved environment before binding
        parameters.

        => Partly right; see next main point for more details.  However,
           it is not true that funcall_lambda should always hide the
           current env; an exception is when a lambda expression occurs
           in the function position of a form being evalled.  In this
           case, the body of the lambda expression should have access to
           lexically apparent bindings.

 + lexical scoping for interpreted code is achieved by creating an
   environment list, which works rather like dictstack in Postscript
   (except we start a new stack for each top-level lexical environment -
   ie. each top-level form).

   - Special forms like Flet push a new dictionary on the current
     dictstack populated with local bindings (while specbinding those
     variables which are declared special); variable lookups (eg. Feval
     or Fset) are done by searching the current dictstack for a binding
     before using the global value.

     -> To cater for the possibility that an inner scope declares a
        variable special which has already been given a lexically
        visible binding, it may be necessary to add an entry to the
        environment which indicates that variable is special.  Either
        that, or every variable access must explicitly check whether the
        variable is special.

        => It is necessary to have an entry for bindings that are
           (locally) declared special (using "declare", as in CL) - the
           value for the binding would contain a distinguished value
           Qspecbound (analogous to Qunbound) which instructs the
           interpreter to reference the global value.

        -> Note that Flet must check for declare forms at the beginning
           of the body (first expanding macros as necessary!) before
           actually binding anything to determine whether any of the
           variables are declared special.

           => Other builtins that must also do this are: funcall_lambda
              (and anything else which does parameter bindings).

     -> I believe it will be necessary to specially tag defvar'd symbols
        somehow, since there doesn't currently seem to be any foolproof
        way to tell which symbols are special.  (Byte compiler seems to
        do its check by remembering which variables have been defined in
        the current file, or require'd files.)

        => A bit in the symbol record can hopefully be used for this
           (and specbind/unbind extended to allow "binding" the
           specialness of symbols).

           => A symbol bit is needed, but note that CL prohibits a
              symbol "proclaimed" (ie. globally declared) special
              (eg. by defvar) from _ever_ being lexically bound, so
              there is no need to specbind a symbol's (global)
              specialness.  Instead, the interpreter first checks the
              specialness of symbols before searching the environment
              for lexical bindings.  (Local specialness, on the other
              hand, is overriden by inner lexical bindings; every new
              binding must be explicitly declared special if those
              semantics are required.  This is handled by a Qspecbound
              entry in the environment.)

     -> Dictionaries created by Flet et al will normally be alloca'd,
        but with the possibility of being evicted/evacuated into the
        heap when necessary (ie. when Ffunction generates closures, or
        at least when a closure is about to "escape" its creating
        frame).

   - Multiple dictstacks can be achieved (I think) by specbinding a
     global envp which points to the current environment (ie. the top of
     the current dictstack).  Feval, Ffuncall, Fapply etc will then
     specbind envp, eg. to nil or to the saved env if invoking a
     closure.

     -> Note that the current Feval will be renamed to internal_eval (or
        something) - the various special forms which currently call
        Feval will therefore usually call internal_eval instead, as will
        the new Feval (after first specbinding envp to nil).

        => On second thought, I think Feval can stay as is, with other
           special forms calling it directly.  Lisp code that calls it
           explicitly will get an empty environment automatically,
           because Feval will be invoked via Ffuncall, which always
           hides the current environment.

     => Rather than coerce specbind into binding envp, it is more
        appropriate to store the current value in the backtrace struct
        for a new function (since that is when a new stack is started).
        Suitable access to hidden environments can be arranged for the
        benefit of the Lisp-level debugger (at the moment, there is no
        access for buried specbindings anyway - that would be good to
        have available to the debugger).

 + to implement lexical closures efficiently we want to avoid requiring
   Ffunction to always evacuate alloca'd environment blocks into the
   heap if possible, since we expect most uses of closures are immediate
   - ie. the closure is immediately passed to funcall or apply in the
   same block where it is created - effectively restricting the lexical
   bindings to dynamic extent.  (Of course, many closures will not even
   reference their environment at all.)

   -> Note: in the absence of proper closure analysis, it will be
      necessary to evacuate the entire current dictstack, since it is
      not possible to tell which bindings might actually be referenced
      in closed over lamda expressions (which could be arbitrarily
      nested, just to make things interesting).  Of course, Ffunction
      doesn't evacuate if its arg is a symbol.  (CL specifies that
      Ffunction returns the function binding if passed a symbol, but
      elisp just returns the symbol, to delay autoloading I think.)

   Possible implementation:

   - Flet will always add new alloca'd environment block to env chain

   - Ffunction will use pointer to alloca'd block in closure

     -> Fset (and others, eg. Feval, Ffset) will have to detect when a
	closure reference is being stored in a variable/function binding
	(or array slot etc), and cause the environment block (chain) to
	then be evacuated if still pointing to stack memory.  Also, Flet
	will need to check whether a closure is being returned as its
	value.

     -> Checks would also be required by functions that allocate,
        eg. Fcons, since there are lots of ways the closure could
        "escape" the enclosing Flet.

   => Because of the difficulty and run-time expense of detecting when
      closure's escape, it is probably best for Ffunction to always
      evacuate the environment - after all, this only happens in
      interpreted code (we expect the compiler to be more intelligent
      about identifying which variables need to be closed over, and
      whether the resulting closure can escape).

      -> What happens when multiple closures are created over the same
         env?  In interpreted code, presumably the first call to
         Ffunction evacuates the env chain, and subsequent calls just
         use the reference to the now heap-based env.

      -> What happens with nested calls to Ffunction?  Perhaps this just
         happens to work correctly without doing anything special: the
         outer call to Ffunction captures the entire env chain (when
         interpreted, at least); when later funcalled, that env will
         again be "active" (possible with additional bindings on the
         front) when an inner call to Ffunction happens.  It is only the
         compiler that has to perform closure analysis to "shake out"
         the environment if possible.

 + When calling bytecode, we would ideally like to merge the argument
   bindings with the function's basic lexical bindings (ie. most
   bindings in the function, since we expect to flatten most of them to
   the outer binding scope).  This maximizes the use of the quick access
   versions of Blocalref/set opcodes (see discussion about opcodes).

   To do this either requires copying the arg vector into space
   allocated in the main function binding block, or arranging for the
   arg vector itself to have the extra space needed for the function's
   basic local bindings.

   -> A hybrid seems reasonable for an initial implementation: the arg
      vector is copied into a fresh envblock, but the envblock includes
      space for the requested number of slots.


General Observations
--------------------

 + when modifying bytecomp.el, remember that cl*.el contains code to
   help compile CL special forms.

 + Some .el files use plain lambda expressions as functional values,
   eg. for functions like mapcar.  The current bytecode compiler appears
   to treat them as if they had been wrapped in (function ...); at
   least, it compiles the lambda expression into a bytecode object.

   This raises some interesting questions when closures are introduced:
   should plain lambda expressions in arg lists "evaluate" to closures?
   If so (which would help for backward compatibility with existing
   code), shouldn't the interpreter do the same?  Currently, I think
   such forms will (now) be evaluated in a NULL lexical environment.

   => It turns out that `lambda' names a macro that expands into a call
      to `function', so all is well after all.

   - The compiler also treats '(lambda ...) like #'(lambda ...).  That
     is problematic (and possibly going a bit too far) - unfortunately,
     there is existing code that relies on it, which will break under
     strict interpretation of lexical scoping rules.

     -> Should we declare that such code is broken?  The compiler could
        easily warn when it encounters '(lambda ...), but the
        interpreter can't really do that, so maybe we do have to make
        this a special case as well.

	-> Packages known to affected (in 19.34): allout, ange-ftp,
	   byte-opt, most of cal-*, calendar, complete, disass, gulp,
	   help, hideif, icomplete, lisp-mnt, 

	-> Packages known to use '(lambda ...), though apparently not
	   affected: avoid, byte-run (commented out), bytecomp,
	   completion, diary-lib, ediff*, emerge, facemenu, faces,
	   files, forms, gnus-gl, gnus-kill, gnus-msg, gnus-score,
	   gnus-srvr, gnus-uu, gnus, holidays, imenu, info, lmenu (?),
	   lunar, make-mode, 

 + Packages known to rely on dynamic binding, that will need suitable
   declarations: find-file, forms (possibly), gnus-gl (possibly), rmail
   (rmail-set-message-counters), 

 + gomoku.el has two lambda expressions with a duplicated parameter
   name!  So does mailalias, 

 + What to do about buffer-local variables?  In particular, variables
   that are not permanently buffer-local, but declared so dynamically
   using make-local-variable or make-variable-buffer-local.

   -> Presumably such declarations must be treated as proclamations, but
      it will probably be necessary for actual `proclaim' statements to
      be added so the compiler can know which variables have special
      bindings.

      -> Packages known to reference non-permanent buffer-local vars: 

   -> If a "declaim" (?) function is provided, to allow un-proclaiming
      the specialness of symbols, it must refuse to do so for permanent
      (builtin) buffer-local variables, nil, t etc.

   => I believe most if not all packages that make buffer-local
      variables already defvar the variables; this is certainly the
      correct thing for them to do.  So it is not strictly necessary for
      make-local-variable or make-variable-buffer-local to implicitly
      proclaim the symbol special, but to avoid needless code breakage,
      it is reasonable for the interpreter to do that.  The compiler
      will issue a warning if the symbol wasn't already defvar'd.

   -> A related problem is the handling of platform-specific DEFVAR_*
      declarations (eg. the Vw32_* variables) - these need to be
      declared or proclaimed on all platforms in order that byte-code
      compiled on another platform (eg. by the FSF) uses dynamic
      binding.  (Hopefully a compiler warning about unused local
      variables will be issued in most cases when this hasn't been
      done.)

      => As well as issuing a warning about unreferenced local variables
	 that aren't known to be special, it should assume the variable
	 is special and specbind it anyway.  The warning would suggest
	 it should be proclaimed.

	 To handle platform specific DEFVARs, makedoc should generate a
	 list of all known DEFVARs for the byte compiler.

      => A better solution might be to require that all files be
	 (re)compiled locally on installation.  It should be possible to
	 compile everything from scratch anyway (which probably will
	 necessitate adding proper require's to all sorts of packages).

 + Flet and funcall_lambda must not only check for special declarations
   when creating bindings, but must add Qspecbound entries for symbols
   that are not being bound, but are declared locally special.

   => Done.  [03-Dec-98]  No it isn't (maybe this code was lost).

 + Noticed some minor speed improvements that could be made to the
   interpreter:

   - various functions call Fcar then later Fcdr on the same value ->
     Fcar signals an error if the value is not a CONS cell, so Fcdr
     could be inlined to XCONS(val)->cdr to avoid function call.

     => Not quite right; Fcdr also accepts nil without complaint, which
        is relevant in a few places (eg. evalling args for subrs).  In
        any case, any efficiency gain is likely to be miniscule.

   - functions like Fcar don't directly signal an error; they call
     wrong_type_argument which has some mocklisp compatibility stuff
     that can coerce numbers to strings and vv, or signals an error.

     [nothing wrong with that though - no efficiency lost in Fcar et al.]

   - not sure why Flet et al include "QUIT;" in the loop that binds
     variables - surely this is unnecessary (since Feval does a QUIT
     anyway, and if not evalling, the time spent is minimal, though
     theoretically unbounded)?

 + Possible (minor) speed improvements to Fbyte_code:

   - don't zero stack slots at start; instead, update the gcpro length
     field based on stackp before doing Ffuncall or anything else which
     can cause GC - unbind_to, Fcondition_case, internal_catch,
     Findent_to (calls Finsert_char which runs before/after change fns),
     Fread_char (may execute macros, timer code, etc), Fdelete_region,
     and maybe others I've missed.

     -> This also helps make bytecode "safe for space complexity",
        ie. stops old references in unused stack slots from retaining
        garbage.  For the same reason, it may be appropriate for the
        compiler to generate code to nil out local bindings that are no
        longer needed, if the environment block might be long-lived.

   - only check for relocation of bytestr after executing opcodes that
     we know could have caused GC.

   - can we skip the SYMBOLP check on Bvarref?  Surely the compiler
     ensures that vectorp indices refer to symbols?  (If really
     paranoid, presumably we could implement a bytecode checker that is
     run when bytecode object is created.)

     -> We could skip it, but of course it won't matter much when most
        variables are lexically scoped.

 + A possible future optimization is for callers of Ffuncall to make the
   argument vector large enough to hold maxargs if subrp, or
   numargs+numlocals if compiledp.  (In the latter case, the argument
   vector would need to in fact be a properly formatted Lisp_Envblock.
   It could well work out better if Lisp_Envblock contained a pointer to
   a plain C vector, so bytecode wouldn't need to leave space on the
   value stack for the envblock header.)

   The benefit from doing this would be to avoid making a copy of the
   arg vector.

   -> The problem is that the actual function called at runtime might
      not match the number of arguments etc that the compiler thought it
      would.

 + not strictly related to lexical scoping, but when trying to switch
   Lisp_Object representation to have bottom 2 bits (say) as tag bits
   (mark bits stored to the side), will need to detect places in C code
   where Lisp_Object's are treated as ints, and `make_number' and
   XINT/XFASTINT aren't used for handling numbers.

   => Undef NO_UNION_TYPE in lisp.h, so that Lisp_Object is a struct
      (actually union).  This should go a long way to tracking down
      sloppiness in handling lisp values in C.

   => Make XFASTINT (and perhaps XINT too) expand to (0, value) to
      prevent them being used on the LHS.

 + consider supporting the dynamic-extent declaration from CL, which
   says that a variable is to be stack-allocated, even if referenced by
   closures (I guess - check the CLHS).


Functions to test interpreter and bytecode interpreter/compiler
===============================================================

Simple tests of interpreter semantics
-------------------------------------

;;; Eval in interaction buffer

;;; Define some simple closure generators
(defun make-adder-1 (addend)
  (lambda (val)
    (+ val addend)))

(defun make-adder-2 (addend)
  (let ((a addend))
    (lambda (val)
      (+ val a))))

(setq adder-1 (make-adder-1 3))
(setq adder-2 (make-adder-2 4))
(setq adder-3
      (let ((a 5))
	(lambda (val)
	  (+ val a))))

;;; Test that the closures work
(funcall adder-1 10)
(funcall adder-2 10)
(funcall adder-3 10)

(setq adder-1 nil)
(setq adder-2 nil)
(setq adder-3 nil)

;;; Test that gc of interpreted closures works
(garbage-collect)


;;; Create a matched pair of getter and setter methods
(defun make-safe-place (initval)
  (let ((place initval))
    (cons
     (lambda ()
       place)
     (lambda (newval)
       (setq place newval)))))

(setq place-1 (make-safe-place 1))
(setq place-2 (make-safe-place 2))

(funcall (car place-1))
(funcall (car place-2))
(funcall (cdr place-1) 15)
(funcall (car place-1))
(funcall (car place-2))
(funcall (cdr place-2) 29)
(funcall (car place-1))
(funcall (car place-2))


Manually compiled bytecode objects
----------------------------------

Example A
---------

(defun simple-fn (a)
  (let ((b 2))
    (+ a b)))

;;; The actual bytecode object from the traditional compiler:

#[
 ;; arglist
 (a)
 ;; bytecode
 "\302	\\)\207" ;; \302 \030 \011 \010 \134 \051 \207
 ;; constants
 [b a 2]
 ;; stack size
 2
]

;;; Disassembly of the bytecode:

byte code for simple-fn:
  args: (a)
0	constant  2	;; \302
1	varbind	  b	;; \030
2	varref	  a	;; \011
3	varref	  b	;; \010
4	plus		;; \134
5	unbind	  1	;; \051
6	return		;; \207

;;; Disassembly of the new bytecode with lexical scoping:

byte code for simple-fn:
  args: (a)
0	constant  2	;; \300
1	localset  b	;; \061
2	localref  a	;; \000
3	localref  b	;; \001
4	plus		;; \134
5	return		;; \207

(defalias 'simple-fn-lexical
#[
  ;; arg info (min max . numlocals)
  (1 1 . 1)
  ;; bytecode
  "\300\061\000\001\134\207"
  ;; constants
  [2]
  ;; stack size
  2
])

Example B
---------

(defun add-log-iso8601-time-zone (time)
  (let* ((utc-offset (or (car (current-time-zone time)) 0))
	 (sign (if (< utc-offset 0) ?- ?+))
	 (sec (abs utc-offset))
	 (ss (% sec 60))
	 (min (/ sec 60))
	 (mm (% min 60))
	 (hh (/ min 60)))
    (format (cond ((not (zerop ss)) "%c%02d:%02d:%02d")
		  ((not (zerop mm)) "%c%02d:%02d")
		  (t "%c%02d"))
	    sign hh mm ss)))

;;; The actual bytecode object from the traditional compiler:

#[
 ;; arglist
 (time)
 ;; bytecode
 "address@hidden	!\211\313\246\313\245\211\313\246\313\245\314\307U\2046\000\315\202B\000\307U\204A\000\316\202B\000\317\012%.\207"
 ;; constants
 [time utc-offset sign sec ss min current-time-zone 0 45 43 abs 60 format "%c%02d:%02d:%02d" "%c%02d:%02d" "%c%02d" mm hh]
 ;; required stack size
 7
 ;; docstring (not present)
 ;; interactive spec (not present)
]

;;; Disassembly of the bytecode:

byte code for add-log-iso8601-time-zone:
  args: (time)
0	constant  current-time-zone	;; \306
1	varref	  time			;; \010
2	call	  1			;; \041
3	car				;; \100
4	goto-if-not-nil-else-pop 1	;; \206 \010 \000
7	constant  0			;; \307
8:1	dup				;; \211
9	varbind	  utc-offset		;; \031
10	constant  0			;; \307
11	lss				;; \127
12	goto-if-nil 2			;; \203 \023 \000
15	constant  45			;; \310
16	goto	  3			;; \202 \024 \000
19:2	constant  43			;; \311
20:3	varbind	  sign			;; \032
21	constant  abs			;; \312
22	varref	  utc-offset		;; \011
23	call	  1			;; \041
24	dup				;; \211
25	varbind	  sec			;; \033
26	constant  60			;; \313
27	rem				;; \246
28	varbind	  ss			;; \034
29	varref	  sec			;; \013
30	constant  60			;; \313
31	quo				;; \245
32	dup				;; \211
33	varbind	  min			;; \035
34	constant  60			;; \313
35	rem				;; \246
36	varbind	  mm			;; \036 \020
38	varref	  min			;; \015
39	constant  60			;; \313
40	quo				;; \245
41	varbind	  hh			;; \036 \012
43	constant  format		;; \314
44	varref	  ss			;; \014
45	constant  0			;; \307
46	eqlsign				;; \125
47	goto-if-not-nil 4		;; \204 \066 \000
50	constant  "%c%02d:%02d:%02d"	;; \315
51	goto	  6			;; \202 \102 \000
54:4	varref	  mm			;; \016 \020
56	constant  0			;; \307
57	eqlsign				;; \125
58	goto-if-not-nil 5		;; \204 \101 \000
61	constant  "%c%02d:%02d"		;; \316
62	goto	  6			;; \202 \102 \000
65:5	constant  "%c%02d"		;; \317
66:6	varref	  sign			;; \012
67	varref	  hh			;; \016 \021
69	varref	  mm			;; \016 \020
71	varref	  ss			;; \014
72	call	  5			;; \045
73	unbind	  7			;; \056 \007
75	return				;; \207

;;; Disassembly of the new bytecode with lexical scoping:

Locals:
0	time		;; arg 0
1	utc-offset
2	sign
3	sec
4	ss
5	min
6	mm
7	hh

Constants:
0	current-time-zone
1	0
2	?-	;; 45
3	?+	;; 43
4	abs
5	60
6	format
7	"%c%02d:%02d:%02d"
8	"%c%02d:%02d"
9	"%c%02d"

byte code for add-log-iso8601-time-zone:
  args: (time)
0	constant  current-time-zone	;; \300
1	localref  time			;; \000
2	call	  1			;; \041
3	car				;; \100
4	goto-if-not-nil-else-pop 1	;; \206 \010 \000
7	constant  0			;; \301
8:1	dup				;; \211
9	localset  utc-offset		;; \061
10	constant  0			;; \301
11	lss				;; \127
12	goto-if-nil 2			;; \203 \023 \000
15	constant  45			;; \302
16	goto	  3			;; \202 \024 \000
19:2	constant  43			;; \303
20:3	localset  sign			;; \062
21	constant  abs			;; \304
22	localref  utc-offset		;; \001
23	call	  1			;; \041
24	dup				;; \211
25	localset  sec			;; \063
26	constant  60			;; \305
27	rem				;; \246
28	localset  ss			;; \064
29	localref  sec			;; \003
30	constant  60			;; \305
31	quo				;; \245
32	dup				;; \211
33	localset  min			;; \065
34	constant  60			;; \305
35	rem				;; \246
36	localset  mm			;; \066 \006
38	localref  min			;; \005
39	constant  60			;; \305
40	quo				;; \245
41	localset  hh			;; \066 \007
43	constant  format		;; \306
44	localref  ss			;; \004
45	constant  0			;; \301
46	eqlsign				;; \125
47	goto-if-not-nil 4		;; \204 \066 \000
50	constant  "%c%02d:%02d:%02d"	;; \307
51	goto	  6			;; \202 \102 \000
54:4	localref  mm			;; \006 \006
56	constant  0			;; \301
57	eqlsign				;; \125
58	goto-if-not-nil 5		;; \204 \101 \000
61	constant  "%c%02d:%02d"		;; \310
62	goto	  6			;; \202 \102 \000
65:5	constant  "%c%02d"		;; \311
66:6	localref  sign			;; \002
67	localref  hh			;; \006 \007
69	localref  mm			;; \006 \006
71	localref  ss			;; \004
72	call	  5			;; \045
73	return				;; \207

(defalias 'add-log-iso8601-time-zone-lexical
#[
 ;; arglist (min max . numlocals)
 (1 1 . 7)
 ;; bytecode
 "\300\000\041\100\206\010\000\301\211\061\301\127\203\023\000\302\202\024\000\303\062\304\001\041\211\063\305\246\064\003\305\245\211\065\305\246\066\006\005\305\245\066\007\306\004\301\125\204\066\000\307\202\102\000\006\006\301\125\204\101\000\310\202\102\000\311\002\006\007\006\006\004\045\207"
 ;; constants
 [current-time-zone 0 45 43 abs 60 format "%c%02d:%02d:%02d" "%c%02d:%02d" "%c%02d"]
 ;; required stack size
 7
 ;; docstring (not present)
 ;; interactive spec (not present)
])


Example C
---------

(defun map-add (list addend)
  (mapcar
    (lambda (val)
      (+ val addend))
    list))

#[
 (list addend)
 "\301\302\"\207"
 [
  list mapcar
  #[
    (val)
    "\010\011\134\134\207"
     [val addend]
     2
  ]
 ]
 3
]

byte code for map-add:
  args: (list addend)
0	constant  mapcar
1	constant  <compiled-function>
      args: (val)
    0	    varref    val
    1	    varref    addend
    2	    plus      
    3	    return    

2	varref	  list
3	call	  2
4	return	  

;;; Disassembly of the new bytecode with lexical scoping:

byte code for map-add:
  args: (list addend)
  locals:
  constants: [mapcar]
0	constant  mapcar		;; \300
1	constant  <compiled-function>	;; \301
      args: (val)
      locals:
      constants:
    0	    localref  val		;; \000
    1	    envref    addend		;; \006 \101
    3	    plus			;; \134
    4	    return			;; \207

2	make-stack-closure		;; \251 \000
4	localref  list			;; \000
5	call	  2			;; \042
6	return				;; \207

;; New bytecode

(defalias 'map-app-lexical
#[
 (2 2 . 0)
 "\300\301\251\000\000\042\207"
 [
  mapcar
  #[
    (1 1 . 0)
    "\000\006\101\134\207"
     []
     2
  ]
 ]
 3
])

Example D
---------

(defun make-adder-1 (addend)
  (lambda (val)
    (+ val addend)))

#[
 (addend)
 "\300\207"
 [
  #[(val)
    "\010\011\134\207"
    [val addend]
    2
   ]
 ]
 1
]

byte code for make-adder-1:
  args: (addend)
0	constant  <compiled-function>	;; \300
      args: (val)
    0	    varref    val		;; \010
    1	    varref    addend		;; \011
    2	    plus			;; \134
    3	    return			;; \207

1	return				;; \207

;;; Disassembly of the new bytecode with lexical scoping:

byte code for make-adder-1:
  args: (addend)
0	makeindirect  addend		;; \262 \000 \000
3	constant  <compiled-function>	;; \300
      args: (val)
    0	    localref  val		;; \000
    1	    ienvref   addend		;; \007 \003 \000 \000
    5	    plus			;; \134
    6	    return			;; \207

4	make-full-closure 1		;; \251 \002 \001 \000
		addend			;; \000 \000 \000
10	return				;; \207

(defalias 'make-adder-1-lexical
#[
 (1 1 . 0)
 "\262\000\000\300\251\002\001\000\000\000\000\207"
 [
  #[
    (1 1 . 0)
    "\000\007\003\000\000\134\207"
    []
    2]
 ]
 1])


Example E
---------

(defun make-safe-place (initval)
  (let ((place initval))
    (cons
     (lambda ()
       place)
     (lambda (newval)
       (setq place newval)))))

;;; Traditional bytecode for make-safe-place

#[
 (initval)
 "\300\301B\207"
 [
  #[nil
    "\207"
    [place] 1]
  #[(newval)
    "\211\207"
    [newval place] 2]
 ]
 2]

;;; Disassembly for make-safe-place

byte code for make-safe-place:
  args: (initval)
0	constant  <compiled-function>
      args: nil
    0	    varref    place
    1	    return    

1	constant  <compiled-function>
      args: (newval)
    0	    varref    newval
    1	    dup	      
    2	    varset    place
    3	    return    

2	cons	  
3	return	  

;;; Disassembly for make-safe-place-lexical

byte code for make-safe-place-lexical:
  args: (initval)
0	localref   initval	;; \000
1	localset   place	;; \061
2	makeindirect place	;; \262 \001 \000
5	constant  <compiled-function>	;; \300
      args: nil
    0	    ienvref   place		;; \007 \003 \000 \000
    4	    return			;; \207

6	make-full-closure  1	;; \251 \002 \001 \000
10	     place		;; \000 \001 \000
13	constant  <compiled-function>	;; \301
      args: (newval)
    0	    localref  newval		;; \000
    1	    dup				;; \211
    2	    ienvset   place		;; \067 \003 \000 \000
    6	    return			;; \207

14	make-full-closure  1	;; \251 \002 \001 \000
18	     place		;; \000 \001 \000
21	cons			;; \102
22	return			;; \207


(defalias 'make-safe-place-lexical
#[
 (1 1 . 1)
 "\000\061\262\001\000\300\251\002\001\000\000\001\000\301\251\002\001\000\000\001\000\102\207"
 [
  #[(0 0 . 0)
    "\007\003\000\000\207"
    [] 1]
  #[(1 1 . 0)
    "\000\211\067\003\000\000\207"
    [] 2]
 ]
 2])


Example F
---------