emacs-devel
A vision for multiple major modes: some design notes


From: Alan Mackenzie
Subject: A vision for multiple major modes: some design notes
Date: Wed, 20 Apr 2016 19:44:50 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

Hello, Dmitry and Emacs.

This post describes my notion of how multiple major modes {c,sh}ould be
implemented.  Key notions are "islands", "island chains", and "chain
local" variable bindings.

In this scheme, "super modes" will not have to do anything to swap in/out
local variable bindings pertinent to islands; this will be done by the
underlying C code.  Narrowing/widening will not be (ab)used by the super
mode mechanism.  Major modes will continue to be able to use the entire
range of Emacs facilities.

Here are some design notes:

(i) Overview and motivation.
  o - The aim is to support several major modes simultaneously in a single
    buffer.
  o - The "super mode" will set up "chains of islands" (see below).
    * - Each chain will have its own major mode, key map, syntax table, etc.
    * - In each chain, "chain local" variable bindings will exist.  Such a
      binding will be current when point is within an island in the chain.
    * - The coordination of these bindings will be carried out by the
      mechanisms described below, without explicit coding in the super mode.
  o - To the user, the current major mode will be that of the island where
    point is.  All familiar commands will work without restriction.
  o - To the writer of major modes, a minimal set of restrictions will apply:
    * - For some major mode commands, the mode will have to bind the variable
      `in-islands' (see below) to non-nil.
    * - For regexps which recognise whitespace, the regexp must contain "\\s-"
      or "\\s " or "[[:space:]]" so that the regexp engine will handle
      "foreign" islands and gaps between chained islands as whitespace.
    * - All other Emacs facilities will be available for use, being adapted as
      necessary for the island mechanism.

(ii) Definitions and concepts.
  o - An @dfn{island} is a contiguous portion of a buffer marked at each end.
    Its attributes are those of the chain of islands of which it is an
    element.
  o - A @dfn{chain} of islands is a canonically ordered sequence of islands in a
    single buffer.  An island chain has its own major mode; it has its own
    syntax table, abbreviation table, font lock settings, etc.  It has its own
    bindings of (most) "buffer" local variables.
  o - An island chain will have @dfn{chain local} variable bindings.  Such a
    binding will become current and accessible when point is within one of the
    chain's islands.  When point is not in an island, the buffer local binding
    of the variable will be current.  Most variables which are currently
    buffer local in Emacs 25 will become chain local.  Those (relatively few)
    variables which must retain a single value over an entire buffer will be
    marked as such with a non-nil value of the `entire-buffer' property.
  o - The variable `using-islands' will be set non-nil to indicate the current
    buffer is using the island mechanism.
  o - The variable `in-islands' will control island and island chain
    facilities.  When this variable is bound to non-nil, the facilities
    described here (such as chain local variables) are active.  When the
    variable is nil, (most of) the new facilities are inactive, and Emacs
    behaves as Emacs 25.

(iii) Island Chains.
  o - An island chain will be a Lisp object which is a C struct similar to
    struct buffer.  In particular, it will contain slots for common chain
    local variables, and an association list for bindings of other chain local
    variables.
  o - An island chain might contain pointers to the first and last of its
    islands (still to be decided).

(iv) Islands.
  o - An island will be delimited in two complementary ways:
    * - It will be enclosed syntactically by characters with "open island" and
      "close island" syntax (see section (v)).  Both of these syntactic
      markers will include a flag "chain" indicating whether there is a
      previous/next island in the chain.  The cdr of the syntax value will be
      the island chain to which the island belongs.
    * - It will be covered by the text property `island', whose value will be
      the pertinent island or island chain (see section (ii)) (not yet
      decided).  Note that if islands are enclosed inside other islands, the
      value is the innermost island.  There is the possibility of using an
      interval tree independent of the one for text properties to increase
      performance.
  o - An island might or might not be represented by a C or Lisp structure
    (not yet decided).  Such a structure would hold the containing chain,
    markers pointing to the start and end of the island, and the previous and
    next islands in the chain.
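
As a rough illustration of the island structure discussed above, here is a
toy model in Python (not Emacs C or Lisp).  The names `Island',
`IslandChain', `add_island' and `innermost_island' are hypothetical, plain
integers stand in for markers, and the post itself leaves the real
representation undecided.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class IslandChain:
    """Toy stand-in for an island chain: a major mode plus its islands."""
    major_mode: str
    islands: List["Island"] = field(default_factory=list)


@dataclass(eq=False)
class Island:
    """Toy island: a half-open range [start, end) belonging to one chain."""
    chain: IslandChain
    start: int
    end: int
    prev: Optional["Island"] = None
    next: Optional["Island"] = None


def add_island(chain: IslandChain, start: int, end: int) -> Island:
    """Append an island to CHAIN and link it to the chain's previous island."""
    island = Island(chain, start, end)
    if chain.islands:
        last = chain.islands[-1]
        last.next, island.prev = island, last
    chain.islands.append(island)
    return island


def innermost_island(islands: List[Island], pos: int) -> Optional[Island]:
    """Return the smallest island containing POS, or None outside all islands."""
    covering = [i for i in islands if i.start <= pos < i.end]
    return min(covering, key=lambda i: i.end - i.start, default=None)
```

The lookup mirrors the rule that, for nested islands, the value of the
`island' property is the innermost island.  A linear scan is used only for
brevity; the interval tree mentioned above would replace it in practice.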

(v) Syntax, etc.
  o - Two new syntax classes, "open island" and "close island" will be
    introduced.  These will be designated by the characters "{" and "}".  Their
    "matching character" slots will contain the island's chain.  There will be
    an extra flag "chain" (denoted by "i") indicating whether there is a
    previous/next island in the chain.
  o - `scan-lists', `scan-sexps', etc. will treat a "foreign" island as
    whitespace, much as they do comments.  They will also treat as whitespace
    the gap between two islands in a chain.
  o - The (currently 11 element) parser state will be enhanced to support
    islands as follows:
    * - A twelfth element will be introduced.  This will contain an
      association list whose elements will have the form (island-chain
      . 12-element parse state); each element will contain the suspended state
      of parsing in the island chain which is the car of the element.  An
      element with a car of nil will represent the suspended parsing state of
      the buffer outside of islands.
    * - Elements 12, 13, .... will be island chains of the enclosing islands,
      elt 12 being that of the innermost enclosing island, etc.  An element
      with a value of nil indicates being outside all islands.
  o - `parse-partial-sexp' will create and use an enhanced parser state as
    described above.  Note that a two character construct (such as a C comment
    opener) cannot enclose an island, and special handling will be required
    to exclude this.  The syntax table in use will change as the current
    position passes between islands.
  o - `syntax-ppss' will do the right thing with the extended parser state.
    Alternatively, `syntax-ppss' will have an independent 12-element state in
    each island chain, where elt. 11 is always nil.  Its cache mechanism will
    be enhanced such that buffer changes outside of an island chain need not
    invalidate the stored cache pertaining to the chain.
  o - The facilities in this section are active even when `in-islands' is
    nil.
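
To make the shape of the enhanced parser state concrete, here is a toy
Python model (not Emacs code).  It assumes slots 0..10 are the ordinary
state, slot 11 is the table of suspended states (a dict standing in for the
alist, with the key None meaning "outside all islands"), and slots 12
onwards name the enclosing chains, innermost first.  The function names are
hypothetical, and chains are represented by plain strings.

```python
from typing import List, Optional

BASE_LEN = 11  # slots 0..10: the ordinary parser state described in the post


def fresh_state() -> list:
    """Return a state outside all islands: 11 empty slots plus an empty table."""
    return [None] * BASE_LEN + [{}]


def _switch(state: list, leaving: Optional[str], entering: Optional[str],
            enclosing: List[str]) -> list:
    """Suspend the parse of LEAVING and resume (or start) that of ENTERING."""
    suspended = dict(state[BASE_LEN])
    # A suspended entry is itself 12 elements long, with slot 11 left empty.
    suspended[leaving] = state[:BASE_LEN] + [None]
    resumed = suspended.pop(entering, [None] * (BASE_LEN + 1))
    return resumed[:BASE_LEN] + [suspended] + enclosing


def enter_island(state: list, chain: str) -> list:
    """Return the state after crossing an "open island" marker for CHAIN."""
    outer = state[BASE_LEN + 1:]
    leaving = outer[0] if outer else None
    return _switch(state, leaving, chain, [chain] + outer)


def leave_island(state: list) -> list:
    """Return the state after the matching "close island" marker.

    STATE must be inside an island, i.e. have at least 13 elements.
    """
    outer = state[BASE_LEN + 2:]
    resuming = outer[0] if outer else None
    return _switch(state, state[BASE_LEN + 1], resuming, outer)
```

In this model, leaving an island of a chain and later entering the next
island of the same chain resumes the suspended parse, which is the effect
the twelfth element is intended to allow.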

(vi) Regexps.
  o - The regexp engine will be enhanced such that the regexps "\\s-", "\\s ",
    and "[[:space:]]" will match an entire island.
  o - The gap between two islands in a chain will also be matched by the above
    regexps.
  o - This treatment of an island, and of a gap between two islands, as
    whitespace will occur only when `in-islands' is non-nil.
  o - When `in-islands' is nil, there will be no reliable way of scanning over
    an island by regexps, since it is a potentially nested structure, and FSMs
    don't recognise arbitrarily nested structures.
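
The intended effect on whitespace matching, with `in-islands' non-nil, can
be sketched with a toy Python model (not the Emacs regexp engine).  The
function `skip_whitespace' and the per-character `owner' list are
hypothetical devices for illustration: `owner[i]' names the chain owning
character i, or None outside all islands.

```python
from typing import List, Optional


def skip_whitespace(text: str, owner: List[Optional[str]], pos: int,
                    chain: Optional[str]) -> int:
    """Skip whitespace forward from POS, as seen from CHAIN.

    Ordinary whitespace inside CHAIN is skipped.  Text owned by anything
    else (a foreign island, or the gap between two islands of CHAIN) is
    skipped as a unit, but only when CHAIN resumes later in the text.
    """
    i = pos
    while i < len(text):
        if owner[i] != chain:
            resume = next(
                (j for j in range(i, len(text)) if owner[j] == chain), None)
            if resume is None:
                return i  # no later island in this chain, so this is no gap
            i = resume
        elif text[i].isspace():
            i += 1
        else:
            return i
    return i


# Characters 0..3 and 9..10 belong to a "js" chain, 4..8 to a "php" chain.
text = "a = <?x?> 1"
owner = ["js"] * 4 + ["php"] * 5 + ["js"] * 2
after_equals = skip_whitespace(text, owner, 3, "js")
```

Here `after_equals' is 10, the position of the "1": the space, the whole
foreign island and the following space are all consumed as whitespace.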

(vii) Variables.
  o - Island chain local variable bindings will come into existence.  These
    bindings depend on the island point is in.  There will be lower level
    routines that will have "position" parameters as an alternative to using
    point.
  o - All variables which are currently buffer local will become chain local
    except for those whose symbols are given a non-nil `entire-buffer'
    property.  There will be no new functions like
    `make-chain-local-variable'.
  o - When the `entire-buffer' property is nil, the buffer local binding of a
    variable will hold the value pertinent to the areas of the buffer outside
    of islands.  When that property is non-nil, the binding holds the value
    for the entire buffer.
  o - When `in-islands' is nil, the chain local mechanism described here is
    not used - instead the familiar buffer local binding is used.
  o - The current binding for a local variable will be the chain local binding
    of the island chain of the island containing point.  If point is not in an
    island, the buffer local binding is current.
  o - If a chain local binding is current, and its value is unbound, the
    binding of an enclosing scope is NOT used in its place.  Probably the
    variable's default-value should be used when reading.
  o - In buffer.h, a new macro CVAR ("island chain variable") analogous to
    BVAR will be introduced.  It will use BVAR as a fallback.  Most
    invocations of BVAR will be changed to CVAR.
  o - In data.c, the mechanism for accessing local variable bindings
    (e.g. `swap_in_symval_forwarding') will be enhanced to test `in-islands'
    and handle chain local bindings appropriately.
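
The lookup rules in this section can be summarised by a toy Python model
(not the code in data.c).  The function `symbol_value' and its parameters
are hypothetical: dicts stand in for the binding tables, a set stands in for
the `entire-buffer' property, and `chain_local' is None when point is
outside all islands.  Falling back to the default value for a variable with
no chain local binding follows the "probably" in the text and is not a
settled decision.

```python
from typing import Any, Dict, Optional, Set


def symbol_value(name: str,
                 chain_local: Optional[Dict[str, Any]],
                 buffer_local: Dict[str, Any],
                 defaults: Dict[str, Any],
                 entire_buffer: Set[str],
                 in_islands: bool) -> Any:
    """Return the value of NAME as seen at point under the proposed rules."""
    chain_applies = (in_islands
                     and chain_local is not None
                     and name not in entire_buffer)
    if chain_applies:
        if name in chain_local:
            return chain_local[name]
        # No chain local binding: the enclosing (buffer local) binding is
        # deliberately NOT consulted; use the default value instead.
        return defaults[name]
    if name in buffer_local:
        return buffer_local[name]
    return defaults[name]
```

With `in_islands' false the function reduces to the familiar buffer local
lookup, matching the statement that the chain local mechanism is then not
used.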

(viii) Change hooks.
  o - There will be two additional abnormal hooks,
    `island-before-change-function' and `island-after-change-function', which
    will each hold a single function or nil.  These will take the same
    parameters as `before-change-functions' and `after-change-functions'
    respectively.
  o - The return value of these functions will be an association list with
    members whose car is an island chain (or nil, meaning "outside all
    islands") and whose cdr is the list of parameters to supply to
    `before/after-change-functions' for that chain.  Usually, the alist will
    have just one member containing BEG, END, and (for `after-..') OLD-LEN
    unchanged.
  o - After calling each of these functions, Emacs will invoke
    `before/after-change-functions' on each chain in the returned alist.  This
    will be in place of the standard calls to `before/after-change-functions'.
  o - The intention of these hooks is that super modes will use them to detect
    the deletion and insertion of islands, and to do the "de-islandification"
    and "islandification" as needed.
  o - `before/after-change-functions' will be normal chain local variables.
    A chain local binding will hold functions for the individual chain.  The
    buffer local binding will hold functions for the parts of the buffer
    outside of islands.
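
The dispatch described in this section might look roughly as follows, in a
toy Python model (not Emacs code).  The names `run_before_change',
`island_function' and `functions_by_chain' are hypothetical; chains are
represented by strings, with None meaning "outside all islands".  The
behaviour when no island function is set is an assumption, since the text
does not specify it.

```python
from typing import Callable, Dict, List, Optional, Tuple

Plan = List[Tuple[Optional[str], Tuple[int, int]]]


def run_before_change(
        beg: int, end: int,
        island_function: Optional[Callable[[int, int], Plan]],
        functions_by_chain: Dict[Optional[str], List[Callable[[int, int], None]]],
) -> None:
    """Run the per-chain before-change functions for a change at BEG..END."""
    if island_function is None:
        # Assumption: with no island function, behave as Emacs does today
        # and run only the buffer-wide functions on the unchanged region.
        plan: Plan = [(None, (beg, end))]
    else:
        # The super mode decides which chains are affected and with which
        # arguments each chain's functions are to be called.
        plan = island_function(beg, end)
    for chain, args in plan:
        for function in functions_by_chain.get(chain, []):
            function(*args)
```

An after-change variant would differ only in passing OLD-LEN as a third
argument.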

(ix) Miscellaneous commands and functions.
  o - `point-min' and `point-max' will, when `in-islands' is non-nil, return
    respectively the minimum and maximum positions in the visible region
    that lie in the same chain of islands as point.
  o - `search-\(forward\|backward\)\(-regexp\)?' will restrict themselves to
    the current island chain when `in-islands' is non-nil.
  o - `skip-\(chars\|syntax\)-\(forward\|backward\)' will likewise operate in
    the current island chain (how?) when `in-islands' is non-nil.
  o - `\(next\|previous\)-\(single\|char\)-property-change', etc., will do the
    Right Thing in island chains when `in-islands' is non-nil.
  o - New functions `island-min', `island-max', `island-chain-min' and
    `island-chain-max' will do what their names say.
  o - There will be no restrictions on the use of widening/narrowing, as have
    been proposed for other support engines for multiple major modes.
  o - New commands like `beginning-of-island', `narrow-to-island', etc. will
    be wanted.  The harder problem will be finding key bindings for them.
  o - ??? Other commands to be amended.
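
One possible reading of the position functions above, as a toy Python model
(not Emacs code).  A chain is a list of half-open (start, end) ranges, and
`visible' is the accessible region after any narrowing.  The function names
are hypothetical, and the exact interaction with narrowing is an assumption
based on the phrase "in the visible region".

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # half-open [start, end)


def island_bounds(chain: List[Range], pos: int) -> Optional[Range]:
    """Model of (island-min, island-max) for the island containing POS."""
    for start, end in chain:
        if start <= pos < end:
            return (start, end)
    return None


def chain_bounds(chain: List[Range]) -> Range:
    """Model of (island-chain-min, island-chain-max)."""
    return (min(s for s, _ in chain), max(e for _, e in chain))


def point_bounds(chain: List[Range], visible: Range, in_islands: bool) -> Range:
    """Model of (point-min, point-max) with point somewhere in CHAIN."""
    if not in_islands:
        return visible
    vmin, vmax = visible
    # Clip each island of the chain to the visible region.
    clipped = [(max(s, vmin), min(e, vmax))
               for s, e in chain if s < vmax and e > vmin]
    if not clipped:
        return visible  # assumption: nothing of the chain is visible
    return (min(s for s, _ in clipped), max(e for _, e in clipped))
```

Under this reading, narrowing and the island mechanism compose: the result
is never wider than either the narrowed region or the extent of the chain.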

(x) Emacs subsystems and `in-islands'.
  o - Redisplay will bind `in-islands' to non-nil, but will successfully
    display all islands wholly or partially in windows being displayed.
  o - Font Lock will bind `in-islands' to non-nil, but will successfully
    fontify all pertinent islands.
  o - `island-before/after-change-function' will be called with `in-islands'
    nil.
  o - `before/after-change-functions' will be called with `in-islands' bound
    to non-nil.
  o - Major modes will need to bind `in-islands' to non-nil for such things as
    indentation.
  o - For normal user interaction, `in-islands' will be nil.

-- 
Alan Mackenzie (Nuremberg, Germany).


