Re: enriched-mode and switching major modes.

emacs-devel

From:	Oliver Scholz
Subject:	Re: enriched-mode and switching major modes.
Date:	Sat, 18 Sep 2004 18:57:13 +0200
User-agent:	Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3.50 (windows-nt)

Richard Stallman <address@hidden> writes:

[...]
>        Instead the user says: "This piece of text is a paragraph, because
>        I hit RET when I finished writing it.  I want it to be a paragraph
>        of the type "Standard Text".  I want "Standard Text" paragraphs to
>        be indented on the left by 1 cm and to have a font size of 10 pt.
>        Except here, I want this paragraph to have an indentation of 2 cm."
>
> Now you're talking about the user interface concepts.  That too is
> distinct from the question of representing text inside Emacs.  The
> Emacs text representation needs to include a representation of the
> specification of a type of paragraph, in order to implement this user
> interface.

Exactly!  However, it is important to discuss implementation issues
together with the user interface concepts in order to account for the
semantics of the representation.

> But it also needs to have spaces in the buffer where there is
> indentation, so as to be compatible with how the whole Emacs Lisp
> world understands indentation.

This is where I disagree.  The box model I was talking about is an
abstract model; it can be implemented by different means.  For
instance, its left margin can indeed be implemented by spaces.  I have
been working with nothing but that so far.  So, to a certain extent, I
agree with you here.  But a few notes:

-  The semantics of space characters used to implement a box model and
   space chars used for indentation in a non-WP Emacs buffer (like in
   this message buffer in this paragraph) are different.  The former
   don't belong to the abstract document, they are only a means of
   rendering it visually.  The latter are both part of the abstract
   document (the text/plain e-mail) and of its visual representation,
   because for text/plain there is no meaningful difference between
   both.

   Because of this semantic difference, existing Emacs Lisp functions
   would not do the right thing when working on them.  Nor would I
   expect them to do the right thing.  `fill-paragraph' does
   not work well in a dired buffer.  `query-replace' does not work in
   a Gnus *group* buffer.

-  Even when working with space characters for indentation, it would
   probably not be as you seem to expect.  When working on a
   graphical user interface, we would have to deal with proportional
   fonts of varying sizes.  This is what people expect from a word
   processor.

   For this reason, all implementation techniques I have examined so
   far work by putting a single space character with a display
   property specifying the indentation column in canonical character
   units.  Like this:

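   (progn (switch-to-buffer (generate-new-buffer "*tmp*"))
          ;; A single space, stretched so that the text starts at column 20.
          (insert (propertize " " 'display '(space :align-to 20)))
          ;; The face value is a plain attribute list.
          (insert (propertize "lirum larum" 'face
                              '(:height 140 :inherit variable-pitch))))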

   Lisp functions that were not intentionally written to deal with
   paragraphs in a WP buffer would not expect this, because
   proportional fonts are hardly used in Emacs to this day.  (I
   might add that it is also rather tedious to use them without a
   box model supported by the display engine ...)

-  Implementing a box model by means of space and newline characters
   works for some common cases, but it doesn't scale well to the full
   capabilities of all abstract documents that can be expressed by
   various formats.  I already mentioned tables as a point where it
   does not scale at all.  But if we want to implement XML/HTML + CSS
   (and I definitely do), then some nested boxes with borders and
   background colours are not only impossible to display; it would
   also be an understatement to say that implementing a way to let
   the user interact with them (for example by typing text) would be
   "difficult".  It is not that I have not tried ...

   I am not worrying much about that yet!  I am content with
   implementing a box model by means of space and newline characters
   for now.  If I am not able to hack the display engine, then that's
   life.  The Emacs word processor can do useful work without it.
   That's why I talked about "in the long run", meaning: years.  But
   it won't be complete!

I am not religious about the particular implementation of a box model!
All I do care about are the design principles of WP functionality.  I
must say that I have felt very frustrated in the last few days, because it
seemed to me that we have a fundamental disagreement about concepts
and design principles.  Right now I am not sure.  Maybe what you
called a "hybrid solution" would be perfectly in order, depending on
the specific features that you have in mind.

Maybe we should discuss this with reference to a concrete
implementation.  I have assembled a quick example from (now abandoned)
code that I wrote almost a year ago:

wp-example.el
Description: application/emacs-lisp

Load this file and then type `M-x wp-example RET'.  You will then see a
buffer with some example text. You can type text; you can change some
of its properties with `M-x wp-set-paragraph-property'. `M-x
wp-example-encode RET' will encode the contents of that WP buffer and
show the encoded document in a temp-buffer.

Please bear with me.  The code probably has a lot of bugs, and the
`wp-.*' functions are just quickly hacked together right now.  I have
abandoned this approach, because the data structure is modelled for
RTF alone, which quickly seemed wrong to me.  I have been thinking
about a more general design since then.

But maybe it provides an example of what I have in mind. You have to
hit M-q to fill a paragraph -- I never got around to implementing
something for auto-fill or even refill. M-q calls
`epos-fill-paragraph', which in turn calls
`epos-fill-region-as-paragraph'. This latter function is the heart of
the whitespace formatting. The important data structures are two
defstructs stored in the text properties `epos-character' and
`epos-paragraph'.  Characters which are inserted by
`epos-fill-region-as-paragraph' are marked with a text property
`epos-transient', those characters are removed whenever appropriate.
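
The transient-marking idea can be sketched in a few lines.  This is
only an illustration, not the code from wp-example.el: the `my-wp-'
function names are hypothetical, and the only thing taken from the
description above is the property name `epos-transient'.  The sketch
assumes the property value is simply `t'.

```elisp
;; Sketch only.  The `my-wp-' names are hypothetical; `epos-transient'
;; is the property name used in the message, assumed here to be `t'.

(defun my-wp-insert-transient (string)
  "Insert STRING at point, marked as transient formatting whitespace."
  (insert (propertize string 'epos-transient t)))

(defun my-wp-remove-transient (beg end)
  "Delete every character between BEG and END marked `epos-transient'."
  (save-excursion
    (let ((end (copy-marker end))
          (pos beg))
      ;; Find the next transient run, delete it, and search again from
      ;; the same position until no transient character is left.
      (while (setq pos (text-property-any pos end 'epos-transient t))
        (delete-region
         pos
         (next-single-property-change pos 'epos-transient nil
                                      (marker-position end))))
      (set-marker end nil))))

(defun my-wp-text-without-transient ()
  "Return the buffer text with all transient characters left out.
The current buffer is not modified."
  (let ((text (buffer-string)))         ; keeps text properties
    (with-temp-buffer
      (insert text)
      (my-wp-remove-transient (point-min) (point-max))
      (buffer-substring-no-properties (point-min) (point-max)))))
```

With something like this, an encoder could call
`my-wp-text-without-transient' so that whitespace typed by the user
survives while whitespace added by the fill function does not.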

You shouldn't hit RET, btw.  RET would have to be bound to a function
which would create a new paragraph.  I also never implemented this.
Right now RET would just insert a rogue newline, which will be
removed after M-q.

[It seems that this thread is becoming infamous as the "thread with
the 30 kb mails".  I am to blame for that. My apologies. I have been
thinking about word processing in Emacs for years, reading
specifications, writing prototypes and experiments, acquiring coding
skills.  Right now I'd like to find out whether I should abandon that
project altogether, because I fear that it won't be welcome
because of its fundamental design principles and that it will never be
complete.]

>          b) For a given document, two different applications or the
>             same application on two different machines/operating
>             systems might render two different visual representations.
>
> This is not specifically a problem, and may even give us extra
> flexibility.

Yes.  In fact, I regard it as one of the strengths of word processing.
This is where the difference between word processing and "desktop
publishing" (DTP) becomes most manifest.  Emacs could really excel
here by carrying that principle further to media that are not catered
for yet by traditional word processors: character consoles.

[I didn't mean everything I said to be a problem.  I just wanted
to make the concepts clear.  The term "word processing" is used
ambiguously in the wild, and part of my work in the past has been to
draw clear conceptual distinctions.]

>     This is very important: If a user enters space characters into an
>     Emacs buffer, she wants there to be space characters.  Those
>     characters would have to become part of the character data in the
>     encoded file.  But if a user just specifies: I want this paragraph to
>     be indented, then the space characters used to display the left
>     margin _must_not_ become part of the encoded file.
>
> Why do you think so?  It seems to me that these two different user actions
> should both produce spaces in the buffer--in one case, inserted manually,
> in the other, caused by the format specification.
>
> We can recompute line breaks automatically, and represent them by
> newline characters in the buffer, which will be removed and replaced
> by the next recomputation.  In this mode, the user would not manually
> enter newline characters except to create breaks (hard newlines).

How do you want to solve the problem of distinguishing manually
inserted spaces from spaces added programmatically for visual
rendering when the document is encoded and written to a file?  I
explained this problem in my last mail with an example RTF.  If the
user may create indentation both by inserting space characters and by
adding a text property via a command, then we have an ambiguity in
the user interface, which makes it hazardous.

About newlines: I have thought about using hard newlines for
paragraph separation.  The problem is that I need a place to store
properties belonging to the whole paragraph.  I have thought about
(and experimented with) putting those properties on the final hard
newline.  But I found it inelegant and there is also some hair in
there.  The example code identifies as paragraphs every sequence of
characters with an `eq' `epos-paragraph' property; the fill function
then takes care of the whitespace.  I think that this works better.
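
To illustrate what "a sequence of characters with an `eq'
`epos-paragraph' property" means in practice, here is a minimal
sketch.  The function name `my-wp-paragraph-bounds' is hypothetical,
and a plain list stands in for the defstruct of the example code.  It
relies on the fact that `next-single-property-change' and
`previous-single-property-change' compare property values with `eq',
so two paragraphs with identical-looking but distinct property objects
still count as two paragraphs.

```elisp
;; Sketch only.  `my-wp-paragraph-bounds' is a hypothetical name;
;; `epos-paragraph' is the property name used in the message.

(defun my-wp-paragraph-bounds (pos)
  "Return (BEG . END) of the paragraph around POS, or nil.
A paragraph is the run of characters around POS whose
`epos-paragraph' property is `eq' to the one at POS."
  (when (get-text-property pos 'epos-paragraph)
    (cons
     ;; Start scanning just after POS so that the character at POS
     ;; itself is the reference value.
     (previous-single-property-change (1+ pos) 'epos-paragraph
                                      nil (point-min))
     (next-single-property-change pos 'epos-paragraph
                                  nil (point-max)))))
```

A fill function could then operate on exactly that region, without
needing a hard newline to carry the paragraph properties.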

>     Erm, what does the concept of "what's really there" in that context
>     mean?  In the buffer, or more generally spoken: in the data structure
>     a containing block box, or a text property storing formatting
>     information is, of course, no less there than any space or newline
>     character added for whitespace formatting.
>
> Something is "really there" if other Lisp programs will see it in the
> way that they are written to look for it.

That's what I would regard as "being there" also.  But I don't see
why this would exclude a box model supported by the display engine
(i.e., as you stated more precisely: a box that is rendered at
display time).  No matter how it happens to be implemented, there
would have to be means to check for a box from Lisp and to examine
its properties.

[This actually goes well beyond the issue at hand, but I'd like to
note that we have an open issue here: we have no way to determine the
screen column of a buffer position (in canonical character units) from
Lisp.  This function is a requirement for implementing filling that
DTRT with proportional fonts or with fonts that have a different width
than the default font.  I am by no means competent here, so I may be
horribly wrong: but when I looked into it, it seemed to me that this
would require help from the display engine and thus short-circuit the
boundaries between the display engine and Lisp functions that examine
the buffer text.  Again: I may be horribly wrong.  Nevertheless, with
a box model that specifies how to render text at display time, we
would have no need for such a function.]

    Oliver
-- 
Oliver Scholz               Jour du Travail de l'Année 212 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.
