emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Clean-up of forward-paragraph [Re: Beginingless paragraphs: second stab


From: Alan Mackenzie
Subject: Clean-up of forward-paragraph [Re: Beginingless paragraphs: second stab at a patch.]
Date: Wed, 19 Oct 2005 16:56:30 +0000 (GMT)

Hi, Richard, Hi Emacs!

On Thu, 8 Sep 2005, Richard M. Stallman wrote:

>    Do M-h on each of the lines "asdf".  The blank line is included in both
>    paragraphs.  This happens because the blank line isn't a separator here.
>    It is an ordinary line of the upper paragraph and the "heuristic" (sorry
>    about the word) blank line tacked on to the paragraph below.

>I think this is a bug.  The blank line should not be included in both
>paragraphs, only in one of them.  If something special is to be done
>to include the blank line in the following paragraph, then something
>special should also be done to exclude it from the preceding paragraph.

>Would you like to implement that?

I've been having a look at this.  There are actually several other bugs
in forward-paragraph too - for example:

(i) When there is a left margin, sometimes forward paragraph moves to
column zero, sometimes to after the margin.  I think it should always
move to after the margin if it can.  (It can't when the line doesn't
contain enough whitespace to constitute the margin.)

(ii) When `use-hard-newlines' is non-nil (i.e. Longlines Mode is
enabled), forward-paragraph can spuriously recognise a line "in the
middle of a paragraph" as a separator line when it "looks like" one.
(This only shows itself with a non-blank separator line.)

(iii) When there is a non-nil fill-prefix, f-p uses it in place of
paragraph-start when searching backwards, but not when searching
forwards.  Half of these cases is a bug.

#########################################################################

Incidentally:  The Elisp manual is a bit unclear on the page
"Paragraphs", where it says:

       When there is a fill prefix, then paragraphs are delimited by all
       lines which don't start with the fill prefix.

This doesn't make clear (? clear enough) whether this scheme of
terminating paragraphs is in addition to or in place of
paragraph-s\(tart\|eparate\).  I think one of the words "also" and
"instead" should be inserted before "delimited".  Which one?

#########################################################################

As a matter of interest, I think there are perilously many complications
in forward-paragraph.  There's left margins, fill-prefixes, hard
newlines (Longlines Mode), and paragraph-s\(tart\|eparate\).  Each of
these interacts with every other in some way, mostly in ways that have
never been formally specified.  For example, how does Emacs behave when
there's a left margin and Longlines Mode is enabled?  (I'll try this out
myself later).

AAMOI (2):  Having told people precisely how to set up
paragraph-s\(tart\|eparate\), why does forward-paragraph then go to such
lengths to correct incorrect settings:  It removes spurious leading "^"s
from these variables, and it combine the two into a regexp for searching,
even though p-start should match anything that p-separate does.

Is this really the Right Thing to do?  Would it not be better, having
checked the variables, to nag their developer with an output to
*Messages*?  The current method will conceal bugs rather than exposing
them.
#########################################################################

As a first step to fixing the bugs in forward-paragraph, I've cleaned up
the code somewhat and added some comments (though some of these are
prolix enough to need removing later on).  The patch below is intended
only to clean up the code without changing its function.  Should I commit
this patch?

If anybody wants to test the change, here is the test file I used to test
out forward-paragraph:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-*-fundamental-*-

In this file, hack-variables has set paragraph-start/separate and fill-column.
As appropriate, set fill-prefix to "* *" and the left margin to 3 spaces.  For
the last paragraphs, do M-x longlines-mode.



Paragraph following separator line, not matching paragraph-start, no margin.
Second line of this paragraph.


asdf
 Paragraph matching paragraph-start, not following separator, no margin.
Second line of this paragraph.
 Next paragraph

* *asdf
* *
* *Paragraph with fill-prefix "* *", following a bare fill-prefix, no margin.
* *Second line of this paragraph.
* *


* *Paragraph with fill-prefix "* *", following a separator line, no margin.
* *Second line of this paragraph.
* *


   Paragraph following separator line, not matching paragraph-start, 3-space
   margin.  The preceding and following blank lines are truly blank.

   
   Paragraph following separator line, not matching paragraph-start, 3-space
   margin.  The preceding and following "blank" lines are actually 3-space
   margins.  M-{ goes to column zero, however, not to after the margin.  This
   is probably a bug.
   
   
   
   asdf
    Paragraph matching paragraph-start, not following separator, 3-space
   margin.  Second line of this paragraph.
    Next paragraph.
   
   * *asdf
   * *
   * *Paragraph with fill-prefix "* *", following a bare fill-prefix, 3-space
   * *margin.
   
   
   * *Paragraph with fill-prefix "* *", following a separator line, 3-space
   * *margin.


This paragraph should be tested with Lonlines Mode enabled.  Its second line 
;sep; will look like a separator line without actually being one.  It also has 
a third line.

This paragraph should be tested with Longlines Mode enable.  Its second line
;sep; actually is a separator line.


This paragraph should be tested with Longlines Mode enabled.  Its second line, 
* *which follows a soft newline, looks like it begins with a fill-prefix, but 
doesn't really.  ;start;


Local Variables:
paragraph-start: ";s\\(tart\\|ep\\);\\|[ \t\n\f]"
paragraph-separate: ";sep;\\|[ \t\f]*$"
fill-column: 78
end:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


Here is the patch:

2005-10-19  Alan Mackenzie  <address@hidden>

        * paragraphs.el (forward-paragraph):  Clean up the code.  In
        particular:
        Eliminate the unused/frivolous variables found-start,
        multiple-lines.
        Reduce the complexity of some nested and/or/progn forms.
        Comment the code liberally.


Index: paragraphs.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/textmodes/paragraphs.el,v
retrieving revision 1.79
diff -c -r1.79 paragraphs.el
*** paragraphs.el       6 Aug 2005 17:41:15 -0000       1.79
--- paragraphs.el       19 Oct 2005 16:43:07 -0000
***************
*** 211,220 ****
--- 211,227 ----
         ;; starting at the left-margin.  This allows paragraph commands to
         ;; work normally with indented text.
         ;; This hack will not find problem cases like "whatever\\|^something".
+        ;; It only has any effect if paragraph-start has been given a value
+        ;; which violates its specification.
         (parstart (if (and (not (equal "" paragraph-start))
                            (equal ?^ (aref paragraph-start 0)))
                       (substring paragraph-start 1)
                     paragraph-start))
+        ;; `parsep' (pronounced "par-sepp", not "parse-pee") is a regexp
+        ;; which matches any separator line (including a bare fill-prefix),
+        ;; but NOT a paragraph starting line.
+        ;; Note that the next form has no effect if paragraph-separate has a
+        ;; valid value.
         (parsep (if (and (not (equal "" paragraph-separate))
                          (equal ?^ (aref paragraph-separate 0)))
                     (substring paragraph-separate 1)
***************
*** 224,259 ****
              (concat parsep "\\|"
                      fill-prefix-regexp "[ \t]*$")
            parsep))
!        ;; This is used for searching.
         (sp-parstart (concat "^[ \t]*\\(?:" parstart "\\|" parsep "\\)"))
!        start found-start)
      (while (and (< arg 0) (not (bobp)))
        (if (and (not (looking-at parsep))
               (re-search-backward "^\n" (max (1- (point)) (point-min)) t)
               (looking-at parsep))
          (setq arg (1+ arg))
-       (setq start (point))
-       ;; Move back over paragraph-separating lines.
        (forward-char -1) (beginning-of-line)
        (while (and (not (bobp))
                    (progn (move-to-left-margin)
                           (looking-at parsep)))
          (forward-line -1))
        (if (bobp)
            nil
          (setq arg (1+ arg))
          ;; Go to end of the previous (non-separating) line.
          (end-of-line)
!         ;; Search back for line that starts or separates paragraphs.
          (if (if fill-prefix-regexp
!                 ;; There is a fill prefix; it overrides parstart.
!                 (let (multiple-lines)
                    (while (and (progn (beginning-of-line) (not (bobp)))
                                (progn (move-to-left-margin)
                                       (not (looking-at parsep)))
                                (looking-at fill-prefix-regexp))
-                     (unless (= (point) start)
-                       (setq multiple-lines t))
                      (forward-line -1))
                    (move-to-left-margin)
                    ;; This deleted code caused a long hanging-indent line
--- 231,281 ----
              (concat parsep "\\|"
                      fill-prefix-regexp "[ \t]*$")
            parsep))
!        ;; `sp-parstart' matches a line that "looks like" white space
!        ;; followed by a paragraph separator or starter.  The WS is
!        ;; putatively a left margin.  Once a line matching this regexp has
!        ;; been found, more detailed checking is needed to say whether it
!        ;; actually is a separator/starter line.
!        ;; Note that if paragraph-start/separate have valid values, `parsep'
!        ;; is redundant in the following.
         (sp-parstart (concat "^[ \t]*\\(?:" parstart "\\|" parsep "\\)"))
!        start)
      (while (and (< arg 0) (not (bobp)))
+       ;; Crude kludge to do with a blank line being part of the following
+       ;; paragraph.  This is to be removed.  (ACM, 2005/10/19)
        (if (and (not (looking-at parsep))
               (re-search-backward "^\n" (max (1- (point)) (point-min)) t)
               (looking-at parsep))
          (setq arg (1+ arg))
        (forward-char -1) (beginning-of-line)
+ 
+       ;; Move back over paragraph-separating lines.
        (while (and (not (bobp))
                    (progn (move-to-left-margin)
                           (looking-at parsep)))
          (forward-line -1))
+       ;; We're at the margin of the first non-separator line we've met.
        (if (bobp)
            nil
          (setq arg (1+ arg))
          ;; Go to end of the previous (non-separating) line.
          (end-of-line)
! 
!         ;; Currently, we're at the EOL on the first non-separator line we've
!         ;; encountered.
!         ;;
!         ;; Search back over the meat of the paragraph for a starter or
!         ;; separator line.  The condition in the following big `if' form
!         ;; evaluates to t if we find such a line, nil if we hit BOB.
          (if (if fill-prefix-regexp
!                 ;; If there is a fill prefix, paragraph-start is ignored.
!                 ;; So is the Longlines mechanism - is this the Right Thing?
!                 ;; (ACM, 2005/10/19).
!                 (progn
                    (while (and (progn (beginning-of-line) (not (bobp)))
                                (progn (move-to-left-margin)
                                       (not (looking-at parsep)))
                                (looking-at fill-prefix-regexp))
                      (forward-line -1))
                    (move-to-left-margin)
                    ;; This deleted code caused a long hanging-indent line
***************
*** 265,302 ****
                    ;;      multiple-lines
                    ;;      (forward-line 1))
                    (not (bobp)))
                (while (and (re-search-backward sp-parstart nil 1)
!                           (setq found-start t)
!                           ;; Found a candidate, but need to check if it is a
!                           ;; REAL parstart.
!                           (progn (setq start (point))
!                                  (move-to-left-margin)
!                                  (not (looking-at parsep)))
!                           (not (and (looking-at parstart)
!                                     (or (not use-hard-newlines)
!                                         (bobp)
!                                         (get-text-property
!                                          (1- start) 'hard)))))
!                 (setq found-start nil)
!                 (goto-char start))
!               found-start)
!             ;; Found one.
              (progn
                ;; Move forward over paragraph separators.
                ;; We know this cannot reach the place we started
                ;; because we know we moved back over a non-separator.
                (while (and (not (eobp))
                            (progn (move-to-left-margin)
                                   (looking-at parsep)))
                  (forward-line 1))
                ;; If line before paragraph is just margin, back up to there.
                (end-of-line 0)
                (if (> (current-column) (current-left-margin))
                    (forward-char 1)
                  (skip-chars-backward " \t")
                  (if (not (bolp))
                      (forward-line 1))))
!           ;; No starter or separator line => use buffer beg.
            (goto-char (point-min))))))
  
      (while (and (> arg 0) (not (eobp)))
--- 287,336 ----
                    ;;      multiple-lines
                    ;;      (forward-line 1))
                    (not (bobp)))
+ 
+               ;; fill-prefix is currently null.
                (while (and (re-search-backward sp-parstart nil 1)
!                           ;; point is now at BOL of a candiate start/sep line.
!                           (setq start (point))
!                           (save-excursion
!                             (move-to-left-margin)
!                             (and (not (looking-at parsep))
!                                  (or (not (looking-at-parstart))
!                                      (and use-hard-newlines
!                                           (not (bobp))
!                                           (not (get-text-property
!                                                 (1- start) 'hard))))))))
!               (or (bobp) (move-to-left-margin))
!               (not (bobp)))
! 
!             ;; We've found a starter or separator line at or just before the
!             ;; start of paragraph we're looking for.  The start of paragraph
!             ;; can be either a starter line or the line following a
!             ;; separator.  We're at the line's left margin.
              (progn
                ;; Move forward over paragraph separators.
                ;; We know this cannot reach the place we started
                ;; because we know we moved back over a non-separator.
+               ;; 
+               ;; Surely we can move forward over at most one such line?
+               ;; (ACM, 2005/10/19)
                (while (and (not (eobp))
                            (progn (move-to-left-margin)
                                   (looking-at parsep)))
                  (forward-line 1))
                ;; If line before paragraph is just margin, back up to there.
+               ;; 
+               ;; This is the frivolous inclusion of a blank line in a
+               ;; paragraph it precedes.  This is to be taken out (ACM,
+               ;; 2005/10/19).
                (end-of-line 0)
                (if (> (current-column) (current-left-margin))
                    (forward-char 1)
                  (skip-chars-backward " \t")
                  (if (not (bolp))
                      (forward-line 1))))
! 
!           ;; We hit BOB instead of finding a starter or separator line.
            (goto-char (point-min))))))
  
      (while (and (> arg 0) (not (eobp)))
***************
*** 315,332 ****
                      (not (looking-at parsep))
                      (looking-at fill-prefix-regexp))
            (forward-line 1))
        (while (and (re-search-forward sp-parstart nil 1)
!                   (progn (setq start (match-beginning 0))
!                          (goto-char start)
!                          (not (eobp)))
!                   (progn (move-to-left-margin)
!                          (not (looking-at parsep)))
!                   (or (not (looking-at parstart))
!                       (and use-hard-newlines
!                            (not (get-text-property (1- start) 'hard)))))
          (forward-char 1))
        (if (< (point) (point-max))
!           (goto-char start))))
      (constrain-to-field nil opoint t)
      ;; Return the number of steps that could not be done.
      arg))
--- 349,375 ----
                      (not (looking-at parsep))
                      (looking-at fill-prefix-regexp))
            (forward-line 1))
+       ;; Search forward for the next separator or starter line.  Either of
+       ;; these counts as just after the current paragraph.  The left margin
+       ;; and hard newlines make this searching clumsy.
+       ;;
+       ;; What about a non-nil fill-prefix here?  Shouldn't that override
+       ;; paragraph-start here, the way it does for backward movement?  (ACM,
+       ;; 2005/10/19)
        (while (and (re-search-forward sp-parstart nil 1)
!                   (goto-char (match-beginning 0))
!                   (setq start (point)) ; Remove this later.
!                   (not (eobp))
!                   (save-excursion
!                     (move-to-left-margin)
!                     (and (not (looking-at parsep))
!                          (or (not (looking-at parstart))
!                              (and use-hard-newlines
!                                   (not (get-text-property (1- start) 
'hard)))))))
          (forward-char 1))
        (if (< (point) (point-max))
!           (goto-char start))))        ; This is BOL.  Why not to the left
!                                       ; margin?  (ACM, 2005/10/19)
      (constrain-to-field nil opoint t)
      ;; Return the number of steps that could not be done.
      arg))

-- 
Alan Mackenzie (Munich, Germany)






reply via email to

[Prev in Thread] Current Thread [Next in Thread]