
bug#15107: [PATCH] Add replace-regexp-in-string regression test


From: Eli Zaretskii
Subject: bug#15107: [PATCH] Add replace-regexp-in-string regression test
Date: Thu, 01 Sep 2016 18:46:35 +0300

> From: Erik Anderson <erikbpanderson@gmail.com>
> Date: Tue, 30 Aug 2016 23:57:35 +0000
> 
> (replace-regexp-in-string "^.\\| ." #'upcase "foo bar")
> > "Foo bar"  (should be "Foo Bar")

It looks like an algorithmic design flaw.  Here's the relevant part of
replace-regexp-in-string:

      (while (and (< start l) (string-match regexp string start))
        (setq mb (match-beginning 0)
              me (match-end 0))
        ;; If we matched the empty string, make sure we advance by one char
        (when (= me mb) (setq me (min l (1+ mb))))
        ;; Generate a replacement for the matched substring.
        ;; Operate only on the substring to minimize string consing.
        ;; Set up match data for the substring for replacement;
        ;; presumably this is likely to be faster than munging the
        ;; match data directly in Lisp.
        (string-match regexp (setq str (substring string mb me)))
        (setq matches
              (cons (replace-match (if (stringp rep)
                                       rep
                                     (funcall rep (match-string 0 str)))
                                   fixedcase literal str subexp)

As you see, it first matches the (rest of the) string against REGEXP,
then takes the substring that matched, and matches that substring
again.  But the evident assumption that the match in the substring
will yield the same result is false.  In this case, the part of
"foo bar" that matches "^.\\| ." on the second iteration is " b", but
matching " b" on its own against the same regexp yields just " ":
the blank is now at the beginning of the string, so the first
alternative, "^.", matches.  So 'upcase' is applied to the blank
alone, and the "b" is left in lower case.
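
The two matching steps can be reproduced by hand.  What follows is only
a sketch of the second loop iteration, not code from subr.el; the
function name `rris-demo-rematch' is made up for illustration, and the
START value of 1 assumes the first iteration already consumed the
leading "f".

```elisp
;; Hypothetical helper, for illustration only: it mimics the second
;; iteration of the loop in replace-regexp-in-string for "foo bar".
(defun rris-demo-rematch ()
  (let* ((regexp "^.\\| .")
         (string "foo bar")
         ;; Step 1: search the whole string, starting after the "f".
         ;; "^" cannot match here, so only the " ." alternative applies.
         (mb (string-match regexp string 1))
         (me (match-end 0))
         ;; Step 2: isolate the matched text, as the loop does.
         (str (substring string mb me)))
    ;; Step 3: match the isolated text again.  The blank is now at the
    ;; start of the string, so "^." matches it before " ." is tried.
    (string-match regexp str)
    (list str (match-string 0 str))))

(rris-demo-rematch)
;; => (" b" " ")
```

The first element is what the search over the full string found, the
second is what the replacement function actually receives.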




