emacs-devel

Re: Rationale for split-string?


From: Luc Teirlinck
Subject: Re: Rationale for split-string?
Date: Mon, 21 Apr 2003 22:26:01 -0500 (CDT)

Miles Bader wrote:

   I think Stephen's formulation is very natural, in that you usually
   want OMIT-NULLS to be t if you're splitting on a non-whitespace
   string.

First of all, I am not worried about Stephen's formulation being
unnatural (although the original formulation actually would produce
unnatural results in the default case), but about it breaking existing
code.

I believe you are underestimating the level of generality of
split-string and the wild heterogeneity of its applications.  It is by
no means true that, outside the whitespace case, you always want to
keep all null matches.  If SEPARATORS is a "terminator character",
say newline, then a null match at the beginning counts.
There is no reason you would start the string with a terminator other
than to explicitly terminate an empty string.  The empty match at the
end does not count, because the terminator at that place just
terminates the previous match.  This is, for instance, how you would
want to split a buffer, or a file, or user input, into lines.  The way
you implement that with the current split-string is to first check for
an initial terminator and, if there is one, prepend an empty string to
the split-string output.  With the proposed new split-string, you
delete the empty match at the end from the split-string output.  That
is actually easier.  However...
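
The two approaches to line splitting described above can be compared
in a small model.  This is a sketch in Python, not Emacs Lisp, and it
rests on assumptions: `split_current` imitates only the behaviour
described in this message (one null string dropped at each end,
interior ones kept), `split_proposed` keeps every null string, and all
function names are hypothetical.

```python
import re


def split_current(string, separators):
    """Model of the existing behaviour: drop one null string at each end."""
    parts = re.split(separators, string)
    if parts and parts[0] == "":
        parts = parts[1:]
    if parts and parts[-1] == "":
        parts = parts[:-1]
    return parts


def split_proposed(string, separators):
    """Model of the proposed behaviour: keep every null string."""
    return re.split(separators, string)


def lines_with_current(text):
    """Split into lines; restore the empty first line that was dropped."""
    lines = split_current(text, "\n")
    if text.startswith("\n"):
        lines.insert(0, "")
    return lines


def lines_with_proposed(text):
    """Split into lines; discard the null string after the final newline."""
    lines = split_proposed(text, "\n")
    if lines and lines[-1] == "":
        lines.pop()
    return lines


text = "\nfoo\n\nbar\n"
print(lines_with_current(text))
print(lines_with_proposed(text))
```

Under these assumptions both helpers return the same list of lines for
the sample text; they differ only in which end of the result needs
fixing up.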

The "however" is that we are not defining a *new* function but
*re*defining an *existing* function, an often used and extremely
general existing function.  That is all but guaranteed to produce a
wild variety of bugs.

In fact let us assume, for the sake of argument, that Stephen and you
are 100% right.  That would mean that any correct existing code, using
the present Emacs split-string with a non-nil SEPARATORS, checks for
empty matches at the beginning and end and adds any such matches to
the split-string output to correct the "bug" in the present
split-string.  After Stephen's change, any empty match at the
beginning or end of the string will produce not one, but two empty
strings in such code: one returned by split-string itself and one
added by the caller's workaround.
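
The doubling can be illustrated with a hypothetical caller written for
the existing behaviour and then run against a split that keeps null
strings.  As before, this is a Python model rather than Emacs Lisp;
`split_proposed` and `legacy_lines` are made-up names, and the caller
compensates only for a leading terminator, as in the line-splitting
case.

```python
import re


def split_proposed(string, separators):
    """Model of the proposed behaviour: keep every null string."""
    return re.split(separators, string)


def legacy_lines(text, split):
    """Hypothetical existing caller: assumes SPLIT drops a leading null
    string, so it prepends one whenever TEXT starts with a newline."""
    lines = split(text, "\n")
    if text.startswith("\n"):
        lines.insert(0, "")
    return lines


result = legacy_lines("\nfoo\n", split_proposed)
print(result)  # two empty strings at the front instead of one
```

The caller's workaround and the new split each contribute an empty
string, so unchanged code silently gains an extra element.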

Sincerely,

Luc.




