emacs-devel

Re: Emacs Lisp's future


From: David Kastrup
Subject: Re: Emacs Lisp's future
Date: Sun, 12 Oct 2014 18:50:42 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)

"Stephen J. Turnbull" <address@hidden> writes:

> Sigh.  It is *Emacs* that assumes the world is full of valid data,

Nonsense.  It would not need to _carefully_ _deal_ with data not fitting
an encoding if it assumed that.

It _carefully_ decodes non-representable data into a code page reserved
for non-representable data.  It will deal _properly_ with that data
while it is under control of its strings (not upper/lowercasing it or
mixing it up with other stuff) and will carefully repackage it when
encoding it.
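
A minimal sketch of that decode, preserve, re-encode cycle, written in Python rather than Elisp. It assumes Python's "surrogateescape" error handler as a stand-in: this is an analogy for the idea, not Emacs' internal representation. The sample bytes and variable names are made up for illustration.

```python
# Analogy only: Python's "surrogateescape" handler maps each undecodable
# byte 0xNN to the reserved code point U+DCNN. Emacs uses its own
# reserved range for raw bytes; the sample data here is hypothetical.
raw = b"caf\xc3\xa9 \xff\xfe end"   # valid UTF-8 plus two bytes invalid in UTF-8

# Decode without losing anything: invalid bytes land in the reserved range.
text = raw.decode("utf-8", errors="surrogateescape")
escaped = [c for c in text if "\udc80" <= c <= "\udcff"]

# Ordinary string operations do not touch the escaped bytes:
# they have no case mapping, so upper() leaves them as they are.
upper = text.upper()

# Encoding repackages the escaped bytes exactly as they came in.
roundtrip = text.encode("utf-8", errors="surrogateescape")
assert roundtrip == raw
```

The point of the design is that nothing has to be decided at decode time: the string stays usable and the original bytes stay recoverable.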

As a consequence, it is easy to apply _any_ strategy to your data.  If
you want to clean out characters that are invalid for your application,
a regexp pattern with the appropriate positive or negative character
ranges will deal with them reliably.
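
To illustrate the regexp approach, here is a small sketch, again in Python and again only by analogy: it assumes undecodable bytes were decoded with "surrogateescape" and therefore sit in U+DC80..U+DCFF. The function names and the "allowed" character set are hypothetical choices for this example.

```python
import re

# Positive range: matches exactly the code points that stand for
# undecodable bytes (an assumption of this sketch, see above).
ESCAPED_BYTE = re.compile("[\udc80-\udcff]")

# Negative range: matches everything outside what a hypothetical
# application considers valid (lowercase ASCII letters and spaces).
NOT_ALLOWED = re.compile("[^a-z ]")

def decode_keeping_bytes(raw: bytes) -> str:
    """Decode UTF-8, mapping invalid bytes into the reserved range."""
    return raw.decode("utf-8", errors="surrogateescape")

def strip_escaped(text: str) -> str:
    """Drop only the characters that represent invalid bytes."""
    return ESCAPED_BYTE.sub("", text)

def keep_allowed(text: str) -> str:
    """Drop everything the application does not explicitly allow."""
    return NOT_ALLOWED.sub("", text)

text = decode_keeping_bytes(b"abc\xff d\xc3\xa9f\x80")
without_bytes = strip_escaped(text)   # invalid bytes gone, "é" kept
only_allowed = keep_allowed(text)     # invalid bytes and "é" gone
```

Which of the two patterns is right depends on the application, which is exactly why the choice is left to it.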

> and happily shovels any hazmat it receives on to the next user or
> program without validation.

Emacs has no way to know what input is valid for the next user or
program.  An application programmed in Elisp may know, and it has _all_
the tools to deal _gracefully_ with it since Emacs' string processing
will _not_ get confused by data it decoded itself and will preserve all
information.

> And you're right, it *is* a security problem.  Not just denial of
> service, either.  You say that behavior is what Emacs users want, and
> maybe it is.  Because most of the time the data is "nearly" valid and
> the defects are "insignificant", and hardly a security problem.  It's
> the "worse is better" philosophy.[1]

No, it is the "clueless is useless" philosophy.  Don't second-guess
other systems.  Do your job properly, regardless of what is thrown at
you.  Don't be the weakest link in a chain.

Emacs cannot be a verification engine if it has no clue what it should
be verifying.  If you know what you want, you can get it.  Regardless of
what you want.

libunistring (which is what GUILE currently uses for UTF-8 processing)
has a _closed_ set of recovery strategies.  As it stands, it is useless
for implementing Emacs-like behavior because "encode invalid bytes into
something libunistring can deal with transparently" is not part of its
recovery strategies.  Once you _have_ a useful encoding into the space
of properly working strings, _any_ recovery strategy is easy to
implement on top of that.
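
A sketch of what "any recovery strategy on top" can look like.  It is Python, not Elisp or libunistring, and it assumes the lossless decoding is Python's "surrogateescape" handler (invalid byte 0xNN becomes U+DCNN); the three strategy functions are hypothetical names invented for this example.

```python
import re

# Assumption of this sketch: invalid bytes were decoded into U+DC80..U+DCFF.
ESCAPED = re.compile("[\udc80-\udcff]")

def decode_lossless(raw: bytes) -> str:
    """Decode UTF-8 without discarding or rejecting invalid bytes."""
    return raw.decode("utf-8", errors="surrogateescape")

def byte_value(ch: str) -> int:
    """Recover the original byte value from an escaped character."""
    return ord(ch) - 0xDC00

def replace_invalid(text: str) -> str:
    """Strategy 1: substitute U+FFFD for each invalid byte."""
    return ESCAPED.sub("\ufffd", text)

def reject_invalid(text: str) -> str:
    """Strategy 2: refuse the data, reporting the first offender."""
    m = ESCAPED.search(text)
    if m:
        raise ValueError(
            f"invalid byte 0x{byte_value(m.group()):02x} at index {m.start()}")
    return text

def show_invalid(text: str) -> str:
    """Strategy 3: make invalid bytes visible as \\xNN for inspection."""
    return ESCAPED.sub(lambda m: "\\x%02x" % byte_value(m.group()), text)

text = decode_lossless(b"ok\xffok")
```

Each strategy is a few lines because the hard part, getting the bytes into a string that behaves, is already done; adding a fourth strategy does not require touching the decoder.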

For a platform, being forced to a closed set of behaviors is an
extremely limiting choice.

> But the rest of the software development world is going in the
> opposite direction.  "In God we trust.  All others, present photo ID."
> Maybe they have figured something out?  Heck, even Emacs is moving in
> the direction of defending *itself* from invalid data in other ways
> (thank you, Ted Z!)

You don't need to defend yourself from something you are equipped to
deal with.

-- 
David Kastrup
