Re: bug#23425: master branch: `message' wrongly corrupts ' to curly quote.
From: Alan Mackenzie
Subject: Re: bug#23425: master branch: `message' wrongly corrupts ' to curly quote.
Date: Mon, 5 Jun 2017 16:27:37 +0000
User-agent: Mutt/1.5.24 (2015-08-30)
Hello, Paul.
On Sun, Jun 04, 2017 at 14:01:42 -0700, Paul Eggert wrote:
> Alan Mackenzie wrote:
> > We have moved from a state where everybody knew what
> > `message' did (in Emacs 24), to one with wild special characters which
> > only apply sometimes, and necessitate crazy prolix formulations to work
> > around unwanted translations of quote characters.
> This exaggerates somewhat. We moved from Emacs 24 where only % is special, to
> Emacs 25 where %, ` and ' are special.
Yes. We moved from regularity (where %x, for varying x, and nothing
else was special) to a ragbag (where there are 3 special characters,
with the two "new" ones being syntactically totally different from %).
> Although some people don't know that ` and ' are special, that's also
> true for %.
No. _Anybody_ who's used `message' knows that %s is how you print out
an arbitrary sexp. Anybody who's used printf in C knows this, too. It
is very easy not to know that ` and ' are special, and horribly easy to
get caught out by it, as happened to me.
> And although it can be annoying to write (message "%s" STR) to avoid
> unwanted translation of STR, that annoyance was already present for %.
It is not merely annoying, it is hideously irregular. Having to write
(message "%s" (format "...." arg1 arg2 ....)) screams out "we didn't
think this through properly". A call to message should only need one
format string. The change I am proposing would achieve this.
This was never the case for %. It is and always was trivially easy to
cause a literal % sign to be output by message, and there was never
danger of confusion in this.
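
To make the contrast concrete, here is a minimal sketch of the behaviour being discussed. It assumes Emacs 25 or later; `my-quote-demo' is a hypothetical name, and `text-quoting-style' is bound to `curve' explicitly so that the result does not depend on what the display can show.

```elisp
;; Sketch only: assumes Emacs 25 or later.  `text-quoting-style' is
;; bound explicitly so the result does not depend on the display.
(defun my-quote-demo ()                 ; hypothetical name
  "Return three strings produced by `message' under the curve style."
  (let ((text-quoting-style 'curve))
    (list
     ;; The ' in the format string is translated to a curved quote.
     (message "Can't open %s" "foo.txt")
     ;; The workaround: format first, then pass the result as data
     ;; for %s, which is not translated.
     (message "%s" (format "Can't open %s" "foo.txt"))
     ;; A literal % needs only the familiar doubling.
     (message "%d%% done" 50))))

;; (my-quote-demo)
;; ⇒ ("Can’t open foo.txt" "Can't open foo.txt" "50% done")
```

The second call is the two-format-string formulation objected to above; the third shows that escaping % never needed a second string.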
We also have `format' and `format-message' which handle format strings
inconsistently. (Yes, I know, `format-message' was introduced
deliberately to create this inconsistency, because `format' was no
longer able to cope on its own.)
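
As an illustration of that inconsistency, the following sketch passes the same format string to both functions. It again assumes Emacs 25 or later and binds `text-quoting-style' to `curve'; `my-format-pair' is a hypothetical name.

```elisp
;; Sketch only: assumes Emacs 25 or later.
(defun my-format-pair ()                ; hypothetical name
  "Return the results of `format' and `format-message' on one string."
  (let ((text-quoting-style 'curve)
        (fmt "`%s' isn't bound"))
    (list
     ;; `format' leaves ` and ' in the format string alone.
     (format fmt 'foo)
     ;; `format-message' translates them according to the style.
     (format-message fmt 'foo))))

;; (my-format-pair)
;; ⇒ ("`foo' isn't bound" "‘foo’ isn’t bound")
```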
> > it makes sense to shift this burden over to the use cases where the
> > programmers need quote translation, and hence will be aware of it.
> When text-quoting-style specifies translation, most instances of ` and ' in
> Emacs messages are better off translated. So it also makes sense to translate
> by default in this situation, with a way to avoid translation in the rare
> cases where translation isn't wanted.
I disagree with this, of course. Translating behind people's backs is
not a friendly thing to do. Translation should only be done where it is
explicitly requested.
> The question is about which approach makes more sense, not whether one
> approach is sensible and the other nonsense.
OK.
> >> although it simplifies ‘message’ (obviously), this is at the price of
> >> complicating everything else.
> > What is the "everything else" that gets thus complicated?
> I was referring to the hassle of going through hundreds or thousands of
> message strings or calls, deciding which instances of ` and ' should be
> replaced with %` and %', and replacing the instances accordingly.
Yes. There are quite a lot, but not an unmanageable number.
> It's also possible that at times we'll need two format strings instead
> of one, complicating the code.
We need two strings instead of one at the moment, with (message "%s"
(format "..." .....)). With %` and %' we'd only need one string in each
message invocation. This is simplification.
Can you give an example of something which might need two strings?
> > There are around 17,000 occurrences of "message" in our Lisp
> > sources, and probably a few in our C sources. Only (some of) those
> > containing the quote characters in the format string would need
> > amendment. These will comprise a tiny portion of these ~17,000
> How many lines do you think will be in that "tiny portion"? No matter how you
> count them, it'll be quite a few changes.
By searching for the regexp
"(\\(message\\|error\\)\\s +\\([^\"]\\|\"\\(\\\\.\\|[^\"\\]\\)*[`']\\)",
i.e. an invocation of message or error followed by either something
which isn't a literal string, or a literal string containing ` or ', I
get 2745 matches in our Lisp sources. There'll be a smaller number also
in our C sources. I would have to enhance that regexp to recognise
comments, and maybe a few other things, but 2745 is a good first
approximation.
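
A small sketch of what that regexp does and does not flag may help. The regexp is copied verbatim from the text above; the names `my-quote-candidate-regexp' and `my-quote-candidate-p' and the sample strings are hypothetical, and the syntax class `\s ' is assumed to be evaluated in a buffer where the space character has whitespace syntax, as it does by default.

```elisp
;; Sketch only.  The regexp is the one quoted in the message; the
;; names and the sample inputs are made up for illustration.
(defconst my-quote-candidate-regexp
  "(\\(message\\|error\\)\\s +\\([^\"]\\|\"\\(\\\\.\\|[^\"\\]\\)*[`']\\)"
  "Match a call to message or error whose first argument is either
not a literal string, or a literal string containing ` or '.")

(defun my-quote-candidate-p (text)
  "Return t if TEXT contains a call flagged by the regexp, else nil."
  (and (string-match-p my-quote-candidate-regexp text) t))

(list
 ;; Literal string containing ': flagged.
 (my-quote-candidate-p "(message \"Can't open %s\" file)")
 ;; First argument is not a literal string: flagged.
 (my-quote-candidate-p "(error fmt arg)")
 ;; Literal string with no quote characters: not flagged.
 (my-quote-candidate-p "(message \"Done\")"))
;; ⇒ (t t nil)
```

As noted above, the pattern does not yet recognise comments, so a count obtained with it is only a first approximation.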
A very great number of these are "(error ..." handlers in condition-case
forms. A great number of those remaining could be simply and
mechanically translated, for example "don't", "can't", "couldn't", etc.,
and a lot of "`%s'"s and "`%S'"s.
I estimate there will be a few hundred forms remaining which need
decision making to adapt them. For example, where message is used in
macros, and the format string is a macro parameter.
> > and can be found easily enough with a script
> I'm afraid not, because in many cases the string is not a simple literal
> constant argument to the message function. For starters, there's also the
> error function; that's another 14,000 text matches in the Elisp source --
> many of them false alarms of course, but not all of them.
See above.
> I'm not saying this sort of change is impossible. It's just that it'd be
> quite a bit of work, work that someone would need to volunteer to do. Is
> this really the best use of our limited resources?
Clearly, that someone would have to be me. The consequences of
surreptitious unwanted translation are so severe that I think this would
indeed be a good use of resources.
--
Alan Mackenzie (Nuremberg, Germany).