bug#20707: [PROPOSED PATCH] Use curved quoting in C-generated errors
From: Eli Zaretskii
Subject: bug#20707: [PROPOSED PATCH] Use curved quoting in C-generated errors
Date: Mon, 01 Jun 2015 21:29:05 +0300

> Date: Mon, 01 Jun 2015 10:55:44 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: 20707@debbugs.gnu.org
>
> On 06/01/2015 07:34 AM, Eli Zaretskii wrote:
> > They use UTF-8 encoded characters, and so will require a suitable
> > 'coding:' cookie in the affected files, or some equivalent setting
> > (perhaps in .dir-locals.el?), otherwise they might not be decoded
> > correctly in non-UTF-8 locales.
>
> Good point, and this raises a related issue with all text files in the
> Emacs repository that aren't ASCII. The way I've been testing this sort
> of thing is to visit a source file in a Latin-9 locale, and if it is
> correctly decoded as UTF-8 then I don't bother with adding a coding:
> cookie.

I'm not sure testing this in a single non-UTF-8 locale is enough.

> To be honest I've been hoping that use of non-UTF-8 locales
> would be dying off among Emacs developers, so that we wouldn't need to
> worry about sprinkling coding: cookies everywhere.

There are no UTF-8 locales on Windows, and I don't think there are any
plans to introduce them.

Perhaps teaching .dir-locals.el to support directory-wide default
encoding is the best approach.

> > Doing so might remove at least part of the need for using the u8
> > qualifier, I think.
>
> The u8 prefix is for C compilers, not for Emacs, and the C compilers
> won't know about coding: cookies.

Right, but why do we want u8 there? Isn't it to make sure the string
ends up in UTF-8 in the binary when the source code is encoded in
something other than UTF-8?

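
As a minimal sketch of what the u8 prefix guarantees in C11: the
literal below is stored as UTF-8 bytes whatever the compiler's source
or execution character set is. The message text and the helper
is_utf8_left_quote are hypothetical illustrations, not taken from the
Emacs sources, and the curved quotes are spelled as universal
character names so that the source file itself stays pure ASCII.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error format, not from the Emacs sources.  The u8
   prefix asks the compiler to store the literal as UTF-8;
   \u2018 and \u2019 are the left and right single curved quotes.  */
static const char quoted_msg[] = u8"Invalid argument \u2018%s\u2019";

/* Return 1 if the three bytes at P are E2 80 98, the UTF-8 encoding
   of U+2018 LEFT SINGLE QUOTATION MARK, and 0 otherwise.  */
static int is_utf8_left_quote(const char *p)
{
  assert(p != NULL);
  return memcmp(p, "\xE2\x80\x98", 3) == 0;
}
```

With this literal, is_utf8_left_quote (quoted_msg + 17) should hold,
since the 17 ASCII bytes of "Invalid argument " come first and each
curved quote occupies three bytes.
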
> Come to think of it, though, perhaps we can dispense with u8. As far as
> I know u8 is needed only for MS-Windows compilers when the source code
> is in UTF-16 or suchlike but you want the string to be UTF-8.

We no longer support Microsoft compilers anyway.

> > 'error' calls 'verror', which calls 'make_string' to actually produce
> > the message string. However, 'make_string' is not reliable enough wrt
> > whether it produces unibyte or multibyte strings
>
> Hmm, why isn't make_string reliable enough? If the string is validly
> encoded UTF-8 (a safe assumption here), then make_string should produce
> a unibyte string if it's ASCII only, and a multibyte string otherwise,
> and either way the string value should be OK. What am I missing?

The convoluted logic of deciding when to produce a multibyte string
has sometimes surprised me in the past, so I tend not to trust it. We
know what we want in this case, so why not use make_multibyte_string
directly?
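
To make the distinction concrete, here is a simplified standalone
model of the rule described in the quoted text (ASCII only gives
unibyte, anything else gives multibyte). It is only a sketch: the
names utf8_char_count and would_be_multibyte are hypothetical, the
input is assumed to be valid UTF-8, and the real make_string logic in
Emacs has more cases than this.

```c
#include <assert.h>
#include <stddef.h>

/* Count the characters in NBYTES bytes at S, assumed to be valid
   UTF-8, by counting every byte that is not a continuation byte
   (bit pattern 10xxxxxx).  */
static size_t utf8_char_count(const char *s, size_t nbytes)
{
  assert(s != NULL);
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    if (((unsigned char) s[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}

/* Simplified model of the heuristic: if the character count equals
   the byte count the text is ASCII only and a unibyte string would
   do; otherwise a multibyte string is needed.  */
static int would_be_multibyte(const char *s, size_t nbytes)
{
  return utf8_char_count(s, nbytes) != nbytes;
}
```

A caller that already knows its message contains curved quotes has no
need for such a guess, which is the argument for requesting a
multibyte string explicitly.
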