bug-gnu-emacs
bug#20707: [PROPOSED PATCH] Use curved quoting in C-generated errors


From: Eli Zaretskii
Subject: bug#20707: [PROPOSED PATCH] Use curved quoting in C-generated errors
Date: Mon, 01 Jun 2015 21:29:05 +0300

> Date: Mon, 01 Jun 2015 10:55:44 -0700
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: 20707@debbugs.gnu.org
> 
> On 06/01/2015 07:34 AM, Eli Zaretskii wrote:
> >  They use UTF-8 encoded characters, and so will require a suitable 
> > 'coding:' cookie in the affected files, or some equivalent setting 
> > (perhaps in .dir-locals.el?), otherwise they might not be decoded 
> > correctly in non-UTF-8 locales.
> 
> Good point, and this raises a related issue with all text files in the 
> Emacs repository that aren't ASCII.  The way I've been testing this sort 
> of thing is to visit a source file in a Latin-9 locale, and if it is 
> correctly decoded as UTF-8 then I don't bother with adding a coding: 
> cookie.

I'm not sure testing this in a single non-UTF-8 locale is enough.

> To be honest I've been hoping that use of non-UTF-8 locales 
> would be dying off among Emacs developers, so that we wouldn't need to 
> worry about sprinkling coding: cookies everywhere.

There are no UTF-8 locales on Windows, and I don't think there are any
plans to introduce them.

Perhaps teaching .dir-locals.el to support directory-wide default
encoding is the best approach.

> > Doing so might remove at least part of the need for using the u8 
> > qualifier, I think.
> 
> The u8 prefix is for C compilers, not for Emacs, and the C compilers 
> won't know about coding: cookies.

Right, but why do we want u8 there?  Isn't it to make sure the string
ends up in UTF-8 in the binary when the source code is encoded in
something other than UTF-8?
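
To make the question concrete, here is a minimal C11 sketch of what the
u8 prefix guarantees: the bytes stored in the binary are the UTF-8
encoding of the character, whatever execution character set the
compiler would otherwise use.  The names left_quote, left_quote_len and
is_utf8_left_quote are hypothetical and do not come from the Emacs
sources or from the proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* U+2018 LEFT SINGLE QUOTATION MARK.  The u8 prefix asks the compiler
   to store the literal as UTF-8 (bytes E2 80 98 plus a trailing NUL).
   The name is hypothetical, for illustration only.  */
static const unsigned char left_quote[] = u8"\u2018";

/* Return the number of bytes in the literal, excluding the NUL.  */
size_t
left_quote_len (void)
{
  return sizeof left_quote - 1;
}

/* Return nonzero if BUF (LEN bytes) is exactly the UTF-8 encoding
   of U+2018.  */
int
is_utf8_left_quote (const unsigned char *buf, size_t len)
{
  static const unsigned char expected[] = { 0xE2, 0x80, 0x98 };

  assert (buf != NULL);
  return len == sizeof expected && memcmp (buf, expected, len) == 0;
}
```

Writing the character as the escape \u2018 keeps the source file itself
ASCII, so the result does not depend on how the compiler decodes the
source.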

> Come to think of it, though, perhaps we can dispense with u8.  As far as 
> I know u8 is needed only for MS-Windows compilers when the source code 
> is in UTF-16 or suchlike but you want the string to be UTF-8.

We no longer support Microsoft compilers anyway.

> > 'error' calls 'verror', which calls 'make_string' to actually produce 
> > the message string. However, 'make_string' is not reliable enough wrt 
> > whether it produces unibyte or multibyte strings
> 
> Hmm, why isn't make_string reliable enough?  If the string is validly 
> encoded UTF-8 (a safe assumption here), then make_string should produce 
> a unibyte string if it's ASCII only, and a multibyte string otherwise, 
> and either way the string value should be OK. What am I missing?

The convoluted logic of deciding when to produce a multibyte string
has surprised me in the past, so I tend not to trust it.  We know what
we want in this case, so why not use make_multibyte_string directly?
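
As an illustration of the kind of guessing I mean, here is a simplified
standalone model in C.  It is not the Emacs implementation of
make_string: the names count_utf8_chars and guess_string_kind are
hypothetical, and the UTF-8 check is deliberately loose (it does not
reject overlong forms or surrogates).  It only shows how a
content-based heuristic can return either kind of string for the same
call, depending on the bytes it happens to receive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum string_kind { UNIBYTE, MULTIBYTE };

/* Hypothetical helper: if S (NBYTES bytes) looks like well-formed
   UTF-8, store its character count in *NCHARS and return true.
   This is a loose structural check, not a full validator.  */
static bool
count_utf8_chars (const unsigned char *s, size_t nbytes, size_t *nchars)
{
  size_t i = 0, n = 0;

  while (i < nbytes)
    {
      unsigned char c = s[i];
      size_t len;

      if (c < 0x80)
        len = 1;
      else if ((c & 0xE0) == 0xC0)
        len = 2;
      else if ((c & 0xF0) == 0xE0)
        len = 3;
      else if ((c & 0xF8) == 0xF0)
        len = 4;
      else
        return false;           /* stray continuation or invalid lead byte */

      if (i + len > nbytes)
        return false;           /* sequence truncated */
      for (size_t k = 1; k < len; k++)
        if ((s[i + k] & 0xC0) != 0x80)
          return false;         /* missing continuation byte */

      i += len;
      n++;
    }

  *nchars = n;
  return true;
}

/* Hypothetical model of a content-based decision: ASCII-only or
   malformed input is treated as unibyte, anything else as multibyte.  */
enum string_kind
guess_string_kind (const char *contents, size_t nbytes)
{
  size_t nchars;

  assert (contents != NULL);
  if (!count_utf8_chars ((const unsigned char *) contents, nbytes, &nchars))
    return UNIBYTE;
  return nchars == nbytes ? UNIBYTE : MULTIBYTE;
}
```

In this model a lone Latin-1 byte such as "\xE9" comes back as unibyte
while "\xE2\x80\x98" comes back as multibyte, so the caller cannot tell
which representation it will get without inspecting the data.  Asking
for a multibyte string explicitly avoids the guess.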
