Re: [lmi] UTF-16 output from automated GUI test?


From: Vadim Zeitlin
Subject: Re: [lmi] UTF-16 output from automated GUI test?
Date: Thu, 20 Oct 2016 13:23:47 +0200

On Thu, 20 Oct 2016 01:14:29 +0000 Greg Chicares <address@hidden> wrote:

GC> I searched thus:
GC>   git log -S _setmode
GC> and the last change we made involving _setmode() was six years ago.
GC> 
GC> But I was sure we did something related to this recently...

 Your memory is much better than mine (but I'm afraid this is a rather
feeble compliment...) and has directly led you to the right answer.

GC> searching again and omitting the leading underscore:
GC> 
GC> commit b47c9d49177f6a57863184929d63e69fba735bb7
GC> Author: Gregory W. Chicares <address@hidden>
GC> Date:   Mon Aug 22 21:30:56 2016 +0000
GC> 
GC>     Force standard output streams to binary mode
GC>     
GC>     See:
GC>       http://lists.nongnu.org/archive/html/lmi/2016-08/msg00015.html
GC> 
GC> That change adds this code to a couple of main() functions:
GC> 
GC> +#if defined LMI_MSW
GC> +    // Force standard output streams to binary mode.
GC> +    setmode(fileno(stdout), O_BINARY);
GC> +    setmode(fileno(stderr), O_BINARY);
GC> +#endif // defined LMI_MSW
GC> 
GC> ...which contains no _O_WTEXT or _O_U16TEXT.

 Nevertheless, O_BINARY does result in UTF-16 output as I've just tested in
my simple example. Thinking about it, it's not really that surprising: the
string we pass to fputws() is a Unicode (wchar_t, which is UTF-16 under
MSW) string and O_BINARY apparently disables all conversions, including
those to the current code page (O_TEXT) or UTF-8 (O_U8TEXT), and not just
the end-of-line marker replacements.
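
 To illustrate the point, here is a minimal sketch (not code from lmi or
from my test) of what "no conversion at all" means for a UTF-16 string. The
helper name raw_utf16le_bytes() is hypothetical, and the sketch assumes
little-endian byte order, as on MSW: every ASCII character is followed by a
NUL byte, and the LF is left alone instead of being expanded to CR LF.

```cpp
#include <string>

// Hypothetical helper, for illustration only: return the bytes that end up
// in the output when UTF-16 code units are written without any conversion,
// low byte first (little-endian, which is what MSW uses).
std::string raw_utf16le_bytes(std::u16string const& s)
{
    std::string bytes;
    bytes.reserve(2 * s.size());
    for(char16_t c : s)
        {
        bytes.push_back(static_cast<char>(c & 0xFF));        // low byte
        bytes.push_back(static_cast<char>((c >> 8) & 0xFF)); // high byte
        }
    return bytes;
}
```

 For example, raw_utf16le_bytes(u"A\n") yields the four bytes 41 00 0A 00,
which is the kind of output a tool expecting plain ASCII chokes on.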

 So, as it stands, we can have either ASCII (or UTF-8) output with CR LF in
it or UTF-16 output with only LF. Unfortunately, I don't see any simple way
to make it work as you'd like, i.e. to output ASCII without CRs. Of course,
it could be done by explicitly converting the strings in the code, but this
is not very nice and is error-prone. The only global solution I see would
be to build in UTF-8 mode, which would do the conversions in wx itself, but
this is a big change with a lot of ramifications and I just don't see you
agreeing to it, or even considering it, in the near future.
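
 Just to show what the explicit conversion would involve, here is a rough
sketch. The function name to_ascii() and the choice of replacing anything
outside ASCII with '?' are assumptions made for this example only, nothing
like this exists in lmi. The result could then be written with a narrow
function such as fputs() to a stream left in binary mode, giving ASCII with
LF only, but every output call would have to remember to go through it.

```cpp
#include <string>

// Hypothetical helper, for illustration only: narrow a wide string to ASCII,
// replacing every character outside the 7-bit range with '?'.
std::string to_ascii(std::wstring const& w)
{
    std::string s;
    s.reserve(w.size());
    for(wchar_t c : w)
        {
        // Compare as unsigned so that the test is correct whether wchar_t
        // is signed or unsigned on the platform at hand.
        bool const is_ascii = static_cast<unsigned long>(c) < 0x80;
        s.push_back(is_ascii ? static_cast<char>(c) : '?');
        }
    return s;
}
```

 The need to call something like this at each output site is exactly what
makes this approach fragile.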

 Hence the only resolution I can see is to revert the commit above and live
with CRs in the output under MSW (which are anyhow "natural" there, as I
tried to argue before), as it's almost certainly preferable to having to
deal with UTF-16 instead of simple ASCII.

 Sorry for the lack of better ideas,
VZ

