Re[2]: [lmi] patch: compilation fix for wx 2.9


From: Vadim Zeitlin
Subject: Re[2]: [lmi] patch: compilation fix for wx 2.9
Date: Sat, 15 May 2010 17:01:44 +0200

On Sat, 15 May 2010 14:27:47 +0000 Greg Chicares <address@hidden> wrote:

GC> Instead, could you just give an example of how you'd detect the problem?

 To test whether a string consists only of 7-bit ASCII characters, you can
use wxString::IsAscii(). To test whether it consists entirely of characters
in ISO-8859-1 (a.k.a. Latin-1, which is close, although not identical, to
Windows CP-1252), you need to write your own function:

        bool IsLatin1(wxString const& s)
        {
            for ( wxString::const_iterator i = s.begin(); i != s.end(); ++i )
            {
                if ( (*i) > 0xff )
                    return false;
            }
        
            return true;
        }

And to test whether the string is convertible to a char* in the current
locale encoding without loss of information, you need to perform the
conversion and check that the result is not empty. Normally this requires
an extra check that the original string wasn't empty to begin with
(otherwise an empty result doesn't indicate a failure), but in the code
dealing with file names the strings should never be empty anyway, so it
should be enough to just use wxString::ToStdString() and check that the
result is not empty.
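
 As a rough illustration of the same "convert and check" idea without
wxWidgets, here is a minimal sketch using only the standard library. The
function name is hypothetical, and it works on std::wstring rather than
wxString. It relies on std::wcstombs() reporting failure explicitly, so an
empty input is not mistaken for a failed conversion:

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not part of lmi or wxWidgets: returns true if every
// character of 's' can be represented in the current C locale's multibyte
// encoding (whatever std::setlocale() last selected; "C" by default).
bool is_convertible_in_current_locale(std::wstring const& s)
{
    // Each wide character needs at most MB_CUR_MAX bytes; add one byte
    // for the terminating NUL so the buffer is never empty.
    std::vector<char> buf(s.size() * MB_CUR_MAX + 1);

    // std::wcstombs() returns (size_t)-1 if it meets a character that
    // has no representation in the current locale's encoding.
    std::size_t const n = std::wcstombs(&buf[0], s.c_str(), buf.size());
    return n != static_cast<std::size_t>(-1);
}
```

 Whether a given non-ASCII string passes depends entirely on the locale in
effect, which is exactly why this kind of check is less portable than a
plain ASCII test.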


 Now, I don't know which of these functions you need. I'd use IsAscii()
because this is what is universally guaranteed to work. You note below
that Latin-1 characters seem to work, but this is only the case as long as
the program runs in a locale which uses Latin-1 as its encoding. I strongly
suspect that your tests wouldn't work under Linux, for example, where
everybody uses UTF-8 now. Neither would they work on a machine with a
Chinese locale (and as I suspect there are quite a few Chinese speakers in
the USA as well, this is not a completely impossible situation).

 Of course, dealing with the problem of representing Spanish characters in
a string under a Chinese locale is exactly the kind of nightmare that
Unicode frees us from. There is really no excuse for working with non-ASCII
char* data nowadays; this is pure masochism. So IMO you should either stick
to 7-bit ASCII or just use Unicode/wxString.


GC> So it looks like today we can accommodate any Spanish name--but not
GC> El Greco's real name, or my Γιαγιά's, or any name that requires
GC> non-Latin characters, as long as MinGW libstdc++ lacks std::wstream.

 Sorry for repeating myself but, once again, this has nothing to do with
the lack of std::wstream. The "w" in the iostream class names refers only
to the char type used to represent the contents of the files they work
with, and has nothing to do with the files' names, which is all that we
care about here.
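
 To make the distinction concrete, here is a small sketch (the function
name is made up for illustration) in which the stream carries wchar_t
content while the file name is still passed as an ordinary narrow string,
which is the only form the standard constructor accepts:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes wide-character text to a file. The "w" in
// std::wofstream affects only the type of the data streamed through it;
// the name used to open the file is still a plain char string.
bool write_wide_text(std::string const& narrow_name, std::wstring const& text)
{
    std::wofstream out(narrow_name.c_str()); // narrow file name
    if (!out)
        return false;                        // could not open the file

    out << text;                             // wide content
    return !out.flush().fail();
}
```

 So even a toolchain with full wide-stream support would not, by that fact
alone, be able to open a file whose name cannot be expressed as a char*
in the current encoding.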

 IOW it is unlikely that this problem will be fixed any time soon in MinGW
because the C++ standard doesn't say anything about it (even in the latest
draft of C++0x there is nothing about opening files with Unicode names
AFAICS; at least 27.9.1.4 doesn't mention it). The only reasonable solution
would be to copy the Microsoft extension, but I'm far from sure that the
glibc folks are really going to do this. To be fair, there are decent
enough reasons not to do it too, but OTOH without it there is simply no way
to open a file with a Unicode name from C++ using the standard library.

 Regards,
VZ
