guile-devel

Re: mutable interfaces - was: Guile: What's wrong with this?


From: Mark H Weaver
Subject: Re: mutable interfaces - was: Guile: What's wrong with this?
Date: Sat, 07 Jan 2012 13:30:33 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)

Bruce Korb <address@hidden> writes:

> On 01/07/12 08:13, Mark H Weaver wrote:
>>> Most of the strings that I wind up altering are created with a
>>> scm_from_locale_string() C function call.
>>
>> BTW, beware that scm_from_locale_string() is only appropriate for
>> strings that came from the user (e.g. command-line arguments, reading
>> from a port, etc).  When converting string literals from your own source
>> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>>
>> Similarly, to make symbols from C string literals, use
>> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>>
>> Caveat: these functions did not exist in Guile 1.8.  If your C string
>> literals are ASCII-only, I guess it won't matter in practice which
>> function you use, although it would be good to spread the understanding
>> that C string literals should not be interpreted according to the user's
>> locale.
>
> I go back to my argument that a facilitation language needs to focus
> on being as helpful as possible.  That means doing what is likely
> wanted instead of throwing errors at every possibility.  It also means
> not changing interfaces.

Sorry, but there's no way to maintain backward compatibility here.  I
know it's a pain, but there's no getting around the fact that in order
to write proper internationalized code, we now need to think carefully
about what encoding a particular string is in.  There's no automatic way
to handle this, not even in principle.

Fortunately, most modern GNU/Linux systems default to a UTF-8 locale, in
which case scm_from_locale_string and scm_from_utf8_string behave
identically.  However, some systems still use a non-UTF-8 locale, and
we must strive to support them properly.
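
To make the difference concrete, here is a minimal sketch in plain C (no
Guile involved) of how the same character is stored differently in
Latin-1 and UTF-8.  The helper name latin1_to_utf8 is hypothetical and
exists only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated Latin-1 string to UTF-8.
   `out` must have room for 2 * strlen(in) + 1 bytes.
   Returns the number of bytes written, not counting the final NUL. */
static size_t latin1_to_utf8(const char *in, char *out)
{
    size_t n = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)in; *p != '\0'; p++) {
        if (*p < 0x80) {
            /* ASCII is encoded identically in both encodings. */
            out[n++] = (char)*p;
        } else {
            /* Latin-1 0x80..0xFF maps to a two-byte UTF-8 sequence. */
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the Latin-1 byte 0xE9 (e with acute accent) becomes the two
bytes 0xC3 0xA9 in UTF-8, so handing Latin-1 bytes to a decoder that
expects UTF-8 (or the reverse) cannot give the intended text.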

> Anyway, this then?  (abbreviated)
>
> #if   GUILE_VERSION < 107000
> # define AG_SCM_STR02SCM(_s)          scm_makfrom0str(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_mem2string(_st,_sz)
>
> #elif   GUILE_VERSION < 200000
> # define AG_SCM_STR02SCM(_s)          scm_from_locale_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_locale_stringn(_st,_sz)
>
> #elif   GUILE_VERSION < 200004
> #error "autogen does not work with this version of guile"
>   choke me.

This last clause is wrong.  scm_from_utf8_string and
scm_from_utf8_stringn were already available in Guile 2.0.0, so there
is no need to reject versions between 2.0.0 and 2.0.4.

> #else
> # define AG_SCM_STR02SCM(_s)          scm_from_utf8_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_utf8_stringn(_st,_sz)
> #endif

Just remember that this change implies that these macros should only be
used for C string literals, and must _not_ be used for strings supplied
by the user (e.g. command-line arguments and I/O).

It could very well be that you're currently overloading these macros
for both purposes, in which case you should split this pair of macros
into two distinct pairs: one pair of macros for user strings (keep using
scm_from_locale_string{,n} for these), and one pair for C string
literals (use scm_from_utf8_string{,n} for Guile 2.0.0 or newer).

Then look at each use of these old overloaded macros in your code, and
figure out whether it's operating on a string that came from the user or
a string that came from your own source code.
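
As a rough sketch of that split, the two pairs might look like the
following.  The macro names AG_SCM_USERSTR* and AG_SCM_LITSTR* are
hypothetical, chosen only for this illustration, and the fallback
definition of GUILE_VERSION is an assumption so that the fragment stands
alone; in a real build it would come from your configure machinery.

```c
/* Header fragment; assumes <libguile.h> is included before the macros
   are actually used. */
#ifndef GUILE_VERSION
# define GUILE_VERSION 200000   /* assumption for this sketch only */
#endif

#if   GUILE_VERSION < 107000
/* Strings supplied by the user (command line, I/O). */
# define AG_SCM_USERSTR02SCM(_s)      scm_makfrom0str(_s)
# define AG_SCM_USERSTR2SCM(_st,_sz)  scm_mem2string(_st,_sz)
/* C string literals from our own source code. */
# define AG_SCM_LITSTR02SCM(_s)       scm_makfrom0str(_s)
# define AG_SCM_LITSTR2SCM(_st,_sz)   scm_mem2string(_st,_sz)

#elif GUILE_VERSION < 200000
/* The UTF-8 conversions do not exist in 1.8; use the locale ones for both. */
# define AG_SCM_USERSTR02SCM(_s)      scm_from_locale_string(_s)
# define AG_SCM_USERSTR2SCM(_st,_sz)  scm_from_locale_stringn(_st,_sz)
# define AG_SCM_LITSTR02SCM(_s)       scm_from_locale_string(_s)
# define AG_SCM_LITSTR2SCM(_st,_sz)   scm_from_locale_stringn(_st,_sz)

#else
/* Guile 2.0.0 or newer: user strings follow the locale ... */
# define AG_SCM_USERSTR02SCM(_s)      scm_from_locale_string(_s)
# define AG_SCM_USERSTR2SCM(_st,_sz)  scm_from_locale_stringn(_st,_sz)
/* ... while literals are decoded as UTF-8, independent of the locale. */
# define AG_SCM_LITSTR02SCM(_s)       scm_from_utf8_string(_s)
# define AG_SCM_LITSTR2SCM(_st,_sz)   scm_from_utf8_stringn(_st,_sz)
#endif
```

With two sets of names, each call site has to state which kind of string
it is converting, which is exactly the decision the audit is meant to
force.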

Again, I stress that this is not specific to Guile.  All software,
if it wishes to be properly internationalized, needs to think about
where a string came from.  In general, your program's source code (and
thus the C string literals it contains) will have a different encoding
than C strings that come from the user.  C strings of different
encodings are essentially of different types (even though C's type
system is too crude to distinguish them), and you must treat them as
such.
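
One way to get some help from the compiler, if you want it, is to wrap
the pointers in distinct struct types.  This is only a sketch of the
idea; the type and function names (utf8_cstr, locale_cstr, and so on)
are hypothetical and not part of Guile or any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tag types: same representation, different meaning. */
typedef struct { const char *bytes; } utf8_cstr;    /* known to be UTF-8 */
typedef struct { const char *bytes; } locale_cstr;  /* in the user's locale */

/* Tag a string literal from our own (UTF-8 encoded) source code. */
static inline utf8_cstr utf8_literal(const char *s)
{
    utf8_cstr r = { s };
    return r;
}

/* Tag a string that came from the user (argv, input, environment). */
static inline locale_cstr locale_input(const char *s)
{
    locale_cstr r = { s };
    return r;
}

/* Accepts only UTF-8 tagged strings; passing a locale_cstr here is a
   compile-time type error rather than a silent mix-up. */
static inline size_t utf8_byte_length(utf8_cstr s)
{
    return strlen(s.bytes);
}
```

A call such as utf8_byte_length(locale_input(argv[1])) is then rejected
by the compiler, whereas with bare char pointers the mistake would go
unnoticed.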

      Mark


