Re: fencepost error in encoding processing
From: Mike Gran
Subject: Re: fencepost error in encoding processing
Date: Sun, 15 Nov 2009 00:48:57 -0800 (PST)
> From: Ken Raeburn address@hidden
> iconv_t
> iconv_open (const char *tocode, const char *fromcode)
> {
> /* Normalize the name. We remove all characters beside alpha-numeric,
> '_', '-', '/', '.', and ':'. */
> ...
>
> If there's reason to believe that all these characters might show up in valid
> encoding names, we might want to borrow that list for scm_i_scan_for_encoding
> too. In fact, since we don't control the iconv implementation, we should
> probably be *at least* as lenient as glibc in accepting random characters.
You're probably right. A rather complete set of aliases for encoding names
can be found by exploring ICU's database of aliases at
http://demo.icu-project.org/icu-bin/convexp
Among those aliases, there are examples of the punctuation in the
iconv_open code snippet above (underscore, hyphen, forward slash, period,
and colon) as well as three others: equals sign, plus sign, and comma. The
specification for iconv_open doesn't place any limits on the characters
allowed as input to iconv_open; however, these eight symbols should cover
any case we're likely to see.
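
As a rough sketch of what such a check might look like (the helper names
below are hypothetical and this is not the actual scm_i_scan_for_encoding
code; it also assumes encoding names are plain ASCII):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only.  Return 1 if C may appear
   in an encoding name: an ASCII letter or digit, or one of the eight
   symbols discussed above ('_', '-', '/', '.', ':', '=', '+', ',').  */
static int
encoding_name_char_p (int c)
{
  if ((c >= 'a' && c <= 'z')
      || (c >= 'A' && c <= 'Z')
      || (c >= '0' && c <= '9'))
    return 1;

  /* strchr would match the string's terminator for c == '\0', so
     exclude that case explicitly.  */
  return c != '\0' && strchr ("_-/.:=+,", c) != NULL;
}

/* Hypothetical helper.  Return the length of the leading run of
   acceptable encoding-name characters in the NUL-terminated string S.  */
static size_t
encoding_name_span (const char *s)
{
  size_t n = 0;

  while (encoding_name_char_p ((unsigned char) s[n]))
    n++;
  return n;
}
```

Explicit ASCII ranges are used here instead of isalnum so that the result
does not depend on the current locale; whether that matters for the real
scanner is an open question.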
>
> Okay to check in?
FWIW, it looks right to me.
-Mike Gran