[GNUnet-developers] encoding: normalization [Was: Re: Music insertion]
From: Christian Grothoff
Subject: [GNUnet-developers] encoding: normalization [Was: Re: Music insertion]
Date: Sun, 5 Dec 2004 15:18:17 -0500
User-agent: KMail/1.7.1
On Saturday 04 December 2004 17:20, Alexander Winston wrote:
> Unicode provides 4 normalization forms
> (<http://www.unicode.org/reports/tr15/>):
>
> * Normalization Form D (NFD)
> * Normalization Form C (NFC)
> * Normalization Form KD (NFKD)
> * Normalization Form KC (NFKC)
>
> Given the nature of GNUnet, I suggest normalizing all the proposed
> keywords using NFC and NFKC, removing the duplicate keywords, and then
> adding the remaining keywords.
>
> I still have little experience with normalization, however, so please
> take this advice with a grain of salt.
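
The proposal quoted above (normalize every keyword with NFC and NFKC, drop the duplicates, add what remains) could look roughly like the following. This is only a sketch in Python using the standard unicodedata module, not GNUnet code; the function name normalized_keywords and the sample keywords are made up for illustration.

```python
import unicodedata


def normalized_keywords(keywords):
    """Return the NFC and NFKC forms of each keyword, without duplicates.

    Hypothetical helper: the order of first appearance is preserved so the
    result is deterministic.
    """
    seen = set()
    result = []
    for keyword in keywords:
        for form in ("NFC", "NFKC"):
            candidate = unicodedata.normalize(form, keyword)
            if candidate not in seen:
                seen.add(candidate)
                result.append(candidate)
    return result


# Sample input (illustrative only):
#   "Cafe" + combining acute accent, precomposed "Café", and "film"
#   written with the U+FB01 "fi" ligature.
proposed = ["Cafe\u0301", "Caf\u00e9", "\ufb01lm"]
print(normalized_keywords(proposed))
```

Here the two spellings of "Café" collapse into one keyword, while the ligature spelling yields two keywords: NFC leaves the ligature alone and NFKC folds it to plain "film".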
Right. Even if we use UTF-8, we still have to think about normalization, and
I believe the issue fully applies to UTF-8 (after all, UTF-8 is just an
encoding of Unicode). Actually, it might be worse: if I recall correctly, there
are different UTF-8 encodings for some Unicode characters, so we would have the
normalization issue for Unicode *and* for UTF-8. If anyone has any
experience here, please speak up. I was thinking of using libiconv to
convert to UTF-8. Will this produce a canonical representation? Which one?
If not, is there some free code available that will do the canonicalization?
Christian
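
On the UTF-8 question, a small experiment may help. Under the UTF-8 definition in RFC 3629, each code point has exactly one valid byte encoding ("overlong" forms are invalid), so the ambiguity arises at the Unicode level: the same visible character can be written as different code point sequences, and a plain encoding conversion keeps whichever sequence it is given. As far as I know this also holds for iconv-style conversion, but that is an assumption worth checking against the libiconv documentation. The sketch below uses Python's standard library; canonical_utf8 and is_valid_utf8 are hypothetical helper names.

```python
import unicodedata

composed = "\u00e9"      # precomposed "é" (one code point)
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# Encoding alone does not normalize: the two spellings give different bytes.
assert composed.encode("utf-8") == b"\xc3\xa9"
assert decomposed.encode("utf-8") == b"e\xcc\x81"


def canonical_utf8(text):
    """Normalize to NFC first, then encode (hypothetical helper)."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


def is_valid_utf8(data):
    """Return True if the bytes decode under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# After normalization both spellings yield the same byte string.
assert canonical_utf8(composed) == canonical_utf8(decomposed)

# An overlong two-byte form of "/" is rejected by a strict decoder.
assert not is_valid_utf8(b"\xc0\xaf")
assert is_valid_utf8(b"/")
```

So a canonical byte representation seems to need two steps: pick a normalization form (NFC here, purely as an example) and apply it before or after converting to UTF-8.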