
[Unac-devel] problem encoding ß


From: mark warren bracher
Subject: [Unac-devel] problem encoding ß
Date: Wed, 04 Sep 2002 15:22:56 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1b) Gecko/20020904

I downloaded the latest unac library and the Text::Unaccent Perl module, and it all looks great.

I started throwing as much Spanish/French/German as I remember at it, and I've come up with one oddity. The German Eszett (or sz ligature; those are the only two names I know it by), ß, passes straight through any attempt to unaccent it. It should be encoded as

  ß -> ss

in much the same way that the ae ligature

  æ -> ae

is encoded as two characters. I grepped through UnicodeData-3.2.0.txt a bit, but couldn't find a reference to it (at least not under the names I am familiar with)...

Thoughts?  How would I go about patching this?
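
For reference, here is a minimal sketch in Python of how one might check the Unicode data for this character and prototype the expansion. It is not unac or Text::Unaccent code, and it says nothing about how unac builds its own tables. The names EXTRA_EXPANSIONS, describe, and unaccent are hypothetical, made up for this illustration. The Unicode character database lists ß as LATIN SMALL LETTER SHARP S (U+00DF) with an empty decomposition field, which may be why a search for the German names turns up nothing.

```python
import unicodedata

# Hypothetical override table (an assumption for this sketch, not part of unac):
# characters that have no decomposition in the Unicode data but that we
# still want to expand to plain ASCII letters.
EXTRA_EXPANSIONS = {
    "\u00df": "ss",  # LATIN SMALL LETTER SHARP S
    "\u00e6": "ae",  # LATIN SMALL LETTER AE
}


def describe(ch):
    """Return (code point, Unicode name, decomposition field) for one character."""
    return ("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.decomposition(ch))


def unaccent(text):
    """Apply the override table, then strip combining marks after NFD decomposition."""
    expanded = "".join(EXTRA_EXPANSIONS.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(describe("\u00df"))          # sharp s: empty decomposition field
    print(describe("\u00e9"))          # e with acute: decomposes to base + accent
    print(unaccent("Stra\u00dfe"))
```

The override table is applied before decomposition, so characters that Unicode does not decompose are handled by the explicit mapping and everything else falls through to the standard data.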

many thanks...

- mark




