Re: [bug-gettext] broken handling of unicode code point escapes in Tcl
From: Guido Berhoerster
Subject: Re: [bug-gettext] broken handling of unicode code point escapes in Tcl
Date: Tue, 25 Jun 2013 14:23:00 +0200
User-agent: Mutt/1.5.21 (2010-09-15)
* Daiki Ueno <address@hidden> [2013-06-25 05:58]:
> Hi Guido,
>
> Guido Berhoerster <address@hidden> writes:
>
> > xgettext's parsing of Tcl Unicode code point escapes is broken. It tries
> > to replace each escape with the literal Unicode character, but instead of
> > consuming the last character of the escape it copies that character into
> > the output, which results in corrupt .po files, e.g.:
> >
> > $ cat gettext-bug.tcl
> > puts [msgcat::mc "Hello\u200e\u201cWorld\u201d"]
> >
> > $ /usr/bin/xgettext -o- gettext-bug.tcl
> > #: gettext-bug.tcl:5
> > msgid "Helloe“cWorld”d"
> > msgstr ""
>
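The symptom above can be reproduced with a small model of the decoder. This is only a Python sketch of the behaviour described in the report, not the actual xgettext source; the function name decode_u_escapes and the buggy flag are made up for illustration. It assumes Tcl's rule that \u is followed by one to four hex digits and leaves every other escape untouched.

```python
import string


def decode_u_escapes(text, buggy=False):
    """Replace Tcl-style \\uXXXX escapes (1 to 4 hex digits) with characters.

    With buggy=True the scan resumes on the last hex digit instead of after
    it, so that digit is copied into the output as a literal character.
    """
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\u", i):
            j = i + 2
            # Collect at most four hex digits after the "\u".
            while j < len(text) and j - (i + 2) < 4 and text[j] in string.hexdigits:
                j += 1
            if j > i + 2:
                out.append(chr(int(text[i + 2:j], 16)))
                # Hypothetical off-by-one: the last digit is not consumed.
                i = j - 1 if buggy else j
                continue
        # Anything else (including a bare "\u") is copied through unchanged.
        out.append(text[i])
        i += 1
    return "".join(out)


sample = r"Hello\u200e\u201cWorld\u201d"
print(ascii(decode_u_escapes(sample)))              # 'Hello\u200e\u201cWorld\u201d'
print(ascii(decode_u_escapes(sample, buggy=True)))  # 'Hello\u200ee\u201ccWorld\u201dd'
```

The buggy result has a stray "e", "c" and "d" after the decoded characters, which matches the msgid shown above (the U+200E mark itself is invisible).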
> Thanks for the report.
>
> > It should probably not try to substitute these escapes at all, as that
> > results in fragile .po files with embedded control characters; see
> > e.g. the U+200E left-to-right mark in the above example.
>
> I've just pushed the attached patch (the \x fix in the patch is not really
> necessary, sorry; it has been partially reverted in git).
Thanks for the quick fix; the substitution works correctly now.
I still wonder why you're substituting \u escapes with Unicode
characters at all, since that potentially allows unescaped control
sequences, which make the .po file quite fragile.
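
To make the fragility concern concrete, here is a small Python sketch that lists the characters in an extracted msgid which most editors will not display. It is a hypothetical helper, not part of gettext; the name invisible_characters and the choice of the Unicode categories Cc (control) and Cf (format) are assumptions made for illustration.

```python
import unicodedata

# Assumption: "invisible" means Unicode general category Cc or Cf.
INVISIBLE_CATEGORIES = {"Cc", "Cf"}


def invisible_characters(msgid):
    """Return (index, code point, name) for each control/format character."""
    found = []
    for index, ch in enumerate(msgid):
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            # Control characters have no Unicode name, hence the default.
            name = unicodedata.name(ch, "<unnamed>")
            found.append((index, f"U+{ord(ch):04X}", name))
    return found


# The msgid from the example, as it looks after the escapes are substituted.
print(invisible_characters("Hello\u200e\u201cWorld\u201d"))
# [(5, 'U+200E', 'LEFT-TO-RIGHT MARK')]
```

The quotation marks U+201C and U+201D are ordinary visible punctuation and are not reported; only the left-to-right mark is flagged, and that is the character that can silently get lost or go unnoticed when the .po file is edited.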
--
Guido Berhoerster